Closed
Bug 1412394
Opened 7 years ago
Closed 7 years ago
near perma fail dom/media/tests/crashtests/1367930_2.html | crash
Categories
(Core :: WebRTC: Audio/Video, defect, P2)
Core
WebRTC: Audio/Video
Tracking
()
RESOLVED
FIXED
mozilla59
People
(Reporter: intermittent-bug-filer, Assigned: pehrsons)
References
Details
(Keywords: intermittent-failure, Whiteboard: [stockwell disabled] )
Crash Data
Attachments
(3 files)
59 bytes,
text/x-review-board-request
|
jib
:
review+
|
Details |
59 bytes,
text/x-review-board-request
|
jib
:
review+
gchang
:
approval-mozilla-beta-
|
Details |
59 bytes,
text/x-review-board-request
|
jib
:
review+
gchang
:
approval-mozilla-beta-
|
Details |
Filed by: dluca [at] mozilla.com
https://treeherder.mozilla.org/logviewer.html#?job_id=140152822&repo=mozilla-central
[object Object]
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 3•7 years ago
|
||
In the last 2 days, since the bug was filed, there have been 159 failures.
We're seeing that most of the failures occur on these 3 platforms in almost equal numbers:
android-4-4-armv7-api16, android-7-1-armv8-api16, android-6-0-armv8-api16 and also there are few that occur on
android-5-0-armv8-api16. The failures occur on debug and opt builds.
The affected test suite is Autophone Crashtest DOM Media 2.
Here is an example of a recent log: https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=140544019&lineNumber=41
And a relevant snippet of the log:
2017-10-29 08:02:07,202 pixel-05 INFO <2017-10-29T15:02:07+00:00> pixel-05 (WORKING): Starting test crashtest-dom-media-tests
38
2017-10-29 08:02:07,204 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests logging to /mozilla/autophone/autophone/builds/aHR0cHM6Ly9xdWV1ZS50YXNrY2x1c3Rlci5uZXQvdjEvdGFzay9TX2JiRlN5V1I4LTdRdlo4ZTdINGJBL3J1bnMvMC9hcnRpZmFjdHMvcHVibGljL2J1aWxkL3RhcmdldC5hcGs=/tests/crashtest-dom-media-tests-crashtests-dom-media-tests-opt.ini-1-pixel-05.log
39
2017-10-29 08:02:07,204 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests cmdline = python -u reftest/remotereftest.py --suite=crashtest --ignore-window-size reftest/tests/dom/media/tests/crashtests/crashtests.list --deviceSerial=FA6A10319083 --remoteTestRoot=/sdcard/tests/autophone --app=org.mozilla.fennec_aurora --xre-path=/mozilla/autophone/downloads/firefox --utility-path=/mozilla/autophone/downloads/tests/bin --timeout=300 --remote-webserver=10.252.73.230 --http-port=47389 --ssl-port=46975 --total-chunks=1 --this-chunk=1 --pidfile=tmpjSwJSI.pid --remote-logfile=tmpWNEiGt.log --symbols-path=/mozilla/autophone/autophone/builds/aHR0cHM6Ly9xdWV1ZS50YXNrY2x1c3Rlci5uZXQvdjEvdGFzay9TX2JiRlN5V1I4LTdRdlo4ZTdINGJBL3J1bnMvMC9hcnRpZmFjdHMvcHVibGljL2J1aWxkL3RhcmdldC5hcGs=/symbols
40
2017-10-29 08:02:07,205 pixel-05 INFO <2017-10-29T15:02:07+00:00> pixel-05 (WORKING): Running test crashtest-dom-media-tests chunk 1 of 1
41
2017-10-29 08:03:11,940 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests TEST-UNEXPECTED-FAIL | autophone-crashtest-dom-media-tests | Test exited with return code 1
42
2017-10-29 08:03:11,940 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests runtestsremote.py return code 1
43
2017-10-29 08:03:11,940 pixel-05 INFO <2017-10-29T15:03:11+00:00> pixel-05 (WORKING): Completed test crashtest-dom-media-tests chunk 1 of 1
44
2017-10-29 08:03:11,941 pixel-05 INFO <2017-10-29T15:03:11+00:00> pixel-05 (WORKING): runtestsremote.py run_job exit
:jolin, could you please take a look?
Flags: needinfo?(jolin)
Whiteboard: [stockwell needswork]
Comment hidden (Intermittent Failures Robot) |
Comment 5•7 years ago
|
||
No sure if I read the log correctly, but it looks like the failure is caused by:
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=140544019&lineNumber=696
That is, http://searchfox.org/mozilla-central/source/dom/media/MediaManager.cpp#2000
Munro, do you know what could go wrong?
Flags: needinfo?(jolin) → needinfo?(mchiang)
Component: Audio/Video → WebRTC: Audio/Video
Flags: needinfo?(mchiang)
Product: Firefox for Android → Core
call stack
2017-10-29 08:03:12,057 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 0 libxul.so!mozilla::MediaManager::PostTask [MediaManager.cpp:1e4f10cea5e7 : 2000 + 0x4]
2017-10-29 08:03:12,058 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 1 libxul.so!mozilla::SourceListener::StopTrack [MediaManager.cpp:1e4f10cea5e7 : 3708 + 0x3]
2017-10-29 08:03:12,059 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 2 libxul.so!mozilla::SourceListener::Stop [MediaManager.cpp:1e4f10cea5e7 : 3624 + 0x7]
2017-10-29 08:03:12,060 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 3 libxul.so!mozilla::SourceListener::NotifyFinished [MediaManager.cpp:1e4f10cea5e7 : 3858 + 0x5]
2017-10-29 08:03:12,061 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 4 libxul.so!mozilla::SourceListener::NotifyRemoved [MediaManager.cpp:1e4f10cea5e7 : 3870 + 0x5]
2017-10-29 08:03:12,062 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 5 libxul.so!mozilla::GetUserMediaWindowListener::~GetUserMediaWindowListener [MediaManager.cpp:1e4f10cea5e7 : 612 + 0x3]
2017-10-29 08:03:12,062 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 6 libxul.so!mozilla::GetUserMediaWindowListener::Release [MediaManager.cpp:1e4f10cea5e7 : 315 + 0x3]
2017-10-29 08:03:12,063 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 7 libxul.so!RefPtr<mozilla::GetUserMediaWindowListener>::~RefPtr [RefPtr.h:1e4f10cea5e7 : 41 + 0x3]
2017-10-29 08:03:12,064 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 8 libxul.so!PLDHashTable::~PLDHashTable [PLDHashTable.cpp:1e4f10cea5e7 : 325 + 0x3]
2017-10-29 08:03:12,065 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 9 libxul.so!PLDHashTable::ClearAndPrepareForLength [PLDHashTable.cpp:1e4f10cea5e7 : 340 + 0x3]
2017-10-29 08:03:12,066 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 10 libxul.so!mozilla::MediaManager::Shutdown [nsTHashtable.h:1e4f10cea5e7 : 277 + 0x3]
2017-10-29 08:03:12,066 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 11 libxul.so!<name omitted> [MediaManager.cpp:1e4f10cea5e7 : 1932 + 0x3]
2017-10-29 08:03:12,067 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 12 libxul.so!NS_InvokeByIndex [xptcinvoke_arm.cpp:1e4f10cea5e7 : 167 + 0x3]
2017-10-29 08:03:12,068 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 13 libxul.so!XPCWrappedNative::CallMethod [XPCWrappedNative.cpp:1e4f10cea5e7 : 1996 + 0x3]
2017-10-29 08:03:12,069 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 14 libxul.so!XPC_WN_CallMethod [XPCWrappedNativeJSOps.cpp:1e4f10cea5e7 : 929 + 0x5]
2017-10-29 08:03:12,070 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 15 libxul.so!js::InternalCallOrConstruct [jscntxtinlines.h:1e4f10cea5e7 : 291 + 0x3]
2017-10-29 08:03:12,071 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 16 libxul.so!<name omitted> [Interpreter.cpp:1e4f10cea5e7 : 521 + 0x9]
2017-10-29 08:03:12,072 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 17 libxul.so!Interpret [Interpreter.cpp:1e4f10cea5e7 : 527 + 0x7]
2017-10-29 08:03:12,072 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 18 libxul.so!js::RunScript [Interpreter.cpp:1e4f10cea5e7 : 422 + 0x7]
2017-10-29 08:03:12,073 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 19 libxul.so!js::InternalCallOrConstruct [Interpreter.cpp:1e4f10cea5e7 : 494 + 0x5]
2017-10-29 08:03:12,074 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 20 libxul.so!<name omitted> [Interpreter.cpp:1e4f10cea5e7 : 521 + 0x9]
2017-10-29 08:03:12,075 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 21 libxul.so!<name omitted> [Interpreter.cpp:1e4f10cea5e7 : 540 + 0x5]
2017-10-29 08:03:12,076 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 22 libxul.so!js::PromiseObject::create [Promise.cpp:1e4f10cea5e7 : 1666 + 0x7]
2017-10-29 08:03:12,077 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 23 libxul.so!PromiseConstructor [Promise.cpp:1e4f10cea5e7 : 1594 + 0x7]
2017-10-29 08:03:12,077 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 24 libxul.so!InternalConstruct [jscntxtinlines.h:1e4f10cea5e7 : 291 + 0x3]
2017-10-29 08:03:12,078 pixel-05 INFO UnitTestJob mozilla-inbound 20171029135920 opt api-25 android-api-16 crashtest-dom-media-tests 25 libxul.so!Interpret [Interpreter.cpp:1e4f10cea5e7 : 3058 + 0x7]
Not sure if this is related to bug 1403186
Assignee | ||
Comment 7•7 years ago
|
||
I think the MediaManager shutdown sequence looks off. We do this:
> sInShutdown = true;
> ...
> GetActiveWindows()->Clear(); // Tries to PostTask so goes BOOM because sInShutdown == true
> ...
> mMediaThread->message_loop()->PostTask(shutdown.forget());
IMO we should do:
> GetActiveWindows()->Clear();
> ...
> sInShutdown = true;
> mMediaThread->message_loop()->PostTask(shutdown.forget());
And long term we should get rid of `sInShutdown` altogether (MediaManager per document will handle that).
Jib, what do you think about the sequence?
Flags: needinfo?(jib)
Comment hidden (Intermittent Failures Robot) |
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a8a4e09cf09a
Disable dom/media/tests/crashtests/1367930_2.html on Android for frequent failures. r=me, a=testonly
Comment 10•7 years ago
|
||
I disabled this test for having >200 failures in the last 30 days, please remember to enable this when testing a fix
Keywords: leave-open
Summary: Intermittent autophone-crashtest-dom-media-tests | Test exited with return code 1 → near perma fail dom/media/tests/crashtests/1367930_2.html | crash
Whiteboard: [stockwell disable-recommended] → [stockwell disabled]
Assignee | ||
Comment 11•7 years ago
|
||
This is something we need to fix. I'm taking this so it doesn't get forgotten.
Assignee: nobody → apehrson
Rank: 14
Priority: P5 → P2
Comment 12•7 years ago
|
||
bugherder |
Comment hidden (Intermittent Failures Robot) |
Comment 14•7 years ago
|
||
(In reply to Andreas Pehrson [:pehrsons] from comment #7)
> Jib, what do you think about the sequence?
How would this be intermittent?
(recap of offline chat) Yeah switching the sequence helps, but can't other PostTasks still come in?
The MOZ_ASSERT was added by khuey last year, probably because he saw we leak the runnable (maybe we should have freed it instead? Most MediaManager runnables coatcheck things that must be freed on main, so freeing the runnable on the post thread should be ok).
Before the MOZ_ASSERT was added, setting sInShutdown as soon as we knew we were shutting down was mostly harmless (modulo the leak).
The original goal here was to prevent more tasks from being added to the media thread in shutdown (presumably from new JS calls on MediaManager). I think this just seemed like a reasonable thing to do, rather than a response to an observed problem FWIW.
I vote for dropping and freeing any new runnables. Seems like the smallest change. Randell, WDYT?
Flags: needinfo?(jib) → needinfo?(rjesup)
Assignee | ||
Comment 15•7 years ago
|
||
(In reply to Jan-Ivar Bruaroey [:jib] (needinfo? me) from comment #14)
> (In reply to Andreas Pehrson [:pehrsons] from comment #7)
> > Jib, what do you think about the sequence?
>
> How would this be intermittent?
If the window listeners were removed before we reach MediaManager shutdown (by OnNavigation), we're fine. Otherwise we're not. I can only assume there's a race between navigating on shutdown and notifying shutdown blockers.
> (recap of offline chat) Yeah switching the sequence helps, but can't other
> PostTasks still come in?
Are there actual sources still alive in shutdown that can PostTask()? IMO then we should make sure they're shut down before we kill the media thread.
Flags: needinfo?(jib)
Comment 16•7 years ago
|
||
I don't know about sources. This was written with the fear of general JS callbacks still happening and adding work for the media thread. I don't know for sure what the guarantees are, but we're in ProfileBeforeChange here, which I believe is pretty early.
Right now, the invariant offered here seems to be that ShutdownTask is guaranteed to be the last task on the media thread [1].
[1] http://searchfox.org/mozilla-central/rev/423b2522c48e1d654e30ffc337164d677f934ec3/dom/media/MediaManager.cpp#3126-3127
Flags: needinfo?(jib)
Assignee | ||
Comment 17•7 years ago
|
||
Looking in more detail I don't see any other way that shutting down hardware devices is meant to work. Looks to me like we either need to figure out why OnNavigation is happening *after* shutdown; or do something with the shutdown sequence and/or PostTask during shutdown strategy.
Flags: needinfo?(jib)
Assignee | ||
Comment 18•7 years ago
|
||
Here's a try run with the change I originally proposed: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5b83d52a8a5693960524aa0d87075c60d130f0fd
Comment 19•7 years ago
|
||
The attempt to disable the test failed and it has still been running. Due to unrelated changes to the gradle builds I've had to shut down autophone completely so no try jobs will be running.
Whiteboard: [stockwell disabled] → [stockwell disable-recommended]
Comment hidden (Intermittent Failures Robot) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 24•7 years ago
|
||
The try run in comment 18 didn't trigger autophone tests. Bob, can you help in getting it tested?
I've pushed it to review for now so we don't stall the fix on unrelated issues.
Flags: needinfo?(bob)
Comment 25•7 years ago
|
||
Autophone is down while I attempt to fix the bustage from the transition to gradle builds. I'm working on it now. Once we are back up and running, I'll manually submit the test for your try run. Sorry for the inconvenience.
Comment 26•7 years ago
|
||
I don't think dom/media/tests/crashtests/1367930_2.html is crashing. It looks more like dom/media/tests/crashtests/1367930_1.html.
Running a test on staging:
<https://treeherder.allizom.org/#/jobs?repo=autoland&exclusion_profile=false&group_state=expanded&fromchange=1b868efa368f0e8f2caa4df8568fca04f756a00e&filter-searchStr=autophone&selectedJob=135030760>
<https://treeherder.allizom.org/logviewer.html#?job_id=135030760&repo=autoland&lineNumber=16246> shows that 1367930_1.html seems to complete ok but it looks like when it loads a blank page afterwards, it crashes while the script is getting ready to skip 1367930_2.html. You can see from the setting parameters message which is in
1367930_1.html: console.log('setting parameters');
following down the load we see:
<data:audio/ogg;base64,T2dnUwACAAAAAAAAAAD+ML0RAAAAACtaCMoBHgF2b3JiaXMAAAAAAkSsAAAAAAAAAHECAAAAAAC4AU9nZ1MAAAAAAAAAAAAA/jC9EQEAAACZyDL8E/8a/////////////////////5EDdm9yYmlzLQAAAFhpcGguT3JnIGxpYlZvcmJpcyBJIDIwMTAxMTAxIChTY2hhdWZlbnVnZ2V0KRAAAAAbAAAARU5DT0RFUj1Ib29rRmxhc2hPZ2dFbmNvZGVyBwAAAFRJVExFPWYIAAAAVkVSU0lPTj0HAAAAQUxCVU09YQwAAABUUkFDS05VTUJFUj0IAAAAQVJUSVNUPWEKAAAAUEVSRk9STUVSPQoAAABDT1BZUklHSFQ9CAAAAExJQ0VOU0U9DQAAAE9SR0FOSVpBVElPTj0MAAAAREVTQ1JJUFRJT049BgAAAEdFTlJFPQUAAABEQVRFPQoAAABMT0NBVElPTj1mCAAAAENPTlRBQ1Q9BQAAAElTUkM9AQV2b3JiaXMpQkNWAQAIAAAAMUwgxYDQkFUAABAAAGAkKQ6TZkkppZShKHmYlEhJKaWUxTCJmJSJxRhjjDHGGGOMMcYYY4wgNGQVAAAEAIAoCY6j5klqzjlnGCeOcqA5aU44pyAHilHgOQnC9SZjbqa0pmtuziklCA1ZBQAAAgBASCGFFFJIIYUUYoghhhhiiCGHHHLIIaeccgoqqKCCCjLIIINMMumkk0466aijjjrqKLTQQgsttNJKTDHVVmOuvQZdfHPOOeecc84555xzzglCQ1YBACAAAARCBhlkEEIIIYUUUogppphyCjLIgNCQVQAAIACAAAAAAEeRFEmxFMuxHM3RJE/yLFETNdEzRVNUTVVVVVV1XVd2Zdd2ddd2fVmYhVu4fVm4hVvYhV33hWEYhmEYhmEYhmH4fd/3fd/3fSA0ZBUAIAEAoCM5luMpoiIa>
Hopefully my tests will be fine and I will be able to re-enable Autophone soon and we can test your try build.
which is "No video with supported format a MIME"
followed by <https://treeherder.allizom.org/logviewer.html#?job_id=135030760&repo=autoland&lineNumber=16617>
crashtests/1367930
WIN DEATH: Window{5cc6a28 u0 org.mozilla.fennec_aurora/org.mozilla.gecko.BrowserApp}
Comment 27•7 years ago
|
||
I submitted the try job to autophone and reported the results to staging.
It is not complete yet but so far have a different Cdm2 crash: crashed [@ Checker::StartReadOp]
https://treeherder.allizom.org/#/jobs?repo=try&revision=5b83d52a8a5693960524aa0d87075c60d130f0fd&filter-searchStr=autophone&group_state=expanded&selectedJob=135130685
https://treeherder.allizom.org/logviewer.html#?job_id=135130685&repo=try&lineNumber=732
It looks like this time it actually did run 1367930_2.html
Crash Signature: [@ mozilla::MediaManager::PostTask]
Flags: needinfo?(bob)
Comment 28•7 years ago
|
||
Note Cdm2 was disabled in Bug 1414059. It is still available on try.
Comment hidden (Intermittent Failures Robot) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 31•7 years ago
|
||
Here's another try that didn't run autophone: https://treeherder.mozilla.org/#/jobs?repo=try&revision=eab36d799281
Bob, Is there still bustage or are the instructions wrong? https://wiki.mozilla.org/EngineeringProductivity/Autophone#Manual_try_commit_message
Can you again help me push the latest patch?
Flags: needinfo?(bob)
Comment 32•7 years ago
|
||
Your syntax is correct. I'll file a bug and investigate. Submitted the test to your job.
Flags: needinfo?(bob)
Comment 33•7 years ago
|
||
(In reply to Andreas Pehrson [:pehrsons] from comment #15)
> Are there actual sources still alive in shutdown that can PostTask()? IMO
> then we should make sure they're shut down before we kill the media thread.
I don't know, and I agree. When shutdown-order bugs cause objects to suddenly outlive others they weren't supposed to outlive, they manifest as reference bugs. Finding the cause of those order bug requires a lot of context. A simpler reactive fix is to extend the lifestime of said objects, burying the order bug, with no guarantee this was the last side-effect of it, and we've lost the determinism we needed to understand the context. Deterministic shutdown is hard.
Flags: needinfo?(jib)
Comment 34•7 years ago
|
||
"Disabled" as per comment 28. No failures since Nov 2.
Whiteboard: [stockwell disable-recommended] → [stockwell disabled]
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 38•7 years ago
|
||
Assignee | ||
Updated•7 years ago
|
Status: NEW → ASSIGNED
Comment 39•7 years ago
|
||
mozreview-review |
Comment on attachment 8924878 [details]
Bug 1412394 - Re-enable the crashing test.
https://reviewboard.mozilla.org/r/196144/#review203978
Attachment #8924878 -
Flags: review?(jib) → review+
Comment 40•7 years ago
|
||
mozreview-review |
Comment on attachment 8924879 [details]
Bug 1412394 - Delay blocking MediaManager::PostTask until media thread shutdown.
https://reviewboard.mozilla.org/r/196146/#review203982
Lgtm.
::: commit-message-ef883:4
(Diff revision 2)
> +Bug 1412394 - Delay blocking MediaManager::PostTask until media thread shutdown. r?jib
> +
> +This delays setting sInShutdown (renamed to sHasShutdown for better semantics)
> +until when all that remains of shutdown is to shutdown the media thread.
s/when//
::: commit-message-ef883:8
(Diff revision 2)
> +The difference is that we let members and still-active listeners get destroyed
> +and shutdown any members (in this case devices) that are media-thread only,
Needs a comma or rephrase I think. Shutdown is a noun.
Attachment #8924879 -
Flags: review?(jib) → review+
Comment 41•7 years ago
|
||
mozreview-review |
Comment on attachment 8925523 [details]
Bug 1412394 - Remove window listeners explicitly during shutdown.
https://reviewboard.mozilla.org/r/196650/#review203984
::: dom/media/MediaManager.cpp:607
(Diff revision 2)
> - for (auto& l : mInactiveListeners) {
> - l->NotifyRemoved();
> - }
> - mInactiveListeners.Clear();
> + MOZ_ASSERT(mInactiveListeners.Length() == 0,
> + "Inactive listeners should already be removed");
> + MOZ_ASSERT(mActiveListeners.Length() == 0,
> + "Active listeners should already be removed");
> - for (auto& l : mActiveListeners) {
> - l->NotifyRemoved();
> - }
> - mActiveListeners.Clear();
This defers calling NotifyRemoved() until shutdown.
But what about whenever we remove an entry from mActiveWindows [1] like when we close a tab or navigate away? Did we not need to call NotifyRemoved() then? Won't we be accumulating active listeners over time now?
[1] https://searchfox.org/mozilla-central/rev/7fb4cc447c06f14fe3b5c6b0c9d103a860937250/dom/media/MediaManager.cpp#2988
::: dom/media/MediaManager.cpp:3065
(Diff revision 2)
> + {
> - // Close off any remaining active windows.
> + // Close off any remaining active windows.
> +
> + // Live capture at this point is rare but can happen. Stopping it will make
> + // the window listeners attempt to remove themselves from the active windows
> + // table. We cannot be touching the table at point so we grab a copy of the
"We cannot touch the table at that point"
Attachment #8925523 -
Flags: review?(jib) → review-
Assignee | ||
Comment 42•7 years ago
|
||
mozreview-review-reply |
Comment on attachment 8925523 [details]
Bug 1412394 - Remove window listeners explicitly during shutdown.
https://reviewboard.mozilla.org/r/196650/#review203984
> This defers calling NotifyRemoved() until shutdown.
>
> But what about whenever we remove an entry from mActiveWindows [1] like when we close a tab or navigate away? Did we not need to call NotifyRemoved() then? Won't we be accumulating active listeners over time now?
>
> [1] https://searchfox.org/mozilla-central/rev/7fb4cc447c06f14fe3b5c6b0c9d103a860937250/dom/media/MediaManager.cpp#2988
I think NotifyRemoved() was never intended to be called when the window listener removed the source listener, but rather when the source listener was removed from the MSG (and that's only relevant if it happens before the source listener is cleaned up by other means), per [1].
The regular cleanup path is SourceListener::Remove(), see [2].
With this in mind, this use of NotifyRemoved() was merely a fallback and making it a programming error seems fine and even a semantical improvement IMO.
[1] https://searchfox.org/mozilla-central/rev/7fb4cc447c06f14fe3b5c6b0c9d103a860937250/dom/media/MediaManager.cpp#234
[2] https://searchfox.org/mozilla-central/rev/7fb4cc447c06f14fe3b5c6b0c9d103a860937250/dom/media/MediaManager.cpp#188
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 45•7 years ago
|
||
mozreview-review |
Comment on attachment 8925523 [details]
Bug 1412394 - Remove window listeners explicitly during shutdown.
https://reviewboard.mozilla.org/r/196650/#review204144
Attachment #8925523 -
Flags: review?(jib) → review+
Comment 46•7 years ago
|
||
mozreview-review-reply |
Comment on attachment 8925523 [details]
Bug 1412394 - Remove window listeners explicitly during shutdown.
https://reviewboard.mozilla.org/r/196650/#review203984
> I think NotifyRemoved() was never intended to be called when the window listener removed the source listener, but rather when the source listener was removed from the MSG (and that's only relevant if it happens before the source listener is cleaned up by other means), per [1].
>
> The regular cleanup path is SourceListener::Remove(), see [2].
>
> With this in mind, this use of NotifyRemoved() was merely a fallback and making it a programming error seems fine and even a semantical improvement IMO.
>
> [1] https://searchfox.org/mozilla-central/rev/7fb4cc447c06f14fe3b5c6b0c9d103a860937250/dom/media/MediaManager.cpp#234
> [2] https://searchfox.org/mozilla-central/rev/7fb4cc447c06f14fe3b5c6b0c9d103a860937250/dom/media/MediaManager.cpp#188
Ok, we have asserts in place, so that's good.
Comment 47•7 years ago
|
||
[Tracking Requested - why for this release]: Shutdown crash. Tracking flags from bug 1414422.
status-firefox57:
--- → wontfix
status-firefox58:
--- → affected
status-firefox59:
--- → affected
status-firefox-esr52:
--- → ?
tracking-firefox58:
--- → ?
tracking-firefox59:
--- → ?
Comment hidden (Intermittent Failures Robot) |
Comment 50•7 years ago
|
||
We're sorry, Autoland could not rebase your commits for you automatically. Please manually rebase your commits and try again.
hg error in cmd: hg rebase -s dfe4b02bafb6 -d f345598ccaf9: rebasing 434829:dfe4b02bafb6 "Bug 1412394 - Re-enable the crashing test. r=jib"
rebasing 434830:520975d32334 "Bug 1412394 - Delay blocking MediaManager::PostTask until media thread shutdown. r=jib"
merging dom/media/MediaManager.cpp
warning: conflicts while merging dom/media/MediaManager.cpp! (edit, then use 'hg resolve --mark')
unresolved conflicts (see hg resolve, then hg rebase --continue)
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 55•7 years ago
|
||
Pushed by pehrsons@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/482f8d5746ba
Re-enable the crashing test. r=jib
https://hg.mozilla.org/integration/autoland/rev/f8acbd315ca8
Delay blocking MediaManager::PostTask until media thread shutdown. r=jib
https://hg.mozilla.org/integration/autoland/rev/01225b5242f5
Remove window listeners explicitly during shutdown. r=jib
Comment 56•7 years ago
|
||
bugherder |
Assignee | ||
Comment 57•7 years ago
|
||
Ah. I never meant to keep leave-open with the fix.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Keywords: leave-open
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
Comment 58•7 years ago
|
||
Did this need an uplift request?
Comment 59•7 years ago
|
||
There are no crashes in 58. Track 58- for now and I think there is no need to uplift to 58.
Assignee | ||
Comment 60•7 years ago
|
||
Comment on attachment 8924879 [details]
Bug 1412394 - Delay blocking MediaManager::PostTask until media thread shutdown.
Approval Request Comment
[Feature/Bug causing the regression]: This has been around a long time, perhaps since the beginning of MediaManager.
[User impact if declined]: Possible (controlled, MOZ_CRASH()) crash when closing a window.
[Is this code covered by automated tests?]: Yes, autophone has successfully detected this, but only when it was recently enabled for these tests.
[Has the fix been verified in Nightly?]: I just verified
[Needs manual test from QE? If yes, steps to reproduce]: Would be helpful, see bug 1414422 for a testcase.
Put this on a localhost server (like `python -m SimpleHTTPServer`).
In the build you're testing, set "dom.allow_scripts_to_close_windows" to `true`.
Open the testcase in a single tab in the build you're testing.
When notified, allow popups for localhost.
[List of other uplifts needed for the feature/fix]:None
[Is the change risky?]:No
[Why is the change risky/not risky?]:There are two patches to this uplift:
This patch delays the point where we'd trigger a crash a bit. This is safe as this point is still before tearing down a thread, which is what the crash is protecting.
The next patch changes how we iterate over a data structure, it doesn't touch any ordering or other logic.
[String changes made/needed]:None
Attachment #8924879 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 61•7 years ago
|
||
52 is unaffected. The shutdown runnable is running fine there. Seemingly because there's an OnNavigation() call there that cleans up.
(In reply to Andreas Pehrson [:pehrsons] from comment #60)
> [Needs manual test from QE? If yes, steps to reproduce]: Would be helpful,
> see bug 1414422 for a testcase.
> Put this on a localhost server (like `python -m SimpleHTTPServer`).
> In the build you're testing, set "dom.allow_scripts_to_close_windows" to
> `true`.
> Open the testcase in a single tab in the build you're testing.
> When notified, allow popups for localhost.
Scratch this. It seems like 58 is no longer affected by the testcase in bug 1414422, probably thanks to bug 1405599.
It could still be affected by the autophone test however.
Flags: needinfo?(apehrson)
Assignee | ||
Comment 62•7 years ago
|
||
Comment on attachment 8925523 [details]
Bug 1412394 - Remove window listeners explicitly during shutdown.
Approval Request Comment
[Feature/Bug causing the regression]: Unclear, we thin this is a latent bug that exposed post-52.
[User impact if declined]: Possible (controlled, MOZ_CRASH()) crash when closing a window.
[Is this code covered by automated tests?]: Yes, autophone has successfully detected this, but only when it was recently enabled for these tests.
[Has the fix been verified in Nightly?]: Yes
[Needs manual test from QE? If yes, steps to reproduce]: No
[List of other uplifts needed for the feature/fix]: None
[Is the change risky?]: No
[Why is the change risky/not risky?]: See the answer in comment 60
[String changes made/needed]: None
Attachment #8925523 -
Flags: approval-mozilla-beta?
Comment hidden (Intermittent Failures Robot) |
Comment 64•7 years ago
|
||
Comment on attachment 8924879 [details]
Bug 1412394 - Delay blocking MediaManager::PostTask until media thread shutdown.
I didn't see any crashes in 58. I think we don't need to uplift this to Beta58 at this point. Beta58- and won't fix for 58. Feel free to nominate again if you disagree.
Attachment #8924879 -
Flags: approval-mozilla-beta? → approval-mozilla-beta-
Updated•7 years ago
|
Attachment #8925523 -
Flags: approval-mozilla-beta? → approval-mozilla-beta-
Updated•7 years ago
|
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Updated•5 years ago
|
Flags: needinfo?(rjesup)
You need to log in
before you can comment on or make changes to this bug.
Description
•