Closed
Bug 1214675
Opened 10 years ago
Closed 10 years ago
Crash in mozilla::camera::PCameras::Transition(mozilla::camera::PCameras::State, mozilla::ipc::Trigger, mozilla::camera::PCameras::State*)
Categories
(Core :: WebRTC: Audio/Video, defect, P2)
Core
WebRTC: Audio/Video
Tracking
()
RESOLVED
FIXED
mozilla44
| backlog | webrtc/webaudio+ |
People
(Reporter: adalucinet, Assigned: gcp)
Details
(Keywords: regression, reproducible)
Crash Data
Attachments
(1 file, 1 obsolete file)
|
8.12 KB,
patch
|
jesup
:
review+
|
Details | Diff | Splinter Review |
Reproducible with latest 43.0a2 Developer Edition and 44.0a1 Nightly, both e10s enabled/disabled
Affected platform: Windows 7 64-bit, Windows 8 64-bit, Ubuntu 14.04 32-bit
Steps to reproduce:
1. Start Firefox and Click on Hello icon (chat bubble with smiley face).
2. Click on "Start a conversation" and share any window.
3. Restart/Close Firefox.
Expected result: No crash.
Actual result: bp-4b0b12bb-0cf9-40ee-8764-dd34a2151014
Additional info:
1. This signature showed up only at the first crash. The next signatures look like bug 1116884, although is reproducible with e10s disabled too:
bp-158d7b26-caa0-46ca-acc3-db7742151014
bp-0a3b3c90-bff5-4d5c-821c-a7c662151014
bp-32df93c4-53c7-48e7-b66a-b6d922151014
2. Not reproducible with 42 beta 6 nor with Mac OS X 10.10.4.
Updated•10 years ago
|
Component: Client → WebRTC: Audio/Video
Product: Hello (Loop) → Core
| Reporter | ||
Updated•10 years ago
|
Crash Signature: [@ mozalloc_abort(char const* const) | NS_DebugBreak | mozilla::camera::PCameras::Transition(mozilla::camera::PCameras::State, mozilla::ipc::Trigger, mozilla::camera::PCameras::State*) ]
Version: unspecified → Trunk
| Reporter | ||
Comment 1•10 years ago
|
||
Regression range (m-c):
Last good: 2015-08-31
First bad: 2015-09-01 → bp-43b61d54-32a1-4b05-ae09-a1dae2151019
Pushlog:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=f2518b8a7b97b5bb477e94bc9131584007aac887&tochange=cafb1c90f794a73100a8f0afb9fe3301df0f2bde
Keywords: regression,
reproducible
Comment 2•10 years ago
|
||
(In reply to Alexandra Lucinet, QA Mentor [:adalucinet] from comment #1)
> Regression range (m-c):
> Last good: 2015-08-31
> First bad: 2015-09-01 → bp-43b61d54-32a1-4b05-ae09-a1dae2151019
Unfortunately that just tells us this started to fail when the feature landed.
Sounds like some sort of IPC shutdown hang triggering the shutdown watchdog (for the later ones). The first appears to be here, in Transition():
case __Dead:
NS_RUNTIMEABORT("__delete__()d actor");
Flags: needinfo?(gpascutto)
Updated•10 years ago
|
backlog: --- → webrtc/webaudio+
Rank: 21
Priority: -- → P2
| Assignee | ||
Comment 3•10 years ago
|
||
I suspect this is a duplicate of bug 1209987. The fix for that landed the same day this was reported, so it was probably not in. Can QA re-verify on latest Nightly?
Flags: needinfo?(gpascutto) → needinfo?(alexandra.lucinet)
| Reporter | ||
Comment 4•10 years ago
|
||
(In reply to Gian-Carlo Pascutto [:gcp] from comment #3)
> I suspect this is a duplicate of bug 1209987. The fix for that landed the
> same day this was reported, so it was probably not in. Can QA re-verify on
> latest Nightly?
Still reproducible with latest 44.0a1 (from 2015-10-20) → bp-4bf78de6-2129-4e73-adbf-7b84f2151021
Flags: needinfo?(alexandra.lucinet)
| Assignee | ||
Comment 5•10 years ago
|
||
This doesn't reproduce for me following the given STR with
Nightly on 64-bit Linux
Nightly on Windows 7 64-bit
Nightly on Windows 8.1 64-bit
I'm puzzled because comment 4 implies it's very reproducible.
I investigated the code and I see something suspect, but without a reproducible testcase it's going to be guesswork.
| Assignee | ||
Comment 6•10 years ago
|
||
CloseEngines calls RecvStopCapture but the latter has is missing a check
for an alive IPC connection. I'll provide a try build as I can't reproduce
the original failure, but I'm reasonably confident the patch will solve it.
Attachment #8679495 -
Flags: review?(rjesup)
| Assignee | ||
Comment 7•10 years ago
|
||
Updated•10 years ago
|
Attachment #8679495 -
Flags: review?(rjesup) → review+
| Assignee | ||
Comment 8•10 years ago
|
||
| Assignee | ||
Comment 9•10 years ago
|
||
Attachment #8679986 -
Flags: review?(rjesup)
| Assignee | ||
Updated•10 years ago
|
Attachment #8679495 -
Attachment is obsolete: true
Comment 10•10 years ago
|
||
Comment on attachment 8679986 [details] [diff] [review]
Factor out cleanup functions to avoid deadlock/dispatch-without-IPC
Review of attachment 8679986 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good!
Attachment #8679986 -
Flags: review?(rjesup) → review+
| Assignee | ||
Comment 11•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/692dbb16a94976c90cd7700604bb02be54f3d20b
Bug 1214675 - Factor out cleanup functions to avoid deadlock/dispatch-without-IPC. r=jesup
| Assignee | ||
Comment 12•10 years ago
|
||
Jesup, can you verify that this patch fixes the deadlock warnings you were seeing?
Flags: needinfo?(rjesup)
Comment 13•10 years ago
|
||
| bugherder | ||
Status: NEW → RESOLVED
Closed: 10 years ago
status-firefox44:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla44
| Assignee | ||
Comment 14•10 years ago
|
||
According to the reporter, this affects Firefox 43 (current beta). This is caused by bug 1104619. Unfortunately the fix in this bug also depends on the fixes for bug 1209987, which is very complicated, and would need to go onto beta as well (and probably everything inbetween, such as bug 1205164). I could do a simplified fix for beta, but that would be entirely untested code then.
The fact that the STR in the original report are such a basic Hello use case would make me think there's pressing reasons to uplift those patches anyway, but I was never able to reproduce them (see comment 5).
So without having any idea how broken Firefox 43 is in this regard, I can't even really make a judgement call whether to ask approval for mozilla-beta or not :-/
status-firefox43:
--- → affected
Flags: needinfo?(mreavy)
Comment 15•10 years ago
|
||
Gian-Carlo -- Can you do a risk analysis (based on what you know) of taking the patch in Fx43 vs not? Basically, write an uplift request here in the bug without asking for uplift. I'll then circle with Jesup and decide if we really need this in Fx43 or not. One or both of us will ping you if we have any questions when we read your analysis. Thanks!
Assignee: nobody → gpascutto
Flags: needinfo?(mreavy) → needinfo?(gpascutto)
| Assignee | ||
Comment 16•10 years ago
|
||
[Feature/regressing bug #]: Bug 1104619 - Sandboxing/Proxy for video capture
[User impact if declined]: Potential hang on shutdown, likely requires a specific timing. I did not reproduce it on any of my machines, although from this bug obviously our QA does.
[Describe test coverage new/current, TreeHerder]: Our tests exercise the code but due to the likelihood of this being timing specific won't really tell us anything. Bug 1209987, the main dependency of this patch, landed on Nightly a month ago, and the patch here about 3 weeks ago.
[Risks and why]: The patch here requires a bunch of other patches that substantially modify the relevant code and are quite complex, and could cause other hangs/crashes during webrtc use.
[String/UUID change made/needed]: NA
Flags: needinfo?(gpascutto)
Comment 17•10 years ago
|
||
Thanks, Gian-Carlo, for your analysis. I talked with Jesup, and we agree that (given risks vs rewards) we should just let this fix ride the trains.
Also, Jesup verifies that he's no longer seeing the deadlock warnings with this fix.
Flags: needinfo?(rjesup)
| Reporter | ||
Comment 18•10 years ago
|
||
Confirming that the crash is no longer reproducible with latest 45.0a1 (from 2015-11-17), under Windows 7 64-bit and Ubuntu 12.04 32-bit. Although, latest 44.0a1 with e10s enable/disabled is still crashing under Windows 7, but not on Ubuntu, with the same STR as in comment 0.
Crash report:
→ bp-b6285491-c5c0-479f-bdf4-d45ac2151118
→ bp-7c281621-617c-45d4-9849-d37192151118
→ bp-c8d0e367-71b9-416a-8fcb-82a332151118
Gian-Carlo, any ideas? It might happen because of the dependency bug(s) mentioned in comment 16?
Flags: needinfo?(gpascutto)
| Reporter | ||
Comment 19•10 years ago
|
||
Sorry, it's 44.0a2 Aurora still crashing, not 44.0a1.
| Assignee | ||
Comment 20•10 years ago
|
||
Unless I missed something, there are no patches I mentioned that are in Firefox 44 but not in 45. So I have no idea how 45 could be working, but not 44.
I guess the crashes are just a bit random and timing dependent, as already hypothesized in the past comments?
>https://crash-stats.mozilla.com/report/index/7c281621-617c-45d4-9849-d37192151118
This is an old build that predates this patch.
Flags: needinfo?(gpascutto)
| Reporter | ||
Comment 21•10 years ago
|
||
(In reply to Gian-Carlo Pascutto [:gcp] from comment #20)
> I guess the crashes are just a bit random and timing dependent, as already
> hypothesized in the past comments?
No, it's not random nor timing dependent - every time I follow STR via comment 0, latest 44.0a2 crashes but with different signatures:
- bp-48e24e47-3bf1-49c8-8888-a18a82151119 (same as bug 1225266)
- bp-34c6b010-1566-468c-98eb-db6272151119 (same as bug 1225266)
- bp-a06cd444-5be9-443b-8835-975992151119
> >https://crash-stats.mozilla.com/report/index/7c281621-617c-45d4-9849-d37192151118
>
> This is an old build that predates this patch.
Sorry, pasted the wrong link.
| Assignee | ||
Comment 22•10 years ago
|
||
(In reply to Alexandra Lucinet, QA Mentor [:adalucinet] from comment #21)
> (In reply to Gian-Carlo Pascutto [:gcp] from comment #20)
>
> > I guess the crashes are just a bit random and timing dependent, as already
> > hypothesized in the past comments?
> No, it's not random nor timing dependent - every time I follow STR via
> comment 0, latest 44.0a2 crashes but with different signatures:
But it's happening in 44 and not in 45?!
I don't understand at all and I can't reproduce this on any machine nor with any OS I have - and I tried a lot.
Is it possible to spell out the STR in as much exact detail as possible? For example '"Start a conversation" and share any window."' isn't really possible with Hello, there are some extra clicks and actions involved there. Which window or tab do you share. Are there any other tabs open? Empty profile?
We must be doing something subtly different if this reproduces 100% for you and never for me.
Flags: needinfo?(alexandra.lucinet)
| Reporter | ||
Comment 23•10 years ago
|
||
(In reply to Gian-Carlo Pascutto [:gcp] from comment #22)
> (In reply to Alexandra Lucinet, QA Mentor [:adalucinet] from comment #21)
> > (In reply to Gian-Carlo Pascutto [:gcp] from comment #20)
> >
> > > I guess the crashes are just a bit random and timing dependent, as already
> > > hypothesized in the past comments?
> > No, it's not random nor timing dependent - every time I follow STR via
> > comment 0, latest 44.0a2 crashes but with different signatures:
>
> But it's happening in 44 and not in 45?!
Yes, 44.0a2 was mentioned above (please see comment 18 and comment 19).
> I don't understand at all and I can't reproduce this on any machine nor with
> any OS I have - and I tried a lot.
>
> Is it possible to spell out the STR in as much exact detail as possible? For
> example '"Start a conversation" and share any window."' isn't really
> possible with Hello, there are some extra clicks and actions involved there.
> Which window or tab do you share. Are there any other tabs open? Empty
> profile?
STR:
1. Launch latest 44.0a2 with a clean profile.
2. Click on Hello icon.
3. Click on 'Get Started' button.
4. Click on 'Start a conversation' button as soon as Hello panel shows up.
5. In the newly opened conversation window, click on 'Share your screen' button and select 'Share other windows' option.
6. From the 'Window to share' dropdown select any listed window and click on 'Share Selected items' button.
7. Restart Firefox.
*maybe here try to restart via Developer toolbar.
Also reproduced with 2 different Windows 10 32-bit machines:
- bp-6a43ae34-3208-44cf-b85d-fdfdc2151119
- bp-529146d5-8b7d-4410-bfd0-7c3a62151119
Let me know if I can help more.
Flags: needinfo?(alexandra.lucinet)
| Assignee | ||
Comment 24•10 years ago
|
||
I think I was able to reproduce this with 44, once. The "restart" command may be a critical point here.
Also, while trying to log/trace this I stumbled into bug 1218799. So that's a difference between 44 and 45 that may affect the reproducibility of this bug.
Updated•10 months ago
|
tracking-firefox135:
--- → ?
Updated•10 months ago
|
tracking-firefox135:
? → ---
You need to log in
before you can comment on or make changes to this bug.
Description
•