Closed
Bug 1311644
Opened 9 years ago
Closed 9 years ago
Startup crash in mozilla::ipc::MessageChannel::AssertWorkerThread | mozilla::layers::PTextureChild::SendDestroy
Categories
(Core :: Graphics: Layers, defect, P3)
Tracking
()
RESOLVED
FIXED
mozilla52
People
(Reporter: philipp, Assigned: pchang)
Details
(Keywords: crash, regression, Whiteboard: [gfx-noted])
Crash Data
Attachments
(3 files)
|
58 bytes,
text/x-review-board-request
|
nical
:
review+
jcristau
:
approval-mozilla-beta+
ritu
:
approval-mozilla-release-
|
Details |
|
58 bytes,
text/x-review-board-request
|
nical
:
review+
|
Details |
|
1.44 KB,
patch
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-dc68b6ea-411c-4728-a591-0b93e2161020.
=============================================================
Crashing Thread (37)
Frame Module Signature Source
0 xul.dll mozilla::ipc::MessageChannel::AssertWorkerThread() obj-firefox/dist/include/mozilla/ipc/MessageChannel.h:451
1 @0x39
2 xul.dll mozilla::layers::PTextureChild::SendDestroy() obj-firefox/ipc/ipdl/PTextureChild.cpp:58
3 xul.dll mozilla::layers::PTextureChild::SendDestroy() obj-firefox/ipc/ipdl/PTextureChild.cpp:63
4 xul.dll mozilla::layers::DeallocateTextureClient(mozilla::layers::TextureDeallocParams) gfx/layers/client/TextureClient.cpp:357
5 xul.dll RunnableFunction<void (*)(mozilla::layers::TextureDeallocParams), mozilla::Tuple<mozilla::layers::TextureDeallocParams> >::Run() ipc/chromium/src/base/task.h:336
6 xul.dll MessageLoop::RunTask(already_AddRefed<mozilla::Runnable>) ipc/chromium/src/base/message_loop.cc:346
7 xul.dll MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask&&) ipc/chromium/src/base/message_loop.cc:354
8 xul.dll MessageLoop::DoWork() ipc/chromium/src/base/message_loop.cc:429
9 xul.dll base::MessagePumpDefault::Run(base::MessagePump::Delegate*) ipc/chromium/src/base/message_pump_default.cc:36
10 xul.dll MessageLoop::RunHandler() ipc/chromium/src/base/message_loop.cc:225
11 xul.dll MessageLoop::Run() ipc/chromium/src/base/message_loop.cc:205
12 xul.dll base::Thread::ThreadMain() ipc/chromium/src/base/thread.cc:180
13 xul.dll `anonymous namespace'::ThreadFunc ipc/chromium/src/base/platform_thread_win.cc:28
14 kernel32.dll BaseThreadInitThunk
15 ntdll.dll __RtlUserThreadStart
16 ntdll.dll _RtlUserThreadStart
this crash is regressing/spiking in firefox 50 beta builds - reports are annotated with "MOZ_RELEASE_ASSERT(mWorkerLoopID == MessageLoop::current()->id()) (not on worker thread!)" which was added by bug 1146471.
so far the crash is happening on win7-10 and 94% on startup - on 50.0b7 it's about 0.1% of all browser crashes.
correlations from 50.0b:
(100.0% in signature vs 00.10% overall) moz_crash_reason = MOZ_RELEASE_ASSERT(mWorkerLoopID == MessageLoop::current()->id()) (not on worker thread!)
(96.92% in signature vs 21.50% overall) "DXVA2D3D9+" in app_notes = true
(96.92% in signature vs 26.75% overall) "DXVA2D3D9?" in app_notes = true
(100.0% in signature vs 34.94% overall) reason = EXCEPTION_BREAKPOINT
(40.00% in signature vs 03.87% overall) cpu_info = GenuineIntel family 6 model 60 stepping 3 | 4
(26.15% in signature vs 02.00% overall) adapter_device_id = 0x041e
there are 2 single reports generated from 49.0.1 installations as well though:
https://crash-stats.mozilla.com/report/index/c09cd2ea-a14e-49dd-80a4-41b8e2161006
https://crash-stats.mozilla.com/report/index/cdbda033-9d5d-4e27-9464-827222161004
Updated•9 years ago
|
Comment 2•9 years ago
|
||
Philipp, do the 2 reports from 49.0.1 indicate this is a carry-over regression in 50 and thus Version should be set to 49?
Flags: needinfo?(madperson)
| Reporter | ||
Comment 3•9 years ago
|
||
i'm not sure at all.
in general when i see a distribution of crashes among firefox versions like this i'd suspect there is some new feature pref'd off by default (that a couple of release users have switched on manually), and that pref change is regularly riding the trains in pre-release builds...
Flags: needinfo?(madperson)
| Reporter | ||
Comment 4•9 years ago
|
||
a strange thing timing-wise would be that the first recorded instance of this signature was on 2016-09-29, whereas the first builds of 50.0b were already pushed to users a week earlier than that. so maybe some external factor is playing into this as well?
Updated•9 years ago
|
Flags: needinfo?(nical.bugzilla)
Priority: -- → P3
Whiteboard: [gfx-noted]
| Assignee | ||
Comment 5•9 years ago
|
||
I just discussed this with nical last week. I will work on the patch for this.
Assignee: nobody → howareyou322
Flags: needinfo?(nical.bugzilla)
| Assignee | ||
Comment 6•9 years ago
|
||
I think this bug was caused by de-allocating a texture client with null mAllocator in [1]. Therefore, we can't get correct ipdl thread to dispatch message.
After checking more, I found only SurfaceFactory_Basic doesn't have mAllocator[2].
If my assumption is true, it is possible to hit the condition in [3].
nical, I think we just add early return if allocator is NULL. How do you think?
[1]http://searchfox.org/mozilla-central/source/gfx/layers/client/TextureClient.cpp#354
[2]http://searchfox.org/mozilla-central/source/gfx/gl/SharedSurfaceGL.cpp#103
[3]http://searchfox.org/mozilla-central/source/gfx/layers/client/TextureClient.cpp#391
Flags: needinfo?(nical.bugzilla)
Comment 7•9 years ago
|
||
SurfaceFactory should only be used with WebGL, which is main-thread only right now. This crash is happening off the main thread so I suspect it is something else using the ImageBridge (probably video-related). The problem with early-returning if the allocator is null is that if the IPDL actor exists it is leaked, so I would rather we find out what exactly is causing the allocator to be null and make sure it does not happen.
Flags: needinfo?(nical.bugzilla)
It might be too late for 50, if we have a fix that seems safe we can consider uplifting for 50.1.0
| Assignee | ||
Comment 11•9 years ago
|
||
Here are two places that we might get null allocator, TextureClient::CreateForYCbCr and TextureClient::CreateForYCbCrWithBufferSize.
And there was a comment mentioned that " // The only reason we allow aAllocator to be null is for gtests".
| Comment hidden (mozreview-request) |
| Comment hidden (mozreview-request) |
| Assignee | ||
Comment 14•9 years ago
|
||
Just uploaded a patch to pass valid allocator in GTest, and then we could force caller to pass valid allocator when calling CreateForYCbCr and CreateForYCbCrWithBufferSize.
Flags: needinfo?(howareyou322)
| Assignee | ||
Comment 15•9 years ago
|
||
Just fixed compile error on windows.
The following is the latest try result.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=98d021ba76795ce7dc8b512bf326679aba964ce1
| Comment hidden (mozreview-request) |
Comment 17•9 years ago
|
||
| mozreview-review | ||
Comment on attachment 8808505 [details]
Bug 1311644 - Create YCbCr texture client with valid allocator,
https://reviewboard.mozilla.org/r/91344/#review91668
If the gtest that was expecting a null allocator causes issues, it's fine to remove it. It's not a great test, it hasn't ever found a bug that I am aware of since I added it and it doesn't test a real-world situation very well.
Attachment #8808505 -
Flags: review?(nical.bugzilla) → review+
Comment 18•9 years ago
|
||
| mozreview-review | ||
Comment on attachment 8808506 [details]
Bug 1311644 - Pass ImageBridge as allocator to create YCbCr texture client,
https://reviewboard.mozilla.org/r/91346/#review91672
Sure, if that works out. If you want to remove the test it's fine as well as I mentioned in the other review.
Attachment #8808506 -
Flags: review?(nical.bugzilla) → review+
| Assignee | ||
Comment 19•9 years ago
|
||
(In reply to Peter Chang[:pchang] from comment #15)
> Just fixed compile error on windows.
>
> The following is the latest try result.
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=98d021ba76795ce7dc8b512bf326679aba964ce1
I tried to revert my changes and submit another try. And I still got same failures. Therefore, these failures were not related to my changes.
| Assignee | ||
Comment 20•9 years ago
|
||
(In reply to Nicolas Silva [:nical] from comment #18)
> Comment on attachment 8808506 [details]
> Bug 1311644 - Pass ImageBridge as allocator to create YCbCr texture client,
>
> https://reviewboard.mozilla.org/r/91346/#review91672
>
> Sure, if that works out. If you want to remove the test it's fine as well as
> I mentioned in the other review.
I think we could still keep it because of the testing coverage.
Comment 21•9 years ago
|
||
Pushed by pchang@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/405843a8c845
Create YCbCr texture client with valid allocator, r=nical
https://hg.mozilla.org/integration/autoland/rev/ea948a05bfa4
Pass ImageBridge as allocator to create YCbCr texture client, r=nical
Comment 22•9 years ago
|
||
| bugherder | ||
https://hg.mozilla.org/mozilla-central/rev/405843a8c845
https://hg.mozilla.org/mozilla-central/rev/ea948a05bfa4
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
| Assignee | ||
Comment 23•9 years ago
|
||
Comment on attachment 8808505 [details]
Bug 1311644 - Create YCbCr texture client with valid allocator,
Approval Request Comment
[Feature/regressing bug #]:none
[User impact if declined]:has chance to call TextureDestroy with null allocator and hit the wrong workerthread assertion again.
[Describe test coverage new/current, TreeHerder]:treeherder
[Risks and why]: low risk because the changes add more error checking to prevent abnormal condition.
[String/UUID change made/needed]:none
Attachment #8808505 -
Flags: approval-mozilla-beta?
Attachment #8808505 -
Flags: approval-mozilla-aurora?
Comment 24•9 years ago
|
||
Comment on attachment 8808505 [details]
Bug 1311644 - Create YCbCr texture client with valid allocator,
Fix a crash. Aurora51+.
Attachment #8808505 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 25•9 years ago
|
||
has conflicts
grafting 374110:405843a8c845 "Bug 1311644 - Create YCbCr texture client with valid allocator, r=nical"
merging gfx/layers/client/TextureClient.cpp
warning: conflicts while merging gfx/layers/client/TextureClient.cpp! (edit, then use 'hg resolve --mark')
abort: unresolved conflicts, can't continue
(use 'hg resolve' and 'hg graft --continue')
Flags: needinfo?(howareyou322)
| Assignee | ||
Comment 27•9 years ago
|
||
Tomcat, please upload attachment 8810316 [details] [diff] [review] for aurora/beta.
Flags: needinfo?(cbook)
Comment 28•9 years ago
|
||
Comment on attachment 8808505 [details]
Bug 1311644 - Create YCbCr texture client with valid allocator,
Moving uplift request to beta/release since we're now after merge day
Attachment #8808505 -
Flags: approval-mozilla-release?
Attachment #8808505 -
Flags: approval-mozilla-beta?
Attachment #8808505 -
Flags: approval-mozilla-beta+
Attachment #8808505 -
Flags: approval-mozilla-aurora+
This seems like a low volume fix on 50, ~30 instances and the same on 50 RC. Wontfix for 50.
Comment on attachment 8808505 [details]
Bug 1311644 - Create YCbCr texture client with valid allocator,
I don't see the urgency of taking this fix since it is a pretty low volume crash signature. Let this one ride the 51 train.
Attachment #8808505 -
Flags: approval-mozilla-release? → approval-mozilla-release-
| Assignee | ||
Updated•9 years ago
|
Flags: needinfo?(cbook)
Comment 31•9 years ago
|
||
| bugherder uplift | ||
https://hg.mozilla.org/releases/mozilla-beta/rev/bb9d3c4faa85
https://hg.mozilla.org/releases/mozilla-beta/rev/391efa53e7f8
Flags: in-testsuite+
You need to log in
before you can comment on or make changes to this bug.
Description
•