Closed
Bug 1261321
Opened 9 years ago
Closed 9 years ago
crash in mozilla::layers::CompositableClient::IsConnected
Categories
(Core :: Graphics: Layers, defect)
Tracking
()
People
(Reporter: philipp, Assigned: nical)
References
Details
(Keywords: crash, regression)
Crash Data
Attachments
(1 file, 1 obsolete file)
1.31 KB,
patch
|
jnicol
:
review+
lizzard
:
approval-mozilla-aurora+
lizzard
:
approval-mozilla-beta+
lizzard
:
approval-mozilla-release+
Sylvestre
:
approval-mozilla-esr45+
|
Details | Diff | Splinter Review |
[Tracking Requested - why for this release]:
newly introduced crash in 45
This bug was filed from the Socorro interface and is
report bp-61743e6e-3cd2-4e8e-b8ac-a9adc2160320.
=============================================================
Crashing Thread (20)
Frame Module Signature Source
0 xul.dll mozilla::layers::CompositableClient::IsConnected() gfx/layers/client/CompositableClient.cpp
1 xul.dll mozilla::layers::CompositableClient::Destroy() gfx/layers/client/CompositableClient.cpp
2 xul.dll mozilla::layers::ImageBridgeShutdownStep1 gfx/layers/ipc/ImageBridgeChild.cpp
3 xul.dll RunnableFunction<void (*)(mozilla::layers::ImageClient*, mozilla::layers::PImageContainerChild*), mozilla::Tuple<mozilla::layers::ImageClient*, mozilla::layers::PImageContainerChild*> >::Run() ipc/chromium/src/base/task.h
4 xul.dll MessageLoop::DoWork() ipc/chromium/src/base/message_loop.cc
5 xul.dll base::MessagePumpDefault::Run(base::MessagePump::Delegate*) ipc/chromium/src/base/message_pump_default.cc
6 xul.dll MessageLoop::RunHandler() ipc/chromium/src/base/message_loop.cc
7 xul.dll MessageLoop::Run() ipc/chromium/src/base/message_loop.cc
8 xul.dll base::Thread::ThreadMain() ipc/chromium/src/base/thread.cc
9 xul.dll mozilla::LazyIdleThread::LazyIdleThread(unsigned int, nsACString_internal const&, mozilla::LazyIdleThread::ShutdownMethod, nsIObserver*) xpcom/threads/LazyIdleThread.cpp
10 @0x164ef86f
11 ntdll.dll __RtlUserThreadStart
12 ntdll.dll _RtlUserThreadStart
this is a new crash signature in firefox 45 builds and upwards which has
apparently been introduced by the work in bug 121976.
it is mid- to low-level in volume - on 45.0.1 it is #83 & making up 0.15% of
all crashes.
Assignee | ||
Comment 1•9 years ago
|
||
Shutdown bug. Apparently, ImageClient::FromIPDLActor returned nullptr in ImageBridgeShutdownStep1, which is somewhat unexpected. A simple null-check will be enough to fix the crash, although I'd like to understand how we got in this state.
Assignee: nobody → nical.bugzilla
Let's get a simple patch "just stop the crash" ready and into review while we're figuring out the cause?
This is blocking a wrong bug - missing a digit. Do we know the actual bug number?
Flags: needinfo?(madperson)
Reporter | ||
Comment 4•9 years ago
|
||
sorry, that was bug 1219761 - got the wrong number from https://hg.mozilla.org/mozilla-central/rev/932318a4e150
Assignee | ||
Comment 5•9 years ago
|
||
(In reply to Milan Sreckovic [:milan] from comment #2)
> Let's get a simple patch "just stop the crash" ready and into review while
> we're figuring out the cause?
I am working on a fix that should fix this in a reliable way. If we just null-check we'll run into a similar problem as bug 1241875. It's easier to fix this race condition for CompoaitableClient than TextureClient so I should have a patch ready tomorrow.
Assignee | ||
Comment 6•9 years ago
|
||
The fix is getting a bit more involved than I'd like to but there are several issues at play and fixing one mostly makes us fall on the others.
The root of the problem, from a high level perspective, is that we have written the gfx ipdl actors with the idea that each actors is fully owned by one object (for example CompositableClient fully owns CompositableChild) and that object's destruction drives the actor's destruction. This is mostly true, except when IPDL shuts down and forces the destruction of the actor.
Orchestrating this all becomes a bit complicated when threads get involved.
CompositableClient and TextureClient have the same problems (Well, TextureClient's case is more complex but I've tried to leave that out for a followup patch as much as possible), so it makes sense to fix this at the ChildActor<T> level (TextureChild and COmpositableChild both inherit from it).
This patch does a bunch of things:
- rename DestroyIPDLActor -> DeallocIPDLActor to avoid confusion with the Destroy() methods which involve doing the destruction handshake and sending messages, while DeallocIPDLActor is the hook that gets called when IPDL wants to delete the actor (at this point we can't send messages).
- expose "ReleaseActor" rather than "Destroy" to make it clear that after this is called the TextureClient/CompositableClient should NOT keep their pointer to the actor (which may just have been deleted).
- track what happens and in which order for the following things:
- ***Client is destroyed and calls ReleaseActor
- IPDL shuts down (normal or abnormal shutdown)
- the ***Child actor's destruction handshake is complete
And make sure that the actor is deleted at the right time, ideally when the destruction handshake finishes, if and only if the ***Client has already released its reference to the the ***Child actor (this is where most of the complexity is).
When all the pieces come together other crashes like bug 1241875 should get fixed as well.
Assignee | ||
Comment 7•9 years ago
|
||
Getting the more-correct-but-complicated approach to work is proving harder than I expected, so in the mean-time, let's just null-check to avoid some of the crashes. I'll pursue the other approach in another bug.
Attachment #8738185 -
Attachment is obsolete: true
Attachment #8738626 -
Flags: review?(jnicol)
Comment 8•9 years ago
|
||
Comment on attachment 8738626 [details] [diff] [review]
Null check
Review of attachment 8738626 [details] [diff] [review]:
-----------------------------------------------------------------
Do we want a warning when compositable is null?
Attachment #8738626 -
Flags: review?(jnicol) → review+
Assignee | ||
Comment 9•9 years ago
|
||
(In reply to Jamie Nicol [:jnicol] from comment #8)
> Do we want a warning when compositable is null?
This is going to change soon-ish, and there will be better places to report what puts us in this state. So for now let's just paper over the null dereference.
Comment 10•9 years ago
|
||
Reporter | ||
Comment 11•9 years ago
|
||
[Tracking Requested - why for this release]:
newly introduced crash in 45
tracking-firefox45:
? → ---
tracking-firefox-esr45:
--- → ?
Comment 12•9 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Updated•9 years ago
|
status-firefox-esr45:
--- → affected
Comment 13•9 years ago
|
||
Nicolas, could you fill the uplift request for this crash? The volume is quite important. Thanks
Flags: needinfo?(nical.bugzilla)
Assignee | ||
Comment 14•9 years ago
|
||
Comment on attachment 8738626 [details] [diff] [review]
Null check
Approval Request Comment
[Feature/regressing bug #]:
[User impact if declined]: crashes
[Describe test coverage new/current, TreeHerder]: none
[Risks and why]: just the simples null-check. Absolutely no risk
[String/UUID change made/needed]: none
Flags: needinfo?(nical.bugzilla)
Attachment #8738626 -
Flags: approval-mozilla-beta?
Attachment #8738626 -
Flags: approval-mozilla-aurora?
Comment 15•9 years ago
|
||
Comment on attachment 8738626 [details] [diff] [review]
Null check
[Triage Comment]
Crash fix, recent regression, let's take this fix for 46.
Please uplift to aurora, beta, m-r.
Attachment #8738626 -
Flags: approval-mozilla-release+
Attachment #8738626 -
Flags: approval-mozilla-beta?
Attachment #8738626 -
Flags: approval-mozilla-beta+
Attachment #8738626 -
Flags: approval-mozilla-aurora?
Attachment #8738626 -
Flags: approval-mozilla-aurora+
Comment 16•9 years ago
|
||
This would land in an RC2 for 46, later this week. Too late for rc1
Updated•9 years ago
|
Comment 17•9 years ago
|
||
bugherder uplift |
Comment 18•9 years ago
|
||
Updated•9 years ago
|
Attachment #8738626 -
Flags: approval-mozilla-esr45+
You need to log in
before you can comment on or make changes to this bug.
Description
•