Closed
Bug 933082
(RefcntAllocator)
Opened 11 years ago
Closed 11 years ago
Make it possible to hold on to an ISurfaceAllocator and check if its IPC connection is still open
Categories
(Core :: Graphics, defect)
Core
Graphics
Tracking
()
RESOLVED
FIXED
mozilla28
Tracking | Status | |
---|---|---|
firefox28 | --- | fixed |
People
(Reporter: bjacob, Assigned: bjacob)
References
(Blocks 2 open bugs)
Details
(Whiteboard: [qa-])
Attachments
(5 files, 4 obsolete files)
4.27 KB,
patch
|
nical
:
review+
|
Details | Diff | Splinter Review |
3.30 KB,
patch
|
nical
:
review+
|
Details | Diff | Splinter Review |
2.31 KB,
patch
|
nical
:
review+
|
Details | Diff | Splinter Review |
34.75 KB,
patch
|
nical
:
review+
|
Details | Diff | Splinter Review |
939 bytes,
patch
|
Details | Diff | Splinter Review |
In bug 914823, we have a crash in code like this in SharedSurface_Gralloc (see bug 914823 comment 71) in B2G 1.2:
if (mAllocator) {
mAllocator->DestroySharedSurface(&desc); // crash here
}
Here, mAllocator is currently a WeakPtr<ISurfaceAllocator>.
The crash is a segfault at address 0x18, which happens to be the offset of ISurfaceAllocator::DestroySharedSurface in ISurfaceAllocator's vtable (it being the 7th virtual method by order of declaration, we have (7 - 1) * sizeof(void*) == 6*4 == 24 == 0x18).
So it seems that mAllocator is null at the crash line, even though this is inside a if (mAllocator) block.
Maybe we have a race condition here, which would be surprising as this is not supposed to be accessed from another thread, and the crash reports' raw dumps don't have any thread in places that might race with this. Or maybe we have a bug in WeakPtr...
Anyway. The reason why this was made a WeakPtr in the first place is that we used to have crashes (earlier in the same bug 914823) with the ISurfaceAllocator dying unexpectedly, and we couldn't hold a strong pointer to it because this would have closed a strong reference cycle. Indeed, the ISurfaceAllocator here is the base class of the ClientLayerManager, which has a chain of strong references to us (basically, ClientLayerManager->ClientCanvasLayer->GLContext->GLScreenBuffer->SurfaceStream->SharedSurface_Gralloc)
We can fix that by making ClientLayerManager not inherit from ISurfaceAllocator anymore, instead hold itself a strong reference to an ISurfaceAllocator. It then becomes possible for our SharedSurface_Gralloc to also hold a strong reference to that ISurfaceAllocator, without closing a reference cycle.
Once we have that, we can be confident that the ISurfaceAllocator doesn't go away under our feet, but we have a new problem to solve: it becomes possible for the IPC channel to be closed under our feet even though the ISurfaceAllocator is alive. The IPC channel here is a LayerTransactionChild. The way that we can become able to tell if it was closed, is to hold a strong reference to it, and have a |mIPCOpen| flag on it that is set to false when the IPDL Dealloc*() method is called on it. Currently, LayerTransactionChild is not refcounted, and Dealloc just destroys it.
So our first prerequisite is to make LayerTransactionChild refcounted and add that flag (patch 1)
Once we have that, we add a IPCOpen virtual method in ISurfaceAllocator whose default impl just returns true, but which is overridden in ShadowLayerForwarder to return the right answer from LayerTransactionChild. DestroySharedSurface uses this flag to avoid actually sending messages after the channel is gone. (patch 2)
Next, we must be careful: as LayerTransactionChild is now refcounted, we must have ShadowLayerForwarder hold a strong reference to it, otherwise it could go away under its feet. (patch 3).
Finally, we can remove ClientLayerManager's inheritance of ShadowLayerForwarder and instead have ClientLayerManager hold a strong reference to a ShadowLayerForwarder. This allows to make ISurfaceAllocator (the base class) refcounted, and in turn this allows SharedSurface_Gralloc to hold a strong reference to ISurfaceAllocator (patch 4)
Assignee | ||
Comment 1•11 years ago
|
||
Assignee | ||
Comment 2•11 years ago
|
||
Assignee | ||
Comment 3•11 years ago
|
||
Assignee | ||
Comment 4•11 years ago
|
||
Comment 5•11 years ago
|
||
Is this bug required to fix the blocking bug? If so, then this should be set to koi?
Comment 6•11 years ago
|
||
If we're crashing at 0x18, then it means *mAllocator is null, not mAllocator.
Comment 7•11 years ago
|
||
And if mAllocator is just a raw pointer (https://bugzilla.mozilla.org/show_bug.cgi?id=914823#c86), then it makes sense that this could be a valid pointer to deleted memory.
It's surprising to me that the value at that address is 0x0 though, rather than garbage or the allocators poison value.
Assignee | ||
Comment 8•11 years ago
|
||
(In reply to Jason Smith [:jsmith] from comment #5)
> Is this bug required to fix the blocking bug? If so, then this should be set
> to koi?
We're actually still unsure, I just marked it as blocking preemptively... but the below means I was wrong!
(In reply to Matt Woodrow (:mattwoodrow) from comment #6)
> If we're crashing at 0x18, then it means *mAllocator is null, not mAllocator.
Dang! Sorry for being so stupid! Indeed it means that mAllocator->vptr is null, which means that it was already destroyed and overwritten by something!
Unblocking 914823.
Assignee | ||
Updated•11 years ago
|
Blocks: BadSurfaceDescriptor
Assignee | ||
Updated•11 years ago
|
Alias: RefcntAllocator
Assignee | ||
Updated•11 years ago
|
Blocks: KillSharedSurface
Assignee | ||
Comment 9•11 years ago
|
||
Getting back to this as it blocks bug 941389 and bug 941390.
Assignee | ||
Comment 10•11 years ago
|
||
Attachment #825037 -
Attachment is obsolete: true
Attachment #8338008 -
Flags: review?(nical.bugzilla)
Assignee | ||
Comment 11•11 years ago
|
||
Attachment #825039 -
Attachment is obsolete: true
Attachment #8338009 -
Flags: review?(nical.bugzilla)
Assignee | ||
Comment 12•11 years ago
|
||
Attachment #825040 -
Attachment is obsolete: true
Attachment #8338010 -
Flags: review?(nical.bugzilla)
Assignee | ||
Comment 13•11 years ago
|
||
Ready for review!
Attachment #825042 -
Attachment is obsolete: true
Attachment #8338011 -
Flags: review?(nical.bugzilla)
Assignee | ||
Comment 14•11 years ago
|
||
Updated•11 years ago
|
Attachment #8338008 -
Flags: review?(nical.bugzilla) → review+
Updated•11 years ago
|
Attachment #8338009 -
Flags: review?(nical.bugzilla) → review+
Updated•11 years ago
|
Attachment #8338010 -
Flags: review?(nical.bugzilla) → review+
Comment 15•11 years ago
|
||
Comment on attachment 8338011 [details] [diff] [review]
Part 4: let ClientLayerManager hold a ShadowLayerForwarder instead of inheriting from ShadowLayerForwarder. Make ISurfaceAllocator refcounted. Make SharedSurface_Gralloc keep its mAllocator alive.
Review of attachment 8338011 [details] [diff] [review]:
-----------------------------------------------------------------
Nice!
Attachment #8338011 -
Flags: review?(nical.bugzilla) → review+
Assignee | ||
Comment 16•11 years ago
|
||
Comment 17•11 years ago
|
||
Assignee | ||
Comment 18•11 years ago
|
||
Many thanks for the mac fix!
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/9cc4a77561e8
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/a249bd79e71a
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/1509f8be5df3
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/7caa8e80b06a
Assignee: nobody → bjacob
Target Milestone: --- → mozilla28
Comment 19•11 years ago
|
||
Backed out for mass mochitest asserts.
https://hg.mozilla.org/integration/mozilla-inbound/rev/641a86dd8f8d
https://tbpl.mozilla.org/php/getParsedLog.php?id=31108505&tree=Mozilla-Inbound
Target Milestone: mozilla28 → ---
Assignee | ||
Comment 20•11 years ago
|
||
Had forgotten to update a few {A,Dea}llocPLayerTransaction{Parent,Child}. I don't suppose that another cycle of reviews is needed for that.
https://tbpl.mozilla.org/?tree=Try&rev=ed22b705bb30
Assignee | ||
Comment 21•11 years ago
|
||
Had forgotten to initialize LayerTransactionParent::mIPCOpen to false >_<
https://tbpl.mozilla.org/?tree=Try&rev=fb4d542ac5d8
Assignee | ||
Comment 22•11 years ago
|
||
Had forgotten to remove the NS_INLINE_DECL_THREADSAFE_REFCOUNTING(ImageBridgeParent)
https://tbpl.mozilla.org/?tree=Try&rev=3c48adddf0f6
Assignee | ||
Comment 23•11 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/1cf4d557bf94
https://hg.mozilla.org/integration/mozilla-inbound/rev/849931c5c32e
https://hg.mozilla.org/integration/mozilla-inbound/rev/958fa3ca2b59
https://hg.mozilla.org/integration/mozilla-inbound/rev/b5efbd83b16b
Target Milestone: --- → mozilla28
Comment 24•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/1cf4d557bf94
https://hg.mozilla.org/mozilla-central/rev/849931c5c32e
https://hg.mozilla.org/mozilla-central/rev/958fa3ca2b59
https://hg.mozilla.org/mozilla-central/rev/b5efbd83b16b
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 25•11 years ago
|
||
status-firefox28:
--- → fixed
Whiteboard: [qa-]
You need to log in
before you can comment on or make changes to this bug.
Description
•