Closed Bug 782845 Opened 7 years ago Closed Last year

[Desktop Build]gaia change causes b2g to abort

Categories

(Core :: Layout, defect)

x86_64
Linux
defect
Not set

RESOLVED WONTFIX
blocking-basecamp -

People

(Reporter: dhylands, Unassigned)

Details

(Whiteboard: b2g-desktop-builds)

Attachments

(1 file)

I noticed that B2G desktop running under Linux was hitting the following assertion/crash when running many of the out-of-process (OOP) apps.

[Parent 24410] ###!!! ABORT: retaining manager changed out from under us ... HELP!: '!shadowRoot || shadowRoot->Manager() == aManager', file /home/work/B2G-desktop/src/layout/ipc/RenderFrameParent.cpp, line 638

I bisected, and the build that first includes the patch from Bug 780340 is the oldest one that shows this problem.
I have to take that back. It seems that the problem is intermittent, and while I see it more often when the patch from Bug 780340 is applied, I also see it with the previous revision.
No longer blocks: 780340
Desktop-only, not reproducible on device.
blocking-basecamp: ? → -
The workaround mentioned in bug 782411 comment 6 might help here too.
That didn't seem to have any effect.
STR

Check out mozilla-central git revision f03ba688d68b8a04aed88d66769a59158cb33e38
(the exact gecko revision isn't essential, but it needs to be compatible with gaia)

Check out gaia revision 7c1ebc9f83b1e71c0806c8b5868b9100e7256c48

Build b2g-desktop (I tested under Linux)
Run b2g-desktop
Click on Calculator

I get the ABORT mentioned above

Check out the previous revision of gaia: 83854d0e466e4682f23778e857cd1e499e990898
Rebuild gaia, and b2g runs without the ABORT.
Summary: Bug 780340 causes a regression on B2G → gaia change causes b2g to abort
This is unfortunately keeping Dave from being able to test basically anything which involves OOP on desktop.  We obviously don't want to be in a place where we can only test B2G on the device.  So even though I agree it doesn't block release, we need to figure it out.
Dave, can you link the Gaia change?

Cjones, any other ideas?
No.  Is anyone still seeing this?  Not much more we can do here without STR that work for someone who knows our layers code.  I've never seen this abort.
I still see this every time I try to launch an OOP app in a desktop build.
More specifically, when I de-blacklist the Email app and run it from the homescreen, I see this abort. When I de-blacklist it and run it with a command-line toggle, I don't see the abort.
Dietrich: Here's the link to the gaia revision that introduced the bug:
https://github.com/mozilla-b2g/gaia/commit/7c1ebc9f83b1e71c0806c8b5868b9100e7256c48
I just rebuilt my desktop build using the latest gaia and gecko, and I still see the ABORT when launching OOP apps.

I did notice the following though:

If I launch b2g desktop and launch Music or Gallery from the bottom while on the search pane, then they launch without the ABORT. 

If I swipe to the middle pane and then try to launch from the middle pane (which has a bunch of apps on it), then Music and Gallery both crash with the ABORT.
If I swipe to the middle pane and then swipe back to the search pane and then launch Music or Gallery from the bottom, then I see the ABORT.

So it seems to be that swiping the Homescreen activates the problem.
#8  0x00007f291783b165 in mozilla::layout::RenderFrameParent::BuildLayer (this=0x7f28fb743f40, aBuilder=0x7fffec8c58c0, aFrame=0x7f28fa653770, aManager=0x7f290072c740, aVisibleRect=..., aItem=0x7f28fb29b4e0)
    at /run/media/jdm/ssd/b2g/mozilla-central/layout/ipc/RenderFrameParent.cpp:661
661	  NS_ABORT_IF_FALSE(!shadowRoot || shadowRoot->Manager() == aManager,
(gdb) p shadowRoot->Manager()
$1 = (mozilla::layers::LayerManagerOGL *) 0x7f291af4ecc0
(gdb) p aManager()
Invalid data type for function to be called.
(gdb) p aManager
$2 = (mozilla::layout::RenderFrameParent::LayerManager *) 0x7f290072c740

Joe says he's never even heard of RenderFrameParent::LayerManager.
Attached file Backtrace
Here's a backtrace.
Robert, Joe indicated that you might have some ideas here. The abort at http://hg.mozilla.org/mozilla-central/file/e4757379b99a/layout/ipc/RenderFrameParent.cpp#l680 is firing, and Joe thinks that it might not be making valid assumptions. The aManager being passed in is a BasicLayerManager, which is explicitly created in at least one possible path (in nsDisplayList::PaintForFrame).
I tried adding the IsTempLayerManager(aManager) condition to the abort, and I no longer see crashes there. Instead, the other similar abort in BuildLayer is triggering, and this time the mContainer->Manager() is a previously-freed object. That doesn't feel like an improvement to me.
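The relaxed check described in the comment above can be modeled roughly as follows. This is only a sketch: the struct definitions, the `isBasic`/`isRetained` fields, and the `OriginalCheck`/`RelaxedCheck`/`Demo` names are hypothetical stand-ins, not Gecko code, and it assumes the `IsTempLayerManager(aManager)` test was OR'ed into the existing condition, which the comment does not spell out.

```cpp
#include <cassert>

// Hypothetical stand-in for a layer manager; only what the condition needs.
struct LayerManager {
    bool isBasic;     // assumed: true for a basic (non-accelerated) manager
    bool isRetained;  // assumed: true if layers persist across paints
};

// Hypothetical stand-in for a layer that remembers its owning manager.
struct Layer {
    LayerManager* manager;
    LayerManager* Manager() const { return manager; }
};

// Assumed meaning of "temporary": a basic manager that is not retained.
bool IsTempLayerManager(const LayerManager* aManager) {
    return aManager->isBasic && !aManager->isRetained;
}

// Condition quoted in the abort: a cached shadow root must belong to aManager.
bool OriginalCheck(const Layer* shadowRoot, const LayerManager* aManager) {
    return !shadowRoot || shadowRoot->Manager() == aManager;
}

// Relaxed condition: skip the ownership check when painting with a temp manager.
bool RelaxedCheck(const Layer* shadowRoot, const LayerManager* aManager) {
    return IsTempLayerManager(aManager) || OriginalCheck(shadowRoot, aManager);
}

// Small usage example: a root retained by one manager, painted with a temp one.
void Demo() {
    LayerManager retained{false, true};
    LayerManager temp{true, false};
    Layer root{&retained};
    assert(!OriginalCheck(&root, &temp));  // the original abort would fire
    assert(RelaxedCheck(&root, &temp));    // the relaxed check lets it through
}
```

Relaxing the condition only silences this one check. It does nothing about a container layer whose manager has already been freed, which matches the second abort described above.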
It sounds like RenderFrameParent::BuildLayer is called multiple times for the same RenderFrameParent object, with different aManagers? Logging should confirm that. If so, that shouldn't happen, and call stacks for those calls, plus inspection of the two aManagers, may shed light on the problem.
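The logging suggested above could look something like the following sketch. `ManagerChangeLog` and `Record` are hypothetical names, not part of Gecko; the idea is simply to remember the last manager pointer seen for each frame object and print a line whenever it changes, so the two call stacks can then be captured at those points.

```cpp
#include <cassert>
#include <cstdio>
#include <unordered_map>

// Hypothetical tracker, not Gecko code: remembers which manager each frame
// object was last built with and reports when that manager changes.
class ManagerChangeLog {
public:
    // Returns true if aFrame was previously recorded with a different manager.
    bool Record(const void* aFrame, const void* aManager) {
        auto it = mLast.find(aFrame);
        if (it == mLast.end()) {
            // First time this frame object is seen: just remember the manager.
            mLast.emplace(aFrame, aManager);
            return false;
        }
        const bool changed = it->second != aManager;
        if (changed) {
            std::fprintf(stderr, "frame %p: manager changed %p -> %p\n",
                         aFrame, it->second, aManager);
            it->second = aManager;  // track the new manager from now on
        }
        return changed;
    }

private:
    // Maps a frame object's address to the last manager address seen for it.
    std::unordered_map<const void*, const void*> mLast;
};
```

In a real build the call would sit at the top of the function under investigation, passing `this` and `aManager`. Where to hook it in and how to dump the stack are left open here.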
So I tried again today and everything is working fine in my desktop build (so I can no longer reproduce the problem).
Summary: gaia change causes b2g to abort → [Desktop Build]gaia change causes b2g to abort
Whiteboard: b2g-desktop-builds
Mass closing as we are no longer working on b2g/firefox os.
Status: NEW → RESOLVED
Closed: Last year
Resolution: --- → WONTFIX