Closed
Bug 861919
Opened 12 years ago
Closed 4 years ago
Lots of "ASSERTION: Should be in construction phase: 'InConstruction()'" at BasicLayerManager.cpp:533
Categories
(Core :: Graphics: Layers, defect)
Tracking
()
RESOLVED
WONTFIX
People
(Reporter: kats, Unassigned)
Details
Attachments
(6 files, 8 obsolete files)
I'm running a local debug build of Fennec I'm getting a lot of assertion failure warnings in logcat. Pretty sure this wasn't happening last week, so I suspect it is a result of the layers refactoring. That's just a guess though.
I/Gecko ( 4415): ###!!! ASSERTION: Should be in construction phase: 'InConstruction()', file /Users/kats/zspace/mozilla-git/gfx/layers/basic/BasicLayerManager.cpp, line 533
Line 533 in my version of BasicaLayerManager.cpp is the assertion in BasicLayerManager::EndTransactionInternal.
Reporter | ||
Updated•12 years ago
|
OS: Mac OS X → Android
Hardware: x86 → All
Comment 1•12 years ago
|
||
I had a go at reproducing this, but couldn't (although I do remember seeing these assertions at some point in the refactoring so I would blame that). It is possible it has been fixed by one of the patches which landed recently.
Kats: were there any particular URLs that triggered this? If you could get a stack trace that would be awesome.
Flags: needinfo?(bugmail.mozilla)
Reporter | ||
Comment 2•12 years ago
|
||
I'm synced to m-c cset 7712e67a7c1e from last Friday and can reproduce this by loading mozilla.org and scrolling down. I grabbed a backtrace, it looks like mPhase is PHASE_NONE rather than PHASE_CONSTRUCTION.
Flags: needinfo?(bugmail.mozilla)
Comment 3•12 years ago
|
||
I wonder if bug 862236 is related.
Comment 4•12 years ago
|
||
(In reply to Milan Sreckovic [:milan] from comment #3)
> I wonder if bug 862236 is related.
I don't think so, but I'm going to look at that bug next...
Comment 5•12 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #2)
> Created attachment 737989 [details]
> Backtrace at assertion failure
>
> I'm synced to m-c cset 7712e67a7c1e from last Friday and can reproduce this
> by loading mozilla.org and scrolling down. I grabbed a backtrace, it looks
> like mPhase is PHASE_NONE rather than PHASE_CONSTRUCTION.
Thanks kats! I had another go at reproducing and couldn't do so. I also went through the code and I don't see any obvious way we could end with PHASE_NONE here. I'll have a go with an older version, to check if this has been fixed.
Comment 6•12 years ago
|
||
Nope, couldn't repro with that build either. I'm officially giving up for now. If anyone comes up with different STR then I'll take another look.
Reporter | ||
Comment 7•12 years ago
|
||
I can still repro this after syncing up to m-c yesterday (I built with a few local changes but they were unrelated). I don't know if the device matters but I'm running on a Nexus 4. If you have a patch with debugging output or something you'd like me to try let me know.
Comment 8•12 years ago
|
||
Comment 9•12 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #7)
> I can still repro this after syncing up to m-c yesterday (I built with a few
> local changes but they were unrelated). I don't know if the device matters
> but I'm running on a Nexus 4. If you have a patch with debugging output or
> something you'd like me to try let me know.
Trying on a Nexus 7, I hope this is not hw dependent, it really shouldn't be. And we must be running very similar versions of Android.
Anyway, if you get a moment to try running with the logging patch I just posted, that should give me some clue as to what is going on. It will probably take a few iterations to find out properly though.
Reporter | ||
Comment 10•12 years ago
|
||
This is a full logcat dump that includes:
1) starting fennec and waiting for gecko init
2) tapping the mozilla.org topsite link on about:home (circa the 08:30:26 timestamp)
3) waiting for the page to load, and then doing a couple of pan/zoom operations on it until I saw the assertion go by in the log
4) Killing fennec
The first assertion shows up in the log at 08:31:08.281 and there's a few more after that.
Also I see that your log output is basically flooding the log - beware that the android logging goes to a circular buffer and so log output can get overwritten if it's not read out fast enough. I've run into this problem with missing logs before, and had to log to a file (in the code) to get around it. I don't know if your log volume is high enough to hit this problem but if you see things that don't make sense then this might be why.
Comment 11•12 years ago
|
||
Attachment #738787 -
Attachment is obsolete: true
Comment 12•12 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #10)
> Created attachment 738910 [details]
> logging output
>
> This is a full logcat dump that includes:
> 1) starting fennec and waiting for gecko init
> 2) tapping the mozilla.org topsite link on about:home (circa the 08:30:26
> timestamp)
> 3) waiting for the page to load, and then doing a couple of pan/zoom
> operations on it until I saw the assertion go by in the log
> 4) Killing fennec
>
> The first assertion shows up in the log at 08:31:08.281 and there's a few
> more after that.
>
> Also I see that your log output is basically flooding the log - beware that
> the android logging goes to a circular buffer and so log output can get
> overwritten if it's not read out fast enough. I've run into this problem
> with missing logs before, and had to log to a file (in the code) to get
> around it. I don't know if your log volume is high enough to hit this
> problem but if you see things that don't make sense then this might be why.
Thanks for the logs! They are interesting, something weird is going on. So I put some more detail into the logging code, if you could give it a run when you get a chance that would be great, thanks!
Reporter | ||
Comment 13•12 years ago
|
||
Attached. It took much longer this time to hit the assertion; not sure why. Also I started hitting another assertion in GLContext.cpp.
Reporter | ||
Updated•12 years ago
|
Attachment #738910 -
Attachment is obsolete: true
Comment 15•12 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #14)
> Not waiting on me here, right?
no, still trying to make sense of the logs - there are more layer managers than I expected and one of them seems to go rogue, I can't figure out why. Will probably request some more logging at some point. Thanks!
Flags: needinfo?(ncameron)
Comment 16•12 years ago
|
||
So it looks like we are calling EndTransaction without BeginTransaction, which would cause the assertion, but it's not obvious why. I narrowed it down to four candidates, but need more info to narrow it down any further, so another round of logging. Stat holiday tomorrow, so no rush. Thanks!
Attachment #739271 -
Attachment is obsolete: true
Flags: needinfo?(bugmail.mozilla)
Reporter | ||
Comment 17•12 years ago
|
||
Logging from round 3. As with round 2 it took me quite a while to generate the assertion, not sure why that is.
Attachment #739622 -
Attachment is obsolete: true
Flags: needinfo?(bugmail.mozilla)
Reporter | ||
Comment 18•12 years ago
|
||
Here's another run using round 3 with a shorter log and more assertions. Turns out the trick to get the assertions is scrolling down closer to the bottom of mozilla.org (I did one fling down to halfway down the page, wait half a second, and then did another fling down)
Comment 19•12 years ago
|
||
Great, thanks Kats!
So, the logs show that we are calling EndTransaction from here:
http://mxr.mozilla.org/mozilla-central/source/layout/base/FrameLayerBuilder.cpp#1959
Without calling BeginTransaction. I am 99% sure that this is not due to the layers refactoring as we didn't really touch FrameLayerBuilder and certainly not any calls to *Transaction.
I'll see if I can repro myself with the more detailed STR.
Comment 20•12 years ago
|
||
My guess here is that this is happening during a repeat transaction.
For inactive layers, we call BeginTransaction during layer constructor for the parent layer manager, and then we should hit the above EndTransaction call when painting the ThebesLayer.
If we have a repeat transaction (where we do only some of the ThebesLayer painting, send it to the compositor, then do the remainder), then we will paint the same ThebesLayer multiple times without extra layer building steps.
If the inactive layer content lies across the boundary, such that it gets drawn both times, then this assertion seems expected.
Shouldn't be hard to confirm, add prints around layer building, and when we hit a repeat transaction in BasicShadowableLayerManager::EndTransaction.
Assuming that is the case, then the assertion shouldn't cause any problems and we can just work around it.
Comment 21•12 years ago
|
||
OK, this tests Matt's repeat transaction hypothesis (which seems totally valid to me). If we are right, then this will be the last round of logging.
Assignee: nobody → ncameron
Attachment #741218 -
Attachment is obsolete: true
Flags: needinfo?(bugmail.mozilla)
Comment 22•12 years ago
|
||
If kats's logging confirms this is due to repeat transactions, then I don't think there is a nice fix, because when we trigger the repeat transaction, we have no way to get to the temporary layer manager which paints the inactive layer. The best we can do is not throw the toys out of the pram if we are in PHASE_NONE, which is a bit sad, but not too much.
Attachment #742893 -
Flags: review?(matt.woodrow)
Comment 23•12 years ago
|
||
Comment on attachment 742893 [details] [diff] [review]
sad fix
Review of attachment 742893 [details] [diff] [review]:
-----------------------------------------------------------------
How about checking InTransaction before we call EndTransaction, and call BeginTransaction() (or a new more specific function) if we're not in one.
That way we only stop the assertion for this specific case, and it can continue to be useful in others.
Comment 24•12 years ago
|
||
(In reply to Matt Woodrow (:mattwoodrow) from comment #23)
> Comment on attachment 742893 [details] [diff] [review]
> sad fix
>
> Review of attachment 742893 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> How about checking InTransaction before we call EndTransaction, and call
> BeginTransaction() (or a new more specific function) if we're not in one.
>
> That way we only stop the assertion for this specific case, and it can
> continue to be useful in others.
Can we guarantee that we won't have a non-default render target? Because BeginTransaction will reset that.
And I guess we can actully check when we do this if we are in fact in a repeat transaction in the non-temp layer manager
Comment 25•12 years ago
|
||
> Can we guarantee that we won't have a non-default render target? Because
> BeginTransaction will reset that.
Yes, the transaction is always started here (http://mxr.mozilla.org/mozilla-central/source/layout/base/FrameLayerBuilder.cpp#2383) for the temporary managers.
Comment 26•12 years ago
|
||
kats: does this make the assertions go away?
Attachment #742893 -
Attachment is obsolete: true
Attachment #742893 -
Flags: review?(matt.woodrow)
Attachment #742923 -
Flags: review?(matt.woodrow)
Attachment #742923 -
Flags: feedback?(bugmail.mozilla)
Updated•12 years ago
|
Attachment #742923 -
Flags: review?(matt.woodrow) → review+
Reporter | ||
Comment 27•12 years ago
|
||
Output from logging round 4. The "happy fix" patch doesn't build on my current m-c checkout so I wasn't able to test that. I'll take a look at the build failure and see if I can rebase it appropriately.
/Users/kats/zspace/mozilla-git/layout/base/FrameLayerBuilder.cpp:3286:29: error: 'mozilla::FrameLayerBuilder::LayerManager' has no member named 'InTransaction'
/Users/kats/zspace/mozilla-git/layout/base/FrameLayerBuilder.cpp:3287:30: error: 'class mozilla::layers::LayerManager' has no member named 'GetRepeatTransaction'
Flags: needinfo?(bugmail.mozilla)
Reporter | ||
Comment 28•12 years ago
|
||
Comment on attachment 742923 [details] [diff] [review]
happy fix
Review of attachment 742923 [details] [diff] [review]:
-----------------------------------------------------------------
I stuck in a few static_casts to make it compile (not sure if that's right or not), but it didn't make the assertions go away. I will attach the updated patch that I used and the output.
Attachment #742923 -
Flags: feedback?(bugmail.mozilla) → feedback-
Reporter | ||
Comment 29•12 years ago
|
||
This is the modified version of "happy fix" that I used.
Reporter | ||
Comment 30•12 years ago
|
||
Updated•12 years ago
|
Attachment #742923 -
Attachment is obsolete: true
Updated•12 years ago
|
Attachment #743069 -
Attachment is obsolete: true
Comment 31•12 years ago
|
||
Hmm, so the last round of logging shows (I think) that a repeat transaction sometimes causes this, but there is some other reason which more often causes it and often a repeat transaction does not cause this issue. The latter is odd, I would have thought a repeat transaction would always cause these problems. The only way this could be a repeat transaction is if the layer manager we set the repeat transaction on is not the layer manager owned by the Thebes layer which is rendered (inactive) the second time around. I'm not sure if that is possible.
Updated•11 years ago
|
Assignee: ncameron → nobody
Comment 32•4 years ago
|
||
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WONTFIX
You need to log in
before you can comment on or make changes to this bug.
Description
•