Closed Bug 899910 Opened 11 years ago Closed 11 years ago

Intermittent TEST-UNEXPECTED-FAIL | layout/base/tests/chrome/test_leaf_layers_partition_browser_window.xul | Leaf layers should form a non-overlapping partition of the browser window (maximized)

Categories

(Core :: Layout, defect, P5)

x86
Windows 7
defect

Tracking

()

RESOLVED FIXED
mozilla29
Tracking Status
firefox27 --- fixed
firefox28 --- fixed
firefox29 --- fixed
firefox-esr24 --- fixed
b2g-v1.2 --- fixed

People

(Reporter: Gijs, Assigned: smaug)

References

Details

(Keywords: intermittent-failure)

Attachments

(1 file, 6 obsolete files)

https://tbpl.mozilla.org/php/getParsedLog.php?id=25943870&tree=Mozilla-Inbound
Windows XP 32-bit mozilla-inbound debug test mochitest-other on 2013-07-30 19:35:56
revision: 57cda2d1472f
slave: t-xp32-ix-127

19:44:24     INFO -  12633 ERROR . [Set MOZ_DUMP_PAINT_LIST=1 env var to investigate.]

https://tbpl.mozilla.org/php/getParsedLog.php?id=24914201&tree=UX
Windows XP 32-bit ux debug test mochitest-other on 2013-07-03 16:52:35
revision: 070ae719d242
slave: t-xp32-ix-049

17:03:42     INFO -  12532 ERROR TEST-UNEXPECTED-FAIL | chrome://mochitests/content/chrome/layout/base/tests/chrome/test_leaf_layers_partition_browser_window.xul | Leaf layers should form a non-overlapping partition of the browser window (maximized). [Set MOZ_DUMP_PAINT_LIST=1 env var to investigate.]
Priority: -- → P5
Blocks: 930793
Comment on attachment 8345432 [details] [diff] [review]
possible patch

er, I wasn't going to try this.
Attachment #8345432 - Attachment is obsolete: true
Comment on attachment 8345439 [details] [diff] [review]
possible fix

tn, what do you think of this. Tryserver results look good, though I just
retriggered few more test runs.

This is blocking bug 930793, so would like to get this fixed soon.
Attachment #8345439 - Flags: review?(tnikkel)
Still looking good on try
Attached patch approach v2 (obsolete) — Splinter Review
No idea if this passes on try, but crossing fingers.
Attachment #8345439 - Flags: review?(tnikkel)
Comment on attachment 8346050 [details] [diff] [review]
approach v2

tn, assuming this passes on try, would this be ok to you.

Locally on linux it works fine, but the failure is windows-debug only.
Attachment #8346050 - Flags: review?(tnikkel)
Comment on attachment 8346050 [details] [diff] [review]
approach v2

I think we should add the resize listener before we call maximize, but otherwise this looks good.
Attached patch v3 (obsolete) — Splinter Review
Attachment #8345439 - Attachment is obsolete: true
Attachment #8346050 - Attachment is obsolete: true
Attachment #8346050 - Flags: review?(tnikkel)
Attachment #8346175 - Flags: review?(tnikkel)
Comment on attachment 8346175 [details] [diff] [review]
v3

Thanks!
Attachment #8346175 - Flags: review?(tnikkel) → review+
https://hg.mozilla.org/mozilla-central/rev/d15ed5648a5f
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Bah, no, with the remove-perf-mode the issue is still there occasionally.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment on attachment 8345439 [details] [diff] [review]
possible fix

This was working better.
Attachment #8345439 - Flags: review?(tnikkel)
Attachment #8345439 - Attachment is obsolete: false
Comment on attachment 8345439 [details] [diff] [review]
possible fix

Is the reason the resize listener didn't work because the paint from the resize came before the resize listener fired? What if we just wait for a paint and a resize event after calling maximize? Does that work?
I'm just pushing a patch with some debug stuff.

Debugging this is hard since I don't have windows dev environment, and the issue doesn't seem to happen
always even on try.
a bit different debug version https://tbpl.mozilla.org/?tree=Try&rev=e0eb8336294f
since the first one wasn't useful at all.
Looks like we get MozAfterPaint occasionally a bit later.
Trying a new patch.
er, no, the resize event patch should have waited a paint after resize, but that isn't enough.
And we should have flushed layout before painting so that resize doesn't happen after MozAfterPaint...
Both the MozAfterPaint and resize events fire async from when the paint/reflow actually happens which makes things more complicated.
No good.
We update something asynchronously, and there is no dom event for that something.
Do we change layer tree in some cases async?
Attachment #8345439 - Flags: review?(tnikkel)
Comment on attachment 8348934 [details] [diff] [review]
version 4, on top of what landed

Ahaa, this seems to work, at least initial results hint that
Attachment #8348934 - Flags: review?(tnikkel)
Attachment #8348934 - Flags: review?(tnikkel) → review+
Comment on attachment 8348934 [details] [diff] [review]
version 4, on top of what landed

Unfortunately, retriggering the test few times shows that this isn't good
after all :(
Comment on attachment 8345439 [details] [diff] [review]
possible fix

...and back to the original patch
Attachment #8345439 - Flags: review?(tnikkel)
Comment on attachment 8345439 [details] [diff] [review]
possible fix

Oh! I know whats the problem. doTest is called from the onload handler. We may not have even painted the new document yet at that point. What we should be doing is running a setTimout in the load handler. When that setTimeout fires we wait for a paint if it is pending, otherwise we can proceed to maximize the window.
v5 doesn't look like it worked either. :(

Can we just do the "setTimeout until it passes" from "possible fix" on Windows XP instead of all platforms? The test is already customized for WinXP.
Attachment #8346175 - Attachment is obsolete: true
Attachment #8349581 - Attachment is obsolete: true
Attachment #8348934 - Attachment is obsolete: true
(In reply to Timothy Nikkel (:tn) from comment #39)
> v5 doesn't look like it worked either. :(
> 
> Can we just do the "setTimeout until it passes" from "possible fix" on
> Windows XP instead of all platforms? The test is already customized for
> WinXP.

Why?
(In reply to Olli Pettay [:smaug] from comment #40)
> (In reply to Timothy Nikkel (:tn) from comment #39)
> > v5 doesn't look like it worked either. :(
> > 
> > Can we just do the "setTimeout until it passes" from "possible fix" on
> > Windows XP instead of all platforms? The test is already customized for
> > WinXP.
> 
> Why?

Because we don't understand the failure mode here, it would be nice not to have this hack bleed over into other platforms. Also WinXP runs this test differently (skips the non-maximized mode entirely) so that difference could be the cause.
This test has failed on all Win* platforms.
https://tbpl.mozilla.org/?tree=Try&rev=f734ab662254

bool gfxUtils::sDumpPaintList = true;
No luck with that. Too large logs.
I tried to only run the relevant test (or close to it). We'll see how the push goes
https://tbpl.mozilla.org/?tree=Try&rev=252e4783365a
Relevant bit from the win7 run from that log

D3D10LayerManager (0x166af990)
  D3D10ContainerLayer (0x126c42a0) [visible=< (x=0, y=0, w=1270, h=1022); >] [metrics={ viewport=(x=0.000000, y=0.000000, w=1270.000000, h=1022.000000) viewportScroll=(x=0.000000, y=0.000000) displayport=(x=0.000000, y=0.000000, w=0.000000, h=0.000000) scrollId=0 }]
    D3D10ThebesLayer (0xbaf3358) [visible=< (x=0, y=8, w=1270, h=71); >] [valid=< (x=0, y=8, w=1270, h=71); >]
    D3D10ThebesLayer (0xf41f380) [visible=< (x=0, y=0, w=1270, h=37); >] [valid=< (x=0, y=0, w=1270, h=37); >]
    D3D10ContainerLayer (0xff678e0) [clip=(x=0, y=79, w=1270, h=943)] [visible=< (x=0, y=79, w=1270, h=943); >] [opaqueContent]
      D3D10ContainerLayer (0x126b57a8) [visible=< (x=1253, y=1005, w=17, h=17); >] [opaqueContent]
        D3D10ThebesLayer (0xf41f568) [clip=(x=0, y=0, w=0, h=0)] [not visible]
        D3D10ColorLayer (0xf2130a0) [visible=< (x=1253, y=1005, w=17, h=17); >] [opaqueContent] [color=rgba(240, 240, 240, 1)]
      D3D10ThebesLayer (0xf41f750) [clip=(x=0, y=0, w=0, h=0)] [transform=[ 1 0; 0 1; 0 79; ]] [not visible]
      D3D10ColorLayer (0xf213268) [transform=[ 1 0; 0 1; 0 79; ]] [visible=< (x=0, y=0, w=1253, h=926); >] [opaqueContent] [color=rgba(255, 255, 255, 1)]
      D3D10ContainerLayer (0x126c1990) [visible=< (x=0, y=1005, w=1253, h=17); >] [opaqueContent] [hscrollbar=10]
        D3D10ThebesLayer (0xf41f938) [visible=< (x=0, y=1005, w=1253, h=17); >] [opaqueContent] [valid=< (x=0, y=1005, w=1253, h=17); >]
      D3D10ContainerLayer (0x126c4730) [visible=< (x=1253, y=79, w=17, h=926); >] [opaqueContent] [vscrollbar=10]
        D3D10ThebesLayer (0xf41fb20) [visible=< (x=1253, y=79, w=17, h=926); >] [opaqueContent] [valid=< (x=1253, y=79, w=17, h=926); >]
    D3D10ThebesLayer (0xf41fd08) [visible=< (x=0, y=8, w=210, h=7); (x=0, y=15, w=221, h=18); (x=0, y=33, w=210, h=6); >] [valid=< (x=0, y=8, w=210, h=7); (x=0, y=15, w=221, h=18); (x=0, y=33, w=210, h=6); >]
    D3D10ThebesLayer (0xf41fef0) [transform=[ 1 0; 0 1; 0 8; ]] [visible=< (x=1164, y=0, w=103, h=18); >] [valid=< (x=1164, y=0, w=103, h=18); >]

The first two and last two thebes layers in the root container layer overlap.

    D3D10ThebesLayer (0xbaf3358) [visible=< (x=0, y=8, w=1270, h=71); >] [valid=< (x=0, y=8, w=1270, h=71); >]
    D3D10ThebesLayer (0xf41f380) [visible=< (x=0, y=0, w=1270, h=37); >] [valid=< (x=0, y=0, w=1270, h=37); >]
    D3D10ThebesLayer (0xf41fd08) [visible=< (x=0, y=8, w=210, h=7); (x=0, y=15, w=221, h=18); (x=0, y=33, w=210, h=6); >] [valid=< (x=0, y=8, w=210, h=7); (x=0, y=15, w=221, h=18); (x=0, y=33, w=210, h=6); >]
    D3D10ThebesLayer (0xf41fef0) [transform=[ 1 0; 0 1; 0 8; ]] [visible=< (x=1164, y=0, w=103, h=18); >] [valid=< (x=1164, y=0, w=103, h=18); >]

The 2nd and 4th layers each contain only a single display item; in each case a single themed background. Not sure why this goes away if we wait using setTimeout.
Sounds like the only way to debug this properly is to someone to reproduce this locally and
debug what happens before timeouts run.
Attached patch win only fixSplinter Review
About to test this locally (on linux), and will push to try
Attachment #8345439 - Attachment is obsolete: true
Attachment #8345439 - Flags: review?(tnikkel)
Attachment #8350807 - Flags: review?(tnikkel)
Comment on attachment 8350807 [details] [diff] [review]
win only fix

Wish we had the manpower and time to look into exactly what is going on here, but not worth holding back the favor perf stuff.
Attachment #8350807 - Flags: review?(tnikkel) → review+
Thanks. Seems to work on linux. Pushing to try...
Another try push to try to get some more info
https://tbpl.mozilla.org/?tree=Try&rev=65495d31698e
It looks like a (margin) top, left, bottom, or right attribute is getting modified on maximize, causing the layer system to treat it as animating, thus moving these items their own layers for ~100ms, before that activity times out and the layers system no longer thinks those elements are animating and we get the expected layer tree.
https://hg.mozilla.org/mozilla-central/rev/eaf1557bb572
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
So it looks like when we maximize the browser window some code in browser.js around http://mxr.mozilla.org/mozilla-central/source/browser/base/content/browser.js#4550 modifies the margin-bottom of some titlebar related elements. Which makes them animated geometry roots until they expire about 100ms later. Unless there is a way to have the window already maximized when it opens (I didn't find one in a quick look) then we will just have to wait a little after maximizing the window, I don't think there is anything that we can listen for. So the setTimeout fix looks to be our best option.
Thanks for debugging this!
I would just poll until the test passes (layers don't overlap).
That is what the patch does.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: