Closed Bug 688128 Opened 13 years ago Closed 3 years ago

Intermittent failure in test_flush_on_paint.html: Check that we set width before painting - got [n], expected [n+1], where n>=200

Categories

(Core :: Layout, defect, P3)

defect

Tracking

()

RESOLVED FIXED
Tracking Status
firefox87 --- fixed

People

(Reporter: graememcc, Assigned: philor)

References

Details

(Keywords: intermittent-failure, Whiteboard: [test disabled on mac])

Attachments

(1 file)

First seen this with https://hg.mozilla.org/integration/mozilla-inbound/rev/ef007a375b34

https://tbpl.mozilla.org/php/getParsedLog.php?id=6474946&tree=Mozilla-Inbound
Rev3 WINNT 5.1 mozilla-inbound debug test mochitests-4/5 on 2011-09-20 16:48:19 PDT 
799 ERROR TEST-UNEXPECTED-FAIL | /tests/layout/base/tests/test_flush_on_paint.html | Check that we set width before painting - got 200, expected 201

https://tbpl.mozilla.org/php/getParsedLog.php?id=6477113&tree=Mozilla-Inbound
Rev3 MacOSX Snow Leopard 10.6.2 mozilla-inbound opt test mochitests-4/5 on 2011-09-20 19:32:36 PDT
1198 ERROR TEST-UNEXPECTED-FAIL | /tests/layout/base/tests/test_flush_on_paint.html | Check that we set width before painting - got 202, expected 203
Could be coincidence, but that seems like a slight uptick (from once every three months to five in one day), around the time that bug 598482 landed.
Blocks: 598482
I really doubt this is a coincidence.

The structure of the testcase is as follows:

1)  There's a timer that tries to trigger an invalidate via the plug-in every 20ms.  With the changes in bug 598482 that means the gap between invalidations due to this code will be between 16ms and 36ms depending on how that timer's firing times compare to the refresh driver firing times.  Before that patch it should have been about 20ms consistently, modulo timer issues.

2)  The doIteration function saves the mozPaintCount, sets the expectedWidth to 201 + iterations (where iterations starts at 0), sets the plug-in width to expectedWidth, and triggers checkDone.

3)  checkDone checks every 30ms to see whether the mozPaintCount has incremented.  If it has, it asserts that the plugin's getWidthAtLastPaint() is equal to the expectedWidth, then increments |iterations| and calls doIteration, unless we've already had 100 iterations, in which case it stops the test.

The failure we're seeing is that getWidthAtLastPaint() is one smaller than the expectedWidth.

By assumption the expectedWidth is equal to the plugin.style.width when lastPaintCount was saved.  If we're doing our "check that we set width" check, that means the mozPaintCount has increased since then.  For mozPaintCount to increase, either nsDisplayListBuilder::EnterPresShell or PresShell::Paint need to be called.

As far as getWidthAtLastPaint(), I assume that this returns the width the last time the plug-in was asked to paint.  I'm not actually sure what records this on Mac, but I'll assume something does.

The simplest way for the test to fail is or a paint to happen that flushes layout and updates mozPaintCount but does NOT paint the plug-in.  Can that happen?  Presumably the invalidationLoop timer is supposed to prevent it.  But with the new setup, I think what can happen is this:

1)  t=0 invalidationLoop timer fires, queues up invalidation on the view
2)  t=0 refresh driver fires, flushes that invalidation
3)  t=0 paint, increment mozPaintCount and update getWidthAtLastPaint
4)  t=0: We enter doIteration, change width, call checkDone, set 30ms timeout to call
    checkDone again.
5)  t=16 refresh driver fires, does reflow, processes invalidates for the reflow. 
    Flushes widget geometry?  Not sure about that part.
6)  t=16 paint the area invalidated in comment 5, which for some reason does not trigger
    a paint on the plug-in.  Increment mozPaintCount but do NOT update
    getWidthAtLastPaint
3)  t=20 invalidationLoop timer fires, queues up invalidation on the view
4)  t=30 checkDone timer fires, detects that expected width (set in step 4) doesn't
    match getWidthAtLastPaint (set in step 3).

So the problem in that event sequence is that you can get two refresh driver firings in between two invalidationLoop calls and that you can get two checkDone calls with only one refresh driver firing between them (the second one of the two firings between invalidationLoop calls).  As a result you get no invalidation due to the invalidationLoop between the two checkDone calls.

If I'm right, then we should either make the invalidationLoop timer shorter than the refresh timer or make the checkDone timer more than twice as long as the refresh driver timer.  Or both.

Robert, thoughts?  Am I completely off-base here in terms of how this could happen?  

Alternately, if I'm right about what's happening, could we expose the number of paints of the _plug-in_ on the test plugin-in and use that instead of mozPaintCount?  Or would that no longer test what we want to test here?
(In reply to Boris Zbarsky (:bz) from comment #9)
> 5)  t=16 refresh driver fires, does reflow, processes invalidates for the
> reflow. 
>     Flushes widget geometry?  Not sure about that part.
> 6)  t=16 paint the area invalidated in comment 5, which for some reason does
> not trigger
>     a paint on the plug-in.

This would be a bug, wouldn't it?
> This would be a bug, wouldn't it?

Possibly, yes.  I have no idea what's going on there.  If I could at least reproduce this locally I could try logging the heck out of it, but I haven't managed to reproduce this so far.
Summary: Intermittent failure in /tests/layout/base/tests/test_flush_on_paint.html: Check that we set width before painting - got [n], expected [n+1], where n>=200 → Intermittent failure in test_flush_on_paint.html: Check that we set width before painting - got [n], expected [n+1], where n>=200
Attached patch stop itSplinter Review
For the first 1000 instances, I still felt like I was learning something, but after that, I just don't think we're learning anything new.
Attachment #625352 - Flags: review?(roc)
https://hg.mozilla.org/integration/mozilla-inbound/rev/9a684c3344a5
No longer blocks: 438871, 598482
Whiteboard: [orange] → [orange][leave open]
Blocks: 438871, 598482
Whiteboard: [orange][leave open] → [orange][test disabled on mac]
Whiteboard: [orange][test disabled on mac] → [test disabled on mac]
Bulk assigning P3 to all open intermittent bugs without a priority set in Firefox components per bug 1298978.
Priority: -- → P3
These failures are still around. The test also hits "TypeError: plugin.setColor is not a function at invalidationLoop@http://mochi.test:8888/tests/layout/base/tests/test_flush_on_paint.html:29:3" nowadays.

I'm removing this test in bug 1687239 part 1.

Status: NEW → RESOLVED
Closed: 3 years ago
Depends on: 1687239
Resolution: --- → FIXED
Assignee: nobody → philringnalda
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: