Closed Bug 1142360 Opened 9 years ago Closed 9 years ago

Intermittent INFO TEST-UNEXPECTED-FAIL | test_bug570378-persian-4d.html test_bug570378-persian-4e.html test_bug570378-persian-4f.html,test_bug570378-persian-4g.html | reftest comparison: == ... (with bidi.numeral == 3 or 5) - expected PASS

Categories

(Core :: Layout: Text and Fonts, defect)

39 Branch
x86_64
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla39
Tracking Status
firefox37 --- fixed
firefox38 --- fixed
firefox39 --- fixed
firefox-esr31 --- unaffected
b2g-v1.4 --- fixed
b2g-v2.0 --- fixed
b2g-v2.1 --- fixed
b2g-v2.2 --- fixed
b2g-master --- fixed

People

(Reporter: KWierso, Assigned: ehsan.akhgari)

References

Details

(Keywords: intermittent-failure)

Attachments

(3 files)

15:28:57 INFO - 0 INFO SimpleTest START
15:28:58 INFO - 1 INFO TEST-START | layout/base/tests/test_bug1070851.html
15:29:11 INFO - 2 INFO TEST-OK | layout/base/tests/test_bug1070851.html | took 12612ms
15:29:13 INFO - 3 INFO TEST-START | layout/base/tests/test_bug1078327.html
15:29:24 INFO - 4 INFO TEST-OK | layout/base/tests/test_bug1078327.html | took 10874ms
15:29:25 INFO - 5 INFO TEST-START | layout/base/tests/test_bug1080360.html
15:29:30 INFO - 6 INFO TEST-OK | layout/base/tests/test_bug1080360.html | took 5436ms
15:29:31 INFO - 7 INFO TEST-START | layout/base/tests/test_bug1080361.html
15:29:37 INFO - 8 INFO TEST-OK | layout/base/tests/test_bug1080361.html | took 5403ms
15:29:37 INFO - 9 INFO TEST-START | layout/base/tests/test_bug1093686.html
15:29:44 INFO - 10 INFO TEST-OK | layout/base/tests/test_bug1093686.html | took 6185ms
15:29:44 INFO - 11 INFO TEST-START | layout/base/tests/test_bug570378-persian-4e.html
15:29:51 INFO - 12 INFO TEST-UNEXPECTED-FAIL | layout/base/tests/test_bug570378-persian-4e.html | Rendering of reftest bug570378-persian-4 is different with bidi.numeral == 4 - expected PASS
15:29:51 INFO - 13 INFO MEMORY STAT vsize after test: 105877504
15:29:51 INFO - 14 INFO MEMORY STAT residentFast after test: 43581440
15:29:51 INFO - 15 INFO MEMORY STAT heapAllocated after test: 21251992
15:29:51 INFO - 16 INFO TEST-OK | layout/base/tests/test_bug570378-persian-4e.html | took 7022ms
15:29:52 INFO - 17 INFO TEST-START | layout/base/tests/test_bug570378-persian-4f.html
15:29:57 INFO - 18 INFO TEST-OK | layout/base/tests/test_bug570378-persian-4f.html | took 4537ms
15:29:58 INFO - 19 INFO TEST-START | layout/base/tests/test_bug570378-persian-4g.html
15:30:03 INFO - 20 INFO TEST-OK | layout/base/tests/test_bug570378-persian-4g.html | took 5135ms
15:30:04 INFO - 21 INFO TEST-START | layout/base/tests/test_bug570378-persian-5a.html
15:30:08 INFO - 22 INFO TEST-OK | layout/base/tests/test_bug570378-persian-5a.html | took 3949ms
15:30:08 INFO - 23 INFO TEST-START | layout/base/tests/test_bug570378-persian-5b.html
15:30:13 INFO - 24 INFO TEST-OK | layout/base/tests/test_bug570378-persian-5b.html | took 4614ms
15:30:14 INFO - 25 INFO TEST-START | layout/base/tests/test_bug570378-persian-5c.html
15:30:19 INFO - 26 INFO TEST-OK | layout/base/tests/test_bug570378-persian-5c.html | took 4963ms
Flags: needinfo?(jfkthame)
Summary: Intermittent test_bug570378-persian-4e.html | Rendering of reftest bug570378-persian-4 is different with bidi.numeral == 4 - expected PASS → Intermittent test_bug570378-persian-4d.html,test_bug570378-persian-4e.html | Rendering of reftest bug570378-persian-4 is different with bidi.numeral == 4 - expected PASS
We should probably modify the mochitest so that it dumps images on failure.  (In fact, there should probably be shared code to do that if there isn't already.)
I think adding '--screenshot-on-fail' here:
https://dxr.mozilla.org/mozilla-central/source/testing/config/mozharness/b2g_emulator_config.py#61

will do the trick. Here's a try run that also removes --quiet (which hides subtest output from passing tests):
https://treeherder.mozilla.org/#/jobs?repo=try&revision=bac913488a11
Note that this adds a parameter to the callback function so that we can
keep track of which canvas is for the test and which is for the
reference; previously we didn't care.

It also puts testfile and reffile in local variables so that they can be
reused, and then appends the extra bidi numeral information to reffile
when calling assertSnapshots so that the messages contain the same (or
more) data as before.
Attachment #8577485 - Flags: review?(ehsan)
(I tested one of the tests locally, that it still passes unmodified, and fails if I stick an "x" in the test.)
(In reply to Andrew Halberstadt [:ahal] from comment #87)
> I think adding '--screenshot-on-fail' here:
> https://dxr.mozilla.org/mozilla-central/source/testing/config/mozharness/
> b2g_emulator_config.py#61
> 
> will do the trick. Here's a try run that also removes --quiet (which hides
> subtest output from passing tests):
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=bac913488a11

--screenshot-on-fail probably won't be useful for these tests because they change prefs and then change the prefs back, so by the time the screenshot is taken, the pref would be back to its default value and the display changed back correspondingly.
Attachment #8577485 - Flags: review?(ehsan) → review+
I suspect this isn't really about those tests "failing" at all, but rather it's an artifact of the overall test run timing out. (Perhaps screenshots will help to clarify that, once we get them.)

IIRC, the layout/base/tests/test_bug570378-* series of tests run particularly slowly, due to the pref-toggling they do, so if we're going to run into timeout issues, the odds are good that these are the tests where it'll happen.

As an experiment, I pushed a try job where I hacked the manifest to skip most of those tests; I left just a token handful of them enabled.[1] On this job, I got 9 out of 10 green M9 runs; for comparison, retriggering M9 runs on a nearby inbound changeset[2] gave 2 out of 10 green.

Unless/until we can significantly increase the 120-minute timeout, or arrange to split these tests across multiple chunks, I suggest that temporarily disabling most of them on b2g-emulator may be the best thing to do here.

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=bb1cd72d3176
[2] https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=8b0b98f81a14&filter-searchStr=emulator mochitest-9
Flags: needinfo?(jfkthame)
Nope, the failure is an actual visual failure -- and it's a failure to show the "broken image" icon, and showing the red dot in its place that we show if we can't load (or haven't yet loaded) the broken image icon.

(In my try run, I got a red dot for the test, and the broken image icon for the reference.)
I wonder if taking the snapshot right after the load event fires, as bidi_numeral_test.js does, is supposed to be sufficient.
Any thoughts on the red dot?  Is this something we've seen elsewhere?
Flags: needinfo?(seth)
(In reply to David Baron [:dbaron] (UTC-7) from comment #124)
> Any thoughts on the red dot?  Is this something we've seen elsewhere?

Probably the problem is that nsDisplayAltFeedback, being a display item that shows an image, needs a geometry item that inherits from nsImageGeometryMixin to be able to handle sync decoding correctly. Somehow it got overlooked when I updated all of the other display items that show images.

I'll file a bug about fixing it that blocks this bug.
Flags: needinfo?(seth)
Depends on: 1143267
I landed the test change on inbound:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5f1d4592207f
that will make these tests dump images.

Note that they also change the failure message, so that the failure message will now be:
TEST-UNEXPECTED-FAIL | layout/base/tests/test_bug570378-persian-4d.html | reftest comparison: == bug570378-persian-4.html bug570378-persian-4-ref.html (with bidi.numeral == 3) - expected PASS
Keywords: leave-open
Summary: Intermittent test_bug570378-persian-4d.html,test_bug570378-persian-4e.html | Rendering of reftest bug570378-persian-4 is different with bidi.numeral == 4 - expected PASS → Intermittent INFO TEST-UNEXPECTED-FAIL | test_bug570378-persian-4d.html test_bug570378-persian-4e.html test_bug570378-persian-4f.html | reftest comparison: == ... (with bidi.numeral == 3 or 5) - expected PASS
There are a bunch more instances of this bug that were incorrectly annotated as bug 951509.
The screenshots seem to suggest that sometimes we end up using a different icon for the unloaded image frame, with red borders and a red dot within it.  I have never seen that icon, and have no idea where it's coming from.
It's the fallback that we draw if the image code hasn't loaded the broken-image or loading-image icon.

Seth thinks the problem is bug 1143267; see above.
(In reply to :Ehsan Akhgari (not reading bugmail, needinfo? me!) from comment #172)
> The screenshots seem to suggest that sometimes we end up using a different
> icon for the unloaded image frame, with red borders and a red dot within it.
> I have never seen that icon, and have no idea where it's coming from.

See here:

https://dxr.mozilla.org/mozilla-central/source/layout/generic/nsImageFrame.cpp?from=nsImageFrame.cpp&case=true#1281
Seth, can you take this please?
Flags: needinfo?(seth)
(In reply to :Ehsan Akhgari (not reading bugmail, needinfo? me!) from comment #232)
> Seth, can you take this please?

I "took it" by implementing and landing bug 1143267.

Unfortunately, that didn't solve the problem as expected, which means the issue probably does not involve decoding the broken image icon but rather loading it.

This test is just taking a snapshot after the page load event, rather than waiting for invalidations to reach a fixed point, as the reftest harness does. This means that if the broken image icon isn't already loaded, sync decoding may not work, because loading the broken image icon doesn't block page load. In the reftest harness it would still work, because we'd generate a sync decode invalidation every time we took a snapshot until the image finishes loading.

I could investigate this in more depth and look at making it work - if my guess above is correct, I suppose that we could solve the problem by making icon loads block page load in every document - but I'm not convinced that's the simplest solution here.

Is there some reason these tests can't simply be reftests? I'd expect this problem to disappear instantly if they were.
Flags: needinfo?(seth) → needinfo?(dbaron)
The original reason why these are not reftests is that they needed to set a pref before loading the test cases, but I guess the reftest framework can do that these days?  Note that the pref also needs to be unset when we're done.  If someone can tell me how to do that, I may be able to do the grunt work.
Unlike most harnesses, where the documentation doesn't exist or is in the last place you would ever imagine, for reftests the only problem is remembering which directory http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/README.txt#169 lives in - the harness takes care of unsetting for you, you just tell the manifest the pref and the value.
Great, thanks, philor!  I'll try to get to this during the weekend...
Assignee: nobody → ehsan
Flags: needinfo?(ehsan)
This will help fix an intermittent test failure as explained in bug
1142360 comment 301.  It also has the additional benefit of making
things faster overall, since the reflows of the huge mochitest test
runner page every time the bidi.numeral pref changes are very expensive,
and such overhead doesn't exist in the reftest framework.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=eba05647e4ac
Attachment #8580501 - Flags: review?(dbaron)
Flags: needinfo?(ehsan)
Comment on attachment 8580501 [details] [diff] [review]
Move the mochitests for bugs 441782, 467672 and 570378 to the reftest framework

Please add an "include numeral/reftest.list" in layout/reftests/bidi/reftest.list so that we run the tests.

r=dbaron with that

It's probably worth testing that the output for the different pref values looks the way you expect by flipping a few of the tests to the opposite (== vs. !=) and looking at the images in reftest-analyzer.

Also, I don't know the history of the skip-if annotations in the mochitest.ini, but it's at least worth a try run to check that they're not needed anymore.  (Some of them don't explicitly say "timed out"... although maybe that's what they were?)
Attachment #8580501 - Flags: review?(dbaron) → review+
(In reply to David Baron [:dbaron] (UTC-7) from comment #417)
> Comment on attachment 8580501 [details] [diff] [review]
> Move the mochitests for bugs 441782, 467672 and 570378 to the reftest
> framework
> 
> Please add an "include numeral/reftest.list" in
> layout/reftests/bidi/reftest.list so that we run the tests.

Oops!

> It's probably worth testing that the output for the different pref values
> looks the way you expect by flipping a few of the tests to the opposite (==
> vs. !=) and looking at the images in reftest-analyzer.

Already did locally.

> Also, I don't know the history of the skip-if annotations in the
> mochitest.ini, but it's at least worth a try run to check that they're not
> needed anymore.  (Some of them don't explicitly say "timed out"... although
> maybe that's what they were?)

I'll look again, but I think they've all been disabled because of timeouts caused by the expensive reflows.
Thanks for switching these over to reftests, Ehsan!

Please needinfo me if we're still running into problems once that change lands, but I'm pretty sure they will work just fine as reftests.
Summary: Intermittent INFO TEST-UNEXPECTED-FAIL | test_bug570378-persian-4d.html test_bug570378-persian-4e.html test_bug570378-persian-4f.html | reftest comparison: == ... (with bidi.numeral == 3 or 5) - expected PASS → Intermittent INFO TEST-UNEXPECTED-FAIL | test_bug570378-persian-4d.html test_bug570378-persian-4e.html test_bug570378-persian-4f.html,test_bug570378-persian-4g.html | reftest comparison: == ... (with bidi.numeral == 3 or 5) - expected PASS
The failures were:

REFTEST TEST-UNEXPECTED-FAIL | file:///C:/slave/test/build/tests/reftest/tests/layout/reftests/bidi/numeral/bug441782-3.html | image comparison (==), max difference: 13, number of differing pixels: 1
REFTEST TEST-UNEXPECTED-FAIL | file:///C:/slave/test/build/tests/reftest/tests/layout/reftests/bidi/numeral/bug441782-3.html | image comparison (==), max difference: 13, number of differing pixels: 1
REFTEST TEST-UNEXPECTED-FAIL | file:///C:/slave/test/build/tests/reftest/tests/layout/reftests/bidi/numeral/bug441782-3.html | image comparison (==), max difference: 13, number of differing pixels: 1 


Note that I annotated them as fails-if() rather than fuzzy-if() so that we'll catch the problem going away.  (I wish fuzzy-if took ranges rather than implicitly treating the values given as a maximum fuzziness so that fuzzy-if could produce unexpected passes.)
Hi Ehsan, there seems to be a consistent M12 bustage on B2G ICS Emulator since this push. Here are the logs - https://treeherder.mozilla.org/logviewer.html#?job_id=7894860&repo=mozilla-inbound

I've retriggered the green in the push before this just to confirm it's from the push. Would you be able to take a look today? I'll back it out tomorrow morning if nothing works out :)
Flags: needinfo?(ehsan)
(In reply to nigelbabu@gmail.com [:nigelb] from comment #443)
> Hi Ehsan, there seems to be a consistent M12 bustage on B2G ICS Emulator
> since this push. Here are the logs -
> https://treeherder.mozilla.org/logviewer.html#?job_id=7894860&repo=mozilla-
> inbound

So this has existed before my patch as bug 1099195.  This seems like a weird chunking issue caused by my patch removing stuff from the mochitest suite and into the reftest suite, therefore running different tests in every chunk.  Unfortunately I won't have time to chase that particular goose, so feel free to back me out, and then we'll need to live with this orange unless if someone is willing to investigate this further.
Flags: needinfo?(ehsan)
If this is the usual M12 timeouts, no need to backout. We'll be fixing them by adding more chunks once they're switched over to TC in the near future.
It's mostly the usual M12 timeouts, but with an additional variant described in bug 1146092, which is where I've been tracking the regression.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: