Open Bug 1392106 (missing-letter-win7) Opened 2 years ago Updated 2 days ago

Intermittent letter missing in reftests on Windows 7

Categories

(Core :: Graphics: Text, defect, P3, major)

x86
Windows 7
defect

Tracking

()

REOPENED

People

(Reporter: aryx, Unassigned)

References

(Blocks 2 open bugs)

Details

(Keywords: leave-open, Whiteboard: [gfx-noted][PI:June][stockwell needswork:owner])

Attachments

(10 files, 5 obsolete files)

207.93 KB, patch
Details | Diff | Splinter Review
59 bytes, text/x-review-board-request
kats
: review+
Details
59.80 KB, patch
RyanVM
: review+
Details | Diff | Splinter Review
42.79 KB, patch
RyanVM
: review+
Details | Diff | Splinter Review
107.63 KB, patch
RyanVM
: review+
Details | Diff | Splinter Review
81.00 KB, patch
RyanVM
: review+
Details | Diff | Splinter Review
34.02 KB, patch
RyanVM
: review+
Details | Diff | Splinter Review
64.60 KB, patch
RyanVM
: review+
Details | Diff | Splinter Review
46 bytes, text/x-phabricator-request
RyanVM
: review+
Details | Review
46 bytes, text/x-phabricator-request
jmaher
: review+
Details | Review
I like https://treeherder.mozilla.org/logviewer.html#?job_id=124355181&repo=mozilla-inbound&lineNumber=8536 better:

00:01:35     INFO -  Could not create glyph run analysis.
00:01:35     INFO -  z:/build/build/src/gfx/skia/skia/src/ports/SkScalerContext_win_dw.cpp(508) : error 0x80070005: Access is denied.
00:01:35     INFO -  Requested bounding box could not be determined.
00:01:35     INFO -  z:/build/build/src/gfx/skia/skia/src/ports/SkScalerContext_win_dw.cpp(583) : error 0x80070005: Access is denied.
00:01:35     INFO -  REFTEST TEST-LOAD | file:///Z:/task_1503186776/build/tests/reftest/tests/layout/reftests/w3c-css/received/css-writing-modes-3/reference/bidi-normal-011.html | 720 / 1944 (37%)
00:01:35     INFO -  ++DOMWINDOW == 62 (0B97B400) [pid = 3876] [serial = 1690] [outer = 09CB0400]
00:01:35    ERROR -  REFTEST TEST-UNEXPECTED-FAIL | file:///Z:/task_1503186776/build/tests/reftest/tests/layout/reftests/w3c-css/received/css-writing-modes-3/bidi-normal-011.html == file:///Z:/task_1503186776/build/tests/reftest/tests/layout/reftests/w3c-css/received/css-writing-modes-3/reference/bidi-normal-011.html | image comparison, max difference: 255, number of differing pixels: 192
Component: Layout → Graphics: Text
OS: Unspecified → Windows 7
Hardware: Unspecified → x86
Summary: Intermittent letter missing in reftests on Windows → Intermittent letter missing in reftests on Windows 7
Blocks: 1389360
Blocks: 1383450
Blocks: 1386083
Blocks: 1383476
Blocks: 1392107
Bleah, comment 1 and probably some number of my dependencies actually look like a separate failure, where we don't load a web font. Not going to be fun to tell apart.
Jonathan, you might have ideas here?
Flags: needinfo?(bas) → needinfo?(jfkthame)
The failures here seem very bizarre, with just an individual glyph disappearing when other glyphs in the same font are fine. So this isn't a whole-font failure (like the ones we're currently seeing where a webfont apparently hasn't loaded, and we get a fallback instead); it's at the level of individual glyphs.

Maybe some kind of resource constraint deep in the graphics system whereby a rasterized glyph gets evicted from a cache, and not reliably regenerated in time for when we paint it? Or an entry in a glyph cache is getting corrupted somehow?

Really, I have no idea, sorry.
Flags: needinfo?(jfkthame)
Whiteboard: [gfx-noted]
Summary: Intermittent letter missing in reftests on Windows 7 → Intermittent first letter in element missing in reftests on Windows 7
Bug 1407762 has the "i" in "with" missing.
Summary: Intermittent first letter in element missing in reftests on Windows 7 → Intermittent letter missing in reftests on Windows 7
Blocks: 1436946
Blocks: 1436942
Looking at logs from a couple of recent examples of this (bug 1436942, bug 1436946), I notice they have one thing in common. This is from bug 1436946 (https://public-artifacts.taskcluster.net/dsEW2cDjTUGux7eF5VN2cA/0/public/logs/live_backing.log):

07:57:58     INFO -  REFTEST TEST-START | file:///Z:/task_1518162077/build/tests/reftest/tests/layout/reftests/text/graphite-02.html == file:///Z:/task_1518162077/build/tests/reftest/tests/layout/reftests/text/graphite-02-ref.html
07:57:58     INFO -  REFTEST TEST-LOAD | file:///Z:/task_1518162077/build/tests/reftest/tests/layout/reftests/text/graphite-02.html | 548 / 1964 (27%)
07:57:58     INFO -  ++DOMWINDOW == 86 (0DB75C00) [pid = 1452] [serial = 1440] [outer = 062394C0]
07:57:58     INFO -  REFTEST TEST-LOAD | file:///Z:/task_1518162077/build/tests/reftest/tests/layout/reftests/text/graphite-02-ref.html | 548 / 1964 (27%)
07:57:58     INFO -  ++DOMWINDOW == 87 (0DB7BC00) [pid = 1452] [serial = 1441] [outer = 062394C0]
07:57:58     INFO -  Could not create glyph run analysis.
07:57:58     INFO -  z:/build/build/src/gfx/skia/skia/src/ports/SkScalerContext_win_dw.cpp(508) : error 0x80070005: Access is denied.
07:57:58     INFO -  Requested bounding box could not be determined.
07:57:58     INFO -  z:/build/build/src/gfx/skia/skia/src/ports/SkScalerContext_win_dw.cpp(583) : error 0x80070005: Access is denied.
07:57:58     INFO -  REFTEST INFO | REFTEST fuzzy test (0, 0) <= (255, 211) <= (49, 220)
07:57:58    ERROR -  REFTEST TEST-UNEXPECTED-FAIL | file:///Z:/task_1518162077/build/tests/reftest/tests/layout/reftests/text/graphite-02.html == file:///Z:/task_1518162077/build/tests/reftest/tests/layout/reftests/text/graphite-02-ref.html | image comparison, max difference: 255, number of differing pixels: 211


See the two errors logged by Skia there. That's why there is a glyph missing: a couple of DirectWrite calls that Skia makes have failed with "error 0x80070005: Access is denied", and the end result is that a glyph is not painted.

Note that the same font is being used to draw other glyphs that appear fine.

Can we somehow determine what is going wrong here?
Flags: needinfo?(milan)
Flags: needinfo?(lsalzman)
Flags: needinfo?(bas)
Blocks: 1433606
Blocks: 1436609
Blocks: 1435987
Blocks: 1392116
Blocks: 1436986
Blocks: 1436989
Blocks: 1436990
(In reply to Jonathan Kew (:jfkthame) from comment #23)
> Looking at logs from a couple of recent examples of this (bug 1436942, bug
> 1436946), I notice they have one thing in common. This is from bug 1436946
> (https://public-artifacts.taskcluster.net/dsEW2cDjTUGux7eF5VN2cA/0/public/
> logs/live_backing.log):
> 
> 07:57:58     INFO -  REFTEST TEST-START |
> file:///Z:/task_1518162077/build/tests/reftest/tests/layout/reftests/text/
> graphite-02.html ==
> file:///Z:/task_1518162077/build/tests/reftest/tests/layout/reftests/text/
> graphite-02-ref.html
> 07:57:58     INFO -  REFTEST TEST-LOAD |
> file:///Z:/task_1518162077/build/tests/reftest/tests/layout/reftests/text/
> graphite-02.html | 548 / 1964 (27%)
> 07:57:58     INFO -  ++DOMWINDOW == 86 (0DB75C00) [pid = 1452] [serial =
> 1440] [outer = 062394C0]
> 07:57:58     INFO -  REFTEST TEST-LOAD |
> file:///Z:/task_1518162077/build/tests/reftest/tests/layout/reftests/text/
> graphite-02-ref.html | 548 / 1964 (27%)
> 07:57:58     INFO -  ++DOMWINDOW == 87 (0DB7BC00) [pid = 1452] [serial =
> 1441] [outer = 062394C0]
> 07:57:58     INFO -  Could not create glyph run analysis.
> 07:57:58     INFO - 
> z:/build/build/src/gfx/skia/skia/src/ports/SkScalerContext_win_dw.cpp(508) :
> error 0x80070005: Access is denied.
> 07:57:58     INFO -  Requested bounding box could not be determined.
> 07:57:58     INFO - 
> z:/build/build/src/gfx/skia/skia/src/ports/SkScalerContext_win_dw.cpp(583) :
> error 0x80070005: Access is denied.
> 07:57:58     INFO -  REFTEST INFO | REFTEST fuzzy test (0, 0) <= (255, 211)
> <= (49, 220)
> 07:57:58    ERROR -  REFTEST TEST-UNEXPECTED-FAIL |
> file:///Z:/task_1518162077/build/tests/reftest/tests/layout/reftests/text/
> graphite-02.html ==
> file:///Z:/task_1518162077/build/tests/reftest/tests/layout/reftests/text/
> graphite-02-ref.html | image comparison, max difference: 255, number of
> differing pixels: 211
> 
> 
> See the two errors logged by Skia there. That's why there is a glyph
> missing: a couple of DirectWrite calls that Skia makes have failed with
> "error 0x80070005: Access is denied", and the end result is that a glyph is
> not painted.
> 
> Note that the same font is being used to draw other glyphs that appear fine.
> 
> Can we somehow determine what is going wrong here?

I wonder if this could be somehow related to sandboxing.
Flags: needinfo?(bas)
Blocks: 1437304
Blocks: 1437296
All we know given the error is that it originates from a call to either GetGlyphRunAnalysis OR GetAlphaTextureBounds as per here: https://dxr.mozilla.org/mozilla-central/source/gfx/skia/skia/src/ports/SkScalerContext_win_dw.cpp#453

Unfortunately the MSDN documentation on is useless and tells nothing about why this function might return an "Access denied" error here. So it's either possibly a very strange way of signaling an OOM, or maybe as Bas suggests, we're trying to load a separate font file somehow for the character(s) that go missing, and perhaps sandboxing is getting in the way.

Given that our Windows 7 test infrastructure is already suspect on intermittents in many cases for OOMs, and this seems to be a Windows 7 only thing, I couldn't really tell which is more likely here.
Flags: needinfo?(lsalzman)
I did briefly wonder about sandboxing, but I don't see any reason why we'd suddenly need to access a different file just for a single glyph in a common font like Times or Arial. Unless the sandbox itself is flaky, in the sense that it occasionally blocks an access that normally would be permitted, I don't think it can be related to that.

Maybe it's triggered by near-OOM conditions or some other kind of resource exhaustion (which I suspect may be what's going on in bug 1430572, too). Can we do anything to provide more resources to Win7 test machines, or should we just give up on them? The rate of these Graphics intermittents is pretty bad....
Blocks: 1437395
Blocks: 1392113
Blocks: 1390365
Blocks: 1410613
Blocks: 1387288
Blocks: 1387283
Blocks: 1437916
Blocks: 1388064
Blocks: 1394838
Blocks: 1423023
Blocks: 1438753
Blocks: 1438807
Blocks: 1394259
Blocks: 1439297
Duplicate of this bug: 1433606
Blocks: 1440170
Blocks: 1416800
Blocks: 1392118
Most of these failures end up getting starred as separate bugs for the individual test that happened to get unlucky, which isn't really very helpful -- the failure is unrelated to the specific test, and can hit almost anywhere that text gets rendered.

To get a better sense of how much this is happening, would it be possible for OrangeFactor to report a total number of failures for the bug _and_ all its dupes/dependencies? Ed, Geoff, is something like that available? (Not really sure who to n-i, but maybe you guys know about this stuff...)
Flags: needinfo?(gbrown)
Flags: needinfo?(emorley)
Hi! Bug 1299274 is filed for that.

(Un-CCing from here to reduce bot/dependency spam; let's continue any discussions over in bug 1299274 :-))
Flags: needinfo?(emorley)
Flags: needinfo?(gbrown)
Blocks: 1441640
Blocks: 1442166
Blocks: 1407506
Blocks: 1442175
No longer depends on: 1442175
Blocks: 1395036
Blocks: 1442639
Blocks: 1404898
Blocks: 1405183
See Also: → 1449996
I am not clear who should work on this- there are a LOT of intermittents every week as a result of this bug.  I believe this is a graphics issue and we would need someone from :milan's team to look into this.  If it isn't please call it out so we can get the right team looking at this bug.  8 months with no ownership of this bug, I think we need to disable tests, hide them, or fix the tests/product- ideally fix the root cause.
Flags: needinfo?(jfkthame)
Whiteboard: [gfx-noted] → [gfx-noted][PI:April]
Yes, I believe this is a graphics issue (as per comment 25 and following), and a solution would need to come from the Gfx team figuring out what we can do to mitigate the resource pressure, or whatever is failing.

Or we could give up testing (at least of text drawing) on Win7.

If DirectWrite is so unreliable on our test infrastructure, such that we get this rate of intermittent failures to draw some of the text, should we really be shipping it to users? Or would they be better served by just disabling DW on Win7 and using the old GDI font code? Milan...?
Flags: needinfo?(jfkthame)
(In reply to Jonathan Kew (:jfkthame) from comment #41)
> Yes, I believe this is a graphics issue (as per comment 25 and following),
> and a solution would need to come from the Gfx team figuring out what we can
> do to mitigate the resource pressure, or whatever is failing.
> 
> Or we could give up testing (at least of text drawing) on Win7.
> 
> If DirectWrite is so unreliable on our test infrastructure, such that we get
> this rate of intermittent failures to draw some of the text, should we
> really be shipping it to users? Or would they be better served by just
> disabling DW on Win7 and using the old GDI font code? Milan...?

IIRC, Mason was the last person to investigate this, and again, IIRC, it was something to do with the GC not happening soon enough and OOMing because of that, where essentially, there wasn't any great solution for it. So this was not so much an issue with DWrite, in that you're likely to just run into other weird failures even if you sub it out with GDI. It's just in this game of hot-potato, DWrite is the one holding the potato when the music stops.
would adding a "slow" mode to reftest for these tests on win7 fix the issue?  worth a try, but requires a bit of hacking and work- if it sounds reasonable I could do that.
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #43)
> would adding a "slow" mode to reftest for these tests on win7 fix the issue?
> worth a try, but requires a bit of hacking and work- if it sounds reasonable
> I could do that.

If it would plausibly solve the issue, it would be worth trying.

Unfortunately, Mason I think is the only one who could give a definitive answer there, and he is no longer here. Though we might be able to contact him through Milan.
looking over all the bugs that are blocked by this, it seems that we have an equal distribution of opt/pgo/debug or 1/3 each.  Some happen predominately on debug, others more so on opt/pgo.

I pushed to try with a 250ms delay between tests, unfortunately many of the jobs hit the maxruntime (I forgot to adjust that), but just barely and the tests actually ran!

There are 23 real test failures (the ones with stars) in the 600 jobs (20x for all jobs) that ran, and 3 of those are not related to this bug.  

What is interesting is most of the failures are on debug chunk 4 (layout/reftests/w3c-css/received/css-writing-modes/baseline-inline-non-replaced-00*.xht), I see very few of the other failures that we typically see related to this bug.

I believe that adding a --slow-mode in for win7 reftests would reduce the amount of failures we see significantly.  I thought about what directories are most problematic, and I believe it is:
font-features/
font-face/
font-matching/
font-inflation/
svg/text/
text/
text-transform/
w3c-css/received/css-writing-modes/
w3c-css/received/css-writing-modes-3/

while there are other failures, I believe the above directories account for the large majority

to avoid more problems, maybe we move those 9 directories into their own reftest-text job and consider running it in slow-mode on win7 and regular mode elsewhere.

:jfkthame, any other things we should consider?  does this sound reasonable?
Flags: needinfo?(jfkthame)
and an example try push to see what things could look like by splitting out fonts:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=404b295f3b0754baee6a60ce0a49fb5b70f8e5b8
Blocks: 1454270
(In reply to Joel Maher ( :jmaher - limited bugzilla access until May 1st) (UTC+2) from comment #45)
> to avoid more problems, maybe we move those 9 directories into their own
> reftest-text job and consider running it in slow-mode on win7 and regular
> mode elsewhere.
> 
> :jfkthame, any other things we should consider?  does this sound reasonable?

If it'll help cut down the failure rate, that seems worthwhile to me. How badly does "slow-mode" increase the overall time it'll take those test jobs to complete, and/or the cost of running them?
Flags: needinfo?(jfkthame)
Duplicate of this bug: 1457754
Duplicate of this bug: 1455812
Whiteboard: [gfx-noted][PI:April] → [gfx-noted][PI:May]
Duplicate of this bug: 1458451
See Also: → 1458506
I will go forward with working on splitting out the 9 directories into reftest-fonts and having them run slower on win7 only.  The runtime hit will be about 3 minutes/chunk for slow mode and 5 minutes/chunk for job overhead., so 20 chunks or 1 hour for a full set of tests.  It is hard to say what that translates to over time, I would estimate we would run about 7 chunks on average/push, so this would be 8 minutes * 7 jobs = 56 minutes/push.  If we round to 1 hour, then we are looking at ~850 hours/week of additional cpu time or 44,200 hours of cpu time.  I am not sure of the cost/hour for these.

Another solution would be to split these out and for win7 only run them on m-c/m-b/try so we continue to get coverage, but reduce the overhead.  That would be about 120 hours/week of additional load (or 6200+ for the year)

Given the volume of duplicates here, we are probably seeing 100+ failures/week and we can reduce it down to <10/week if we run it everywhere, or reduced to m-c/m-b/try it would be only a couple failures/week!

:jfkthame, any additional thoughts?
:gbrown, any thoughts or preferences?
Flags: needinfo?(milaninbugzilla)
Flags: needinfo?(jfkthame)
Flags: needinfo?(gbrown)
I don't have a better idea.
Flags: needinfo?(gbrown)
I was planning to aggressively random-if them on ESR60, but maybe we should just do that across the board?
I would prefer to not split these on all platforms, but that would be confusing.  I can at least rebase my patches and get the uploaded, cleaned up.
I don't have any better ideas either. (But I'll be glad to see the flood of intermittents brought under control - thanks!)
Flags: needinfo?(jfkthame)
What Tier level is this new reftest-font likely to be?
if we do not do the slow, I would say tier-2; if we do make it run slower (250ms delay between new loads) then we are looking at tier-1 still.
The "Platform breakdown" in comment 70:

> * windows7-32: 208
> * windows7-32-nightly: 9
> * android-4-3-armv7-api16: 3
> * windows10-64-qr: 1
> * windows10-64: 1
> * osx-10-10: 1

indicates that we've started incorrectly starring some random failures on other platforms as this bug. :-\ (Actually, there have been a few occurrences previously, too.)
not sure if I should remove the references from the main reftest.list or not- Happy to do that, or leave it as is.

Here is a try run:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=bc0caef1884ab03e369f5571c44f68800e21b9e5
Assignee: nobody → jmaher
Status: NEW → ASSIGNED
Attachment #8973649 - Flags: review?(gbrown)
Attached patch run win7 fonts in slow mode (obsolete) — Splinter Review
see previous patch for try link.  I am not happy with --run-slow, but I cannot think of a better name.
Attachment #8973650 - Flags: review?(ahalberstadt)
for reference to runtime, I looked at the try push from comment 72 and compared it to the same root revision without my patches.  In total baseline had:
91 jobs (reftest*, not jsreftest)
2311 seconds average runtime/job (10 data points/retriggers per job)

and with the patches we have:
122 jobs (reftest*, not jsreftest)
2398 seconds average runtime/job (10 data points/retriggers per job)

this is <2 minutes added runtime, we are burning time in setup with the additional 31 jobs, but overall that doesn't impact the cost of running the tests.
Comment on attachment 8973649 [details] [diff] [review]
divide reftests into regular and fonts

Review of attachment 8973649 [details] [diff] [review]:
-----------------------------------------------------------------

I am concerned that reftest_writing_modes.list may not be run.

Otherwise, just nits/thoughts -- nothing serious.

::: layout/reftests/reftest.list
@@ +193,5 @@
>  # floats/
>  include floats/reftest.list
>  
> +###################################
> +# These are in reftest_fonts.list

Maybe "These tests have been moved to reftest_fonts.list"?

::: layout/reftests/reftest_fonts.list
@@ +1,1 @@
> +# font-face

Consider adding a comment at the top of this file saying why these tests have been split out of the normal manifests: Provide guidance to someone adding a new test directory, should it go in reftest_font or not?

::: layout/reftests/svg/reftest.list
@@ +12,5 @@
>  
>  # smil / animation tests
>  include smil/reftest.list
>  
> +# in reftest_fonts.list

Make this comment the same as in the other reftest.list files.

::: layout/reftests/w3c-css/received/reftest_writing_modes.list
@@ +1,1 @@
> +== css-writing-modes/abs-pos-non-replaced-icb-vlr-003.xht reference/ref-filled-green-100px-square.xht

What's happening here, where is this manifest linked in?

Consider adding a header comment also.

::: taskcluster/ci/test/reftest.yml
@@ +150,5 @@
> +            default: virtual-with-gpu
> +    chunks:
> +        by-test-platform:
> +            android-4.3-arm7-api-16/debug: 28
> +            android.*: 14

From the try run, it looks like android/opt could run safely in half as many chunks and still have comparable runtimes to regular reftests.

::: testing/mozharness/mozharness/mozilla/building/buildbase.py
@@ +209,5 @@
>                                                  levels=TBPL_WORST_LEVEL_TUPLE)
>  
>          # Account for the possibility that no test summary was output.
>          if self.pass_count == 0 and self.fail_count == 0:
> +            self.error('buildbase.py: No tests run or test summary not found')

This may "shift" some bugs (like 1406799) to a new error message. Consider not changing the error message, or comment on the potentially-affected bugs.
Attachment #8973649 - Flags: review?(gbrown) → review+
Comment on attachment 8973650 [details] [diff] [review]
run win7 fonts in slow mode

Review of attachment 8973650 [details] [diff] [review]:
-----------------------------------------------------------------

Lgtm, r+ with issues fixed.

::: layout/tools/reftest/reftestcommandline.py
@@ +253,5 @@
>                            type=int,
>                            default=3600,
>                            help="Maximum time, in seconds, to run in --verify mode..")
>  
> +        self.add_argument("--run-slow",

In mochitest it's called --run-slower, so let's make it consistent with that.

::: taskcluster/taskgraph/transforms/tests.py
@@ +870,5 @@
> +@transforms.add
> +def enable_webrender(config, tests):
> +    """
> +    Win7 needs time to render fonts, so for the reftest-fonts, add --run-slow
> +    """

Looks like the docstrings here got mixed up.

::: testing/mozharness/scripts/desktop_unittest.py
@@ +174,5 @@
>              "dest": "gpu_required",
>              "default": "False",
>              "help": "Run additional verification on modified tests using gpu instances."}
>           ],
> +        [["--run-slow"], {

Ditto, let's call it --run-slower here too.
Attachment #8973650 - Flags: review?(ahalberstadt) → review+
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/95ccdb87f63f
add a --run-slow mode to reftest to add extra time for win7 font rendering. r=ahal
https://hg.mozilla.org/integration/mozilla-inbound/rev/cac593a84f72
split reftest fonts into seperate suite. r=gbrown
https://hg.mozilla.org/mozilla-central/rev/95ccdb87f63f
https://hg.mozilla.org/mozilla-central/rev/cac593a84f72
Status: ASSIGNED → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
on win7 debug I see a heavy failure rate for w3c-css/received/css-writing-modes/* tests.  It is always different tests, on debug, not opt or pgo.  because of this, we need to look into backing this out.  I had tested on opt for the majority of my testing, and sanity checked on debug- while I saw a failure, I didn't see the same failure repeat and many green jobs.

I would question the need for font tests on win7 in general- I know we want these, do we need reftest-fonts on win7/debug?  If so, I can come up with a solution.
Flags: needinfo?(jfkthame)
@jmaher: Note also that the w3c-css/received/reftest.list file is generated from the failures.list file in w3c-css/received/. By splitting it manually into two files the generation of it is no longer going to work. Also please let me know if you intend to back this out or not - I would like to land bug 1322845 which adds a bunch of annotations and it don't want to rebase it on top of this if this is going to be backed out.
(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #79)
> on win7 debug I see a heavy failure rate for
> w3c-css/received/css-writing-modes/* tests.  It is always different tests,
> on debug, not opt or pgo.

The vast majority of those tests are not really testing anything font-related, they're about completely platform-independent block & inline layout. They fall foul of the issue here because they happen to be written such that they depend on @font-face loading of Ahem, so they're vulnerable to the DirectWrite font machinery failing on us, but that's not really relevant to what they're trying to test.

Given this, I would be OK with just skipping this directory altogether on Win7/Debug, if we're finding that its failure rate is excessive. Maybe that's the easiest way forward here?
Flags: needinfo?(jfkthame)
Hi,

This bug is frequently failing on both autoland and inbound on Windows 7 platforms.

Can you please take a look at it?
Flags: needinfo?(jmaher)
thanks :kats, I am happy to fix the script that generates the .list files to make 2 of them, do you see any problems with it getting merged to the wpt repo?
Flags: needinfo?(jmaher)
AFAIK the script is not pushed upstream to the wpt repo, it's just a local thing we have. If you make changes to it such that it produces the right reftest.list file(s) from failures.list that should be fine. Somebody from the layout team (dholbert, xidorn, dbaron, etc) can review the change.

Also it sounds like this change is not going to get backed out, can you confirm? If so I'll rebase my patchset on top of it.
:kats, My plan is to:
1) disable writing_mode reftests on win7/debug only
2) fix the import-tests.py script to support both .list files

I see that import-tests.py and failures.list are local to mozilla-central; I have ran the script and there are a LOT of changes to the tests, I will focus on just updating the import-tests.py script and make sure it does the right thing.
(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #86)
> I see that import-tests.py and failures.list are local to mozilla-central; I
> have ran the script and there are a LOT of changes to the tests

That's weird. When I ran it yesterday there were only a few changes, it looked like somebody had changed a few ||Android clauses in either the reftest.list or failures.list without changing the other. I think there were only 5 or 6 lines that were out of sync.
it was last run January 17th according to import.log.  I have the latest bits from web-platform-tests and running import-tests.py against that, I see 494 files changed.

My goal isn't to update that, but it is to make sure that we have a working import-tests.py script.  Is there another process for updating the failures.list without the new tests?
(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #88)
> it was last run January 17th according to import.log.  I have the latest
> bits from web-platform-tests and running import-tests.py against that, I see
> 494 files changed.

Ah, maybe the difference is where we're getting the web-platform-tests from. I had cloned it from https://github.com/neerjapancholi/web-platform-tests.git which is the URL indicated in import.log. The HEAD revision at that fork is the same as what is indicated in import.log. If you're getting the latest wpt repo from the W3C org then that might have a different (newer) revision.
(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #88)
> My goal isn't to update that, but it is to make sure that we have a working
> import-tests.py script.  Is there another process for updating the
> failures.list without the new tests?

So I guess the answer to this question is to do what I did and use neejapancholi's clone of the wpt repo at the revision indicated in import.log. That way the "import" is effectively a no-op (except for the handful of Android changes that somebody made) and you can test your changes to the script that way.
Comment on attachment 8974071 [details] [diff] [review]
skip w3c-css font tests on win7/debug, fix import-tests.py

Review of attachment 8974071 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm. There are a couple of test failures in that try push -- anything to worry about?

Note that comment 85 recommends a review from the layout team, which seems like a good idea!
Attachment #8974071 - Flags: review?(gbrown) → review+
Comment on attachment 8974071 [details] [diff] [review]
skip w3c-css font tests on win7/debug, fix import-tests.py

There are some non font failures on my try push, I have done more retriggers to see what is really going on.

:xidorn, can you review this instead of me relying 100% on :gbrown.
Attachment #8974071 - Flags: superreview?(xidorn+moz)
Comment on attachment 8974071 [details] [diff] [review]
skip w3c-css font tests on win7/debug, fix import-tests.py

Review of attachment 8974071 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/reftests/w3c-css/received/reftest_writing_modes.list
@@ -676,5 @@
>  fails-if(Android) == css-writing-modes/sizing-orthog-htb-in-vlr-003.xht css-writing-modes/sizing-orthog-htb-in-vlr-003-ref.xht
>  == css-writing-modes/sizing-orthog-htb-in-vlr-004.xht css-writing-modes/sizing-orthog-htb-in-vlr-004-ref.xht
>  == css-writing-modes/sizing-orthog-htb-in-vlr-006.xht css-writing-modes/sizing-orthog-htb-in-vlr-006-ref.xht
>  == css-writing-modes/sizing-orthog-htb-in-vlr-007.xht css-writing-modes/sizing-orthog-htb-in-vlr-007-ref.xht
> -fails-if(OSX||winWidget||Android) == css-writing-modes/sizing-orthog-htb-in-vlr-008.xht css-writing-modes/sizing-orthog-htb-in-vlr-008-ref.xht

It would probably be better to add the ||Android bit to the failures.list file, since this change was made in https://hg.mozilla.org/integration/mozilla-inbound/rev/ed0c321f39cc on purpose. It looks like the failures.list file was left untouched as an oversight.

This test is actually failing on Android currently, so removing the fails-if clause will probably get your patch backed out.
thanks for mentioning that :kats, always something overlooked, in this case a compounding effect :)
thanks :kats, that cleared up a lot, I have updated the patch to annotate failures.list and re-ran import-tests.py, this simplified the resulting .list files.
Attachment #8974071 - Attachment is obsolete: true
Attachment #8974071 - Flags: superreview?(xidorn+moz)
Attachment #8974134 - Flags: review?(xidorn+moz)
Comment on attachment 8974134 [details] [diff] [review]
skip w3c-css font tests on win7/debug, fix import-tests.py

Review of attachment 8974134 [details] [diff] [review]:
-----------------------------------------------------------------

Since jfkthame is fine disabling this, I'm fine with this change.

I was thinking about whether Python can split an array into two based on a single predicate, but I didn't find anything... So I guess this approach is probably fine.

One small nit:

::: layout/reftests/w3c-css/import-tests.py
@@ +381,5 @@
> +    for t in gTestfiles:
> +        if 'css-writing-modes' in t:
> +            continue
> +        # This adds tests to gTests
> +        add_test_items(t)

It seems this was the only entry for `add_test_items` outside itself. It would probably be better if you just make `tests` a local variable and pass it into `add_test_items`, which should make things easier to reason about.
Attachment #8974134 - Flags: review?(xidorn+moz) → review+
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1ee5b2531836
skip css writing modes on win7/debug and fix import-tests.py to support reftest_writing_modes.list. r=gbrown,xidorn
Joel, could you please take a look? I noticed that this bug is marked as resolved fixed, but we still have a high number of issues after your latest push landed on central. Thanks!
Flags: needinfo?(jmaher)
there is no way we can fix this for good, the goal is to reduce the failure rates- we were at 400+ failures/week (when considering all the other bugs), starting today I would like to see what the new failure rate is going forward- it will take 1.5 weeks to gather enough data.
Status: RESOLVED → REOPENED
Flags: needinfo?(jmaher)
Resolution: FIXED → ---
looking at the win7 failures here, I see a large volume as opt/pgo, we had been focusing on debug last week.  Looking at the 137 windows failures in the last 2 days, win debug has a small number and none of those are in the new fonts suites.  Sadly apply --run-slower to the non font suites doesn't seem to solve this problem for the remaining win7/debug tests.

I looked in detail on the 137 failures:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1392106&endday=2018-05-15&startday=2018-05-13&tree=trunk

and annotated a list of failures and frequency:
https://docs.google.com/spreadsheets/d/1I4K5vwG7rTorF06CZLZA4arfsNIQut5hLAeWhTqbOYc/edit?usp=sharing

I suspect if we moved all of these 68 tests to reftest_fonts, we would have more issues and be reapeating this again (probably at a smaller scale).

A few options:
1) move a big chunk of tests and revisit in a week
2) run reftest_fonts on opt/pgo with --run-slower
3) backout reftest_fonts (and related patches)
4) run all win7 reftests with --run-slower (could be paired with #3)
5) stop running all reftests on windows 7 until we get support from the graphics team (maybe only try/mozilla-central and tier-2 status)
6) other ideas I don't know about

Personally I am leaning heavily towards #3 and #5.  Currently the reftests do not meet the requirements for tier-1 status ( https://wiki.mozilla.org/Sheriffing/Job_Visibility_Policy ), so we need to take this seriously and come up with a solution.
Flags: needinfo?(ryanvm)
Flags: needinfo?(jfkthame)
Flags: needinfo?(dbolter)
Flags: needinfo?(aryx.bugmail)
I agree with backing out reftest-fonts if it isn't helping anything. I think I'd still prefer to mass random-if the affected tests on Win7 (with great prejudice) rather than turning off all Win7 reftests, though.
Flags: needinfo?(ryanvm)
I don't have any better suggestions to offer here. As long as we're supporting Win7 (which I expect will be quite a few years yet), I don't think turning off reftests altogether is wise; that would mean we're basically running blind on one of the major platforms. But as long as we're stressing the system (or the graphics subsystem) to the point of frequent but random/unexplained failures, this is going to be painful.

It didn't used to be this bad, AFAIK. Presumably various things we've done over the past couple of years have meant that we're driving the graphics subsystem harder, and it just can't quite cope. But there's probably no single smoking gun that we could readily switch off; more likely it's an accumulation of incremental changes that have gradually made us more gfx-resource-hungry, and Win7 struggles to keep up.
Flags: needinfo?(jfkthame)
Adding Bas to the NI party.

Bas, based comment 41 and comment 42 I'm wondering if anything comes to mind around resource pressure + DirectWrite == missing glyphs? (You had a thought earlier that maybe sandboxing could be involved -- any leads there?)

Joel in comment 105 you mention 1.5 weeks to measure go forward rates... I'm guessing random-if'ing will mess that up?

I'm assuming actual users are sometimes seeing missing glyphs on windows 7 (32 bit)?
Flags: needinfo?(dbolter) → needinfo?(bas)
(In reply to Jonathan Kew (:jfkthame) from comment #41)
> Yes, I believe this is a graphics issue (as per comment 25 and following),
> and a solution would need to come from the Gfx team figuring out what we can
> do to mitigate the resource pressure, or whatever is failing.
> 
> Or we could give up testing (at least of text drawing) on Win7.
> 
> If DirectWrite is so unreliable on our test infrastructure, such that we get
> this rate of intermittent failures to draw some of the text, should we
> really be shipping it to users? Or would they be better served by just
> disabling DW on Win7 and using the old GDI font code? Milan...?

I want to reply to this noting that we've now switched to Skia, I don't think Chrome ever uses GDI on Skia anymore so it'd probably be a bad idea in the sense that we'd be using codepaths in Skia nobody else is using. But Lee Salzman would know more about this. (Did we ever use GDI with Skia -content-?)

(In reply to David Bolter [:davidb] (NeedInfo me for attention) from comment #113)
> Adding Bas to the NI party.
> 
> Bas, based comment 41 and comment 42 I'm wondering if anything comes to mind
> around resource pressure + DirectWrite == missing glyphs? (You had a thought
> earlier that maybe sandboxing could be involved -- any leads there?)

No, and I do feel that Jonathan's comment 21 is somewhat convincing here.

> Joel in comment 105 you mention 1.5 weeks to measure go forward rates... I'm
> guessing random-if'ing will mess that up?
> 
> I'm assuming actual users are sometimes seeing missing glyphs on windows 7
> (32 bit)?

Well, this is part of the thing, I haven't seen reports of this. Having said that, in near OOM situations in -general- we're aware that whole layers start not showing up and such issues. i.e. much more serious graphics issues could be hiding this.

Having said that the OOM thing sounds.. odd.. we should be regularly doing things in graphics that require more contiguous address space than running a glyph run analysis, and we'd expect those to cause issues much more frequently than this if we're that close to OOM.

Some suggestions as to diagnostic tests:

1) See if this issue occurs on Win7 with 64-bit builds.
2) See if this issue occurs on Win10 with 32-bit builds.
3) If someone can reproduce this locally, step into the Windows DLLs and see roughly 'where' the error code is being generated, symbol names and the surrounding asm should provide a good clue as to what's going on.. (i.e. is it near an open file or allocation call)

Those are roughly all the ideas I have at the moment.
Flags: needinfo?(bas)
Actually, one other simple idea:

4) Diagnostically, add some code along these lines:

if (hr == 0x80070005) {
  void* block = malloc(10 * 1024 * 1024);
  if (block) {
    free(block);
    log("10MB contiguously available");
  } else {
    log("failure to allocate 10MB contiguously");
  }
}

It seems unlikely that if there's 10MB available contiguously this function would ever run into OOM issues.

The only scenario in which I could see that happen, is if somehow this function was already internally creating the alpha texture (but it isn't, as far as I can tell), and somehow believes it needs to be insanely big (which again seems unlikely).
Flags: needinfo?(aryx.bugmail)
Attachment #8973649 - Attachment is obsolete: true
Attachment #8973650 - Attachment is obsolete: true
Attachment #8974134 - Attachment is obsolete: true
Keep in mind that you can't just restore the deleted lines to layout/reftests/w3c-css/received/reftest.list as the annotations have since changed in the reftest_writing_modes.list file. You'll need to merge those back together manually, or you'll clobber the annotation changes.
got it, I can fix that up
updated with small merge conflict fixed for android, and the import-tests.py ran again to pick up :kats' changes for writing-mode reftests.

I reviewed this line by line, and ran it on try for sanity:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=e1053bef4597d2e90e1b6526898ab49769966074
Attachment #8976944 - Attachment is obsolete: true
Attachment #8976944 - Attachment is obsolete: false
Attachment #8976944 - Attachment is obsolete: true
Attachment #8976972 - Attachment is patch: true
Backout by dluca@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ea150e0ca4a9
backout 3 patches (1ee5b2531836, cac593a84f72, 95ccdb87f63f) from bug 1392106 for not fixing font rendering problems.
This is causing the macosx64-qr opt reftest job to permafail. I believe it needs to be split into two or more chunks in order to work. I remember having this problem before I turned those tests on, and by the time I landed it, the font stuff was in a separate job so the remaining non-font job would run in time. Now that everything is back in one job it's too much and times out.
Comment on attachment 8977097 [details]
Bug 1392106 - use 2 chunks for reftests on OS X to prevent timeout with quantumrender.

https://reviewboard.mozilla.org/r/245176/#review251176
Attachment #8977097 - Flags: review?(bugmail) → review+
Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/autoland/rev/d52c1a18a259
use 2 chunks for reftests on OS X to prevent timeout with quantumrender. r=kats
Pushed by aciure@mozilla.com:
https://hg.mozilla.org/mozilla-central/rev/b54f574a1dd2
use 2 chunks for reftests on OS X to prevent timeout with quantumrender. r=kats a=permafail-fix
Pushed by xquan@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f51587925bf1
Followup of bug 1392106 backout to remove reftest_writing_modes.list.
I think 3 or 4 rounds of changes like this and we will have 90% of the issues gone.
Attachment #8979934 - Flags: review?(ryanvm)
Comment on attachment 8979934 [details] [diff] [review]
random-if many tests

Review of attachment 8979934 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with the comment addressed about random-if vs. fails-if precedence in the applicable files. Thanks for working on this, Joel.

::: layout/reftests/font-features/reftest.list
@@ +90,5 @@
>  random-if(!winWidget&&!cocoaWidget) == fwid-spaces.html fwid-spaces-ref.html
>  # Arial/Times New Roman on Win7+/OSX 10.6+ have kerning pairs that include spaces
>  random-if(!winWidget&&!cocoaWidget) fails-if(winWidget||cocoaWidget) != kerning-spaces-arial-nokern.html kerning-spaces-arial-default.html
>  random-if(!winWidget&&!cocoaWidget) fails-if(winWidget||cocoaWidget) == kerning-spaces-arial-kern.html kerning-spaces-arial-default.html
> +random-if(!winWidget&&!cocoaWidget) random-if(/^Windows\x20NT\x206\.1/.test(http.oscpu))  fails-if(winWidget||cocoaWidget) != kerning-spaces-tnr-nokern.html kerning-spaces-tnr-default.html

Yowza this is confusing.

::: layout/reftests/font-matching/reftest.list
@@ +26,5 @@
>  != synthetic-bold-2.html synthetic-bold-2-ref.html
>  
>  # synthetic bold/italic tests
>  != defaultfont-bold.html defaultfont.html
>  != defaultfont-italic.html defaultfont.html 

Remove the trailing whitespace while you're here?

::: layout/reftests/w3c-css/failures.list
@@ +58,5 @@
>  fails css-writing-modes/float-rgt-orthog-vrl-in-htb-003.xht
>  fails css-writing-modes/sizing-orthog-htb-in-vrl-001.xht
>  fails css-writing-modes/sizing-orthog-htb-in-vrl-004.xht
>  fails css-writing-modes/sizing-orthog-htb-in-vrl-013.xht
> +random-if(/^Windows\x20NT\x206\.1/.test(http.oscpu)) fails-if(OSX||winWidget||Android) css-writing-modes/sizing-orthog-htb-in-vlr-008.xht

I'm a bit worried that the fails-if(winWidget) will take precedence over the random-if condition here. Can you please double-check that these are ordered correctly (and obviously applicable to the other ones as well)?
Attachment #8979934 - Flags: review?(ryanvm) → review+
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/aa92307e79b5
random-if many tests on win7 to avoid missing letters. r=RyanVM
thanks for catching that, put random-if at the end of the line as that is what we use.

In 2 more days I want to see what is showing up- hopefully a smaller list.
here is another one for you RyanVM!
Attachment #8980650 - Flags: review?(ryanvm)
Comment on attachment 8980650 [details] [diff] [review]
round 2 of random-if win7 tests

Review of attachment 8980650 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/reftests/transform-3d/reftest.list
@@ +23,5 @@
>  fuzzy-if(gtkWidget,4,200) fuzzy-if(Android,4,300) fuzzy-if(winWidget&&!layersGPUAccelerated,2,100) fuzzy-if(skiaContent,16,100) == preserve3d-5a.html preserve3d-5-ref.html
>  == preserve3d-6a.html preserve3d-6-ref.html
>  == scale3d-z.html scalez-1-ref.html
> +fuzzy-if(winWidget,143,689) fuzzy-if(OSX,224,924) fuzzy-if(winWidget&&stylo,154,644) random-if(/^Windows\x20NT\x206\.1/.test(http.oscpu)) == scale3d-all.html scale3d-1-ref.html # subpixel AA
> +fuzzy-if(winWidget,143,689) fuzzy-if(OSX,224,924) fuzzy-if(winWidget&&stylo,154,644) random-if(/^Windows\x20NT\x206\.1/.test(http.oscpu)) == scale3d-all-separate.html scale3d-1-ref.html # subpixel AA

Since we're only testing with Stylo enabled at this point, we could simplify these.
Attachment #8980650 - Flags: review?(ryanvm) → review+
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0b3999ac58f6
random-if many more tests on win7. r=RyanVM
(In reply to Pulsebot from comment #150)
> Pushed by jmaher@mozilla.com:
> https://hg.mozilla.org/integration/mozilla-inbound/rev/0b3999ac58f6
> random-if many more tests on win7. r=RyanVM

In comment 149, I meant we could consolidate the two winWidget conditions into one, not just remove Stylo from the one :)
Flags: needinfo?(jmaher)
I removed the stylo condition in what I landed- I suspect I will be doing a few more rounds of this and can simplify this.
Flags: needinfo?(jmaher)
Shouldn't all these random-if() comments point back to this bug so that we know to remove them when the underlying problem is fixed?
Flags: needinfo?(jmaher)
Duplicate of this bug: 1459799
Duplicate of this bug: 1464731
Duplicate of this bug: 1460138
this is my next attempt taking data from the last 2 calendar days.  Many similar files touched :)

As a note, I started adding in bug # on this patch and fixing up all references in the files I was editing- this review will have a lot of changed lines just for bug #.  I did not add the bug number for the reftest/layout/w3c-css/ files since the import-tests.py script didn't work well with the inline comment.

I will do a query once this lands and add a followup patch to annotate !w3c-css tests with the proper bug number- including all future ones.  This seems a bit pointless or a couple reasons:
1) the majority of annotations in reftest.list file have no bug references
2) this bug has been open for 9+ months with no action from the graphics team, I suspect we will stop testing on win7 in the next couple years and this will never get fixed

despite my two points, it is still a good idea to have bug references.
Flags: needinfo?(jmaher)
Attachment #8981411 - Flags: review?(ryanvm)
Comment on attachment 8981411 [details] [diff] [review]
3rd round of random-if statements

Review of attachment 8981411 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/reftests/bugs/reftest.list
@@ +686,5 @@
>  == 379178-xhtml.xhtml 379178-xhtml-ref.xhtml
>  == 379178-html.html 379178-html-ref.html
>  == 379178-svg.svg 379178-svg-ref.svg
>  fuzzy-if(skiaContent,1,500) == 379316-1.html 379316-1-ref.html
> +fails-if(Android) random-if(cocoaWidget) random-if(/^Windows\x20NT\x206\.1/.test(http.oscpu)) fuzzy-if(winWidget,1,180) fuzzy-if(gtkWidget,1,191) fuzzy-if(skiaContent,8,500) == 379316-2.html 379316-2-ref.html # bug 379786, Bug 1392106

I think ordering with that winWidget fuzzy-if is going to get you in trouble here. Might be worth seeing which platform that fuzzy-if was added for too in case it was for win7-only problems and we can just remove it outright.

Also makes me wonder about the cocoaWidget random-if and skiaContent fuzzy-if :-)

@@ +1815,5 @@
>  == 960277-1.html 960277-1-ref.html
>  fuzzy-if(skiaContent,1,80) == 961887-1.html 961887-1-ref.html
>  == 961887-2.html 961887-2-ref.html
>  == 961887-3.html 961887-3-ref.html
> +pref(layout.css.overflow-clip-box.enabled,true) random-if(/^Windows\x20NT\x206\.1/.test(http.oscpu)) fuzzy(50,145) fuzzy-if(asyncPan&&!layersGPUAccelerated,102,3712) fuzzy-if(webrender,255,51) == 966992-1.html 966992-1-ref.html # Bug 1392106

Will this conflict with the fuzzy(50,145) condition?
Attachment #8981411 - Flags: review?(ryanvm) → review+
and the fun continues
Attachment #8982329 - Flags: review?(ryanvm)
Comment on attachment 8982329 [details] [diff] [review]
4th round of random-if statements

Review of attachment 8982329 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/reftests/border-radius/reftest.list
@@ +50,5 @@
>  fuzzy-if(/^Windows\x20NT\x206\.2/.test(http.oscpu),1,5) == clipping-5-image.html clipping-5-refi.html
>  fuzzy-if(/^Windows\x20NT\x206\.2/.test(http.oscpu),1,5) fuzzy-if(skiaContent,1,77) == clipping-5-overflow-hidden.html clipping-5-ref.html
>  fuzzy-if(/^Windows\x20NT\x206\.2/.test(http.oscpu),1,5) fuzzy-if(Android,5,21) fuzzy-if(skiaContent,1,97) == clipping-5-refi.html clipping-5-ref.html
>  fuzzy-if(true,1,7) fuzzy-if(d2d,55,95) fuzzy-if(cocoaWidget,1,99) fuzzy-if(Android,99,115) fuzzy-if(skiaContent,1,77) == clipping-5-refc.html clipping-5-ref.html # bug 732535
> +fuzzy-if(winWidget,105,71) fuzzy-if(Android,8,469) fuzzy-if(skiaContent,21,74) fuzzy-if(d3d11&&advancedLayers,120,319) fuzzy-if(winWidget,144,335) fuzzy-if(webrender&&cocoaWidget,98-98,279-279) random-if(/^Windows\x20NT\x206\.1/.test(http.oscpu)) == clipping-6.html clipping-6-ref.html # PaintedLayer and MaskLayer with transforms that aren't identical, bug 1392106

This has two different fuzzy-if(winWidget) conditions in it? Can we please clean that up?

::: layout/reftests/selection/reftest.list
@@ +47,5 @@
>  == invalidation-2c.html invalidation-2-ref.html
>  == invalidation-2d.html invalidation-2-ref.html
>  == invalidation-2e.html invalidation-2-ref.html
>  == invalidation-2f.html invalidation-2-ref.html
> +fuzzy(7,2) fuzzy-if(OSX,1,1) fails-if(isDebugBuild&&!browserIsRemote) fails-if(Android) random-if(/^Windows\x20NT\x206\.1/.test(http.oscpu)) needs-focus == rtl-selection-with-decoration.html rtl-selection-with-decoration-ref.html # Bug 1392106

If this already has a fuzzy(7,2) on it, why also a fuzzy-if(OSX,1,1)?
Attachment #8982329 - Flags: review?(ryanvm) → review+
:ryanvm, you ask questions about existing syntax, not my changes :)  I did fix the double fuzzy-if(winWidget) in border-radius.

I am not sure about the fuzzy(7,2) and fuzzy-if(OSX,1,1).  That was added in bug 1399310, and here is the diff:
https://reviewboard.mozilla.org/r/192782/diff/4#index_header
Just because you didn't add the initial insanity doesn't mean you can't clean it up while you're there anyway :)
I just don't know what to clean that up to be, I assume we can remove it, but it was originally added with that, so I am not sure why.  The patch author is not available.  Maybe :jfkthame would know what to do?
Flags: needinfo?(jfkthame)
If I were going to make a guess, I'd say that it's taking advantage of superseding so that OSX has a more restrictive fuzz tolerance than the rest. Let's not worry about it, just leave it :)
Flags: needinfo?(jfkthame)
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c16c9198979a
random-if more win7 tests for missing letters. r=RyanVM
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5b4895ad57fc
Tweak the fuzzing conditions for clipping-6.html to work around win10-qr reftest failures. r=jmaher
(In reply to Jonathan Kew (:jfkthame) from comment #4)
> The failures here seem very bizarre, with just an individual glyph
> disappearing when other glyphs in the same font are fine. So this isn't a
> whole-font failure (like the ones we're currently seeing where a webfont
> apparently hasn't loaded, and we get a fallback instead); it's at the level
> of individual glyphs.
> 
> Maybe some kind of resource constraint deep in the graphics system whereby a
> rasterized glyph gets evicted from a cache, and not reliably regenerated in
> time for when we paint it? Or an entry in a glyph cache is getting corrupted
> somehow?

This failure of layout/reftests/w3c-css/submitted/counter-styles-3/descriptor-pad.html :
https://treeherder.mozilla.org/logviewer.html#?job_id=181212221&repo=mozilla-inbound&lineNumber=12507
makes me somewhat suspicious of that theory, given that the failure involves one of the three ā characters in the test going missing (the first one is missing in the test, and the third is missing in the reference, but in both cases the other two are fine).
Duplicate of this bug: 1427728
Whiteboard: [gfx-noted][PI:May][stockwell disable-recommended] → [gfx-noted][PI:June][stockwell disable-recommended]
and another round
Attachment #8983541 - Flags: review?(ryanvm)
Attachment #8983541 - Flags: review?(ryanvm) → review+
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4ab0fba87583
random-if(win7) reftest font rendering failures. r=RyanVM
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5dcfede51bc7
fix w3c-css received tests manifest conflicts. r=me, a=testonly CLOSED TREE
In the last 7 days, there are 33 failures on this bug.
They occur on windows7-32 (opt, debug, pgo), android-4-3-armv7-api16 (debug).
Recent failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=183295595&repo=mozilla-inbound&lineNumber=35197
Duplicate of this bug: 1451144
this is a whole week worth of failures- not bad!
Attachment #8987789 - Flags: review?(ryanvm)
Comment on attachment 8987789 [details] [diff] [review]
6th round of random-if statements

Review of attachment 8987789 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/reftests/bugs/reftest.list
@@ +1647,5 @@
>  fuzzy-if(d2d,5,1) == 622585-1.html 622585-1-ref.html # bug 789402
>  fuzzy-if(Android,8,300) fuzzy-if(skiaContent,1,40000) == 625409-1.html 625409-1-ref.html
>  == 627393-1.html about:blank
>  fuzzy-if(skiaContent,1,500) == 630835-1.html about:blank
> +random-if(/^Windows\x20NT\x206\.1/.test(http.oscpu)) == 631352-1.html 631352-1-ref.html # bug 1309426

Is this the right bug #?

::: layout/reftests/css-break/reftest.list
@@ +1,5 @@
>  default-preferences pref(layout.css.box-decoration-break.enabled,true)
>  
>  == box-decoration-break-1.html box-decoration-break-1-ref.html
>  fuzzy(1,20) fuzzy-if(skiaContent,1,700) fuzzy-if(webrender,21-26,8910-12357) == box-decoration-break-with-inset-box-shadow-1.html box-decoration-break-with-inset-box-shadow-1-ref.html
> +skip-if(verify) random-if(/^Windows\x20NT\x206\.1/.test(http.oscpu)) fuzzy(45,460) fuzzy-if(skiaContent,57,439) fuzzy-if(Android,57,1330) == box-decoration-break-with-outset-box-shadow-1.html box-decoration-break-with-outset-box-shadow-1-ref.html # Bug 1386543, bug 1392106

Please verify that the generic fuzzy() here won't supersede the random-if. Also, Win7 Ru is skiaContent:true, so be careful with accidental superseding from that as well.

::: layout/reftests/svg/text/reftest.list
@@ +200,4 @@
>  fuzzy-if(skiaContent,1,100) fuzzy-if(webrender,1-1,575-575) needs-focus fuzzy-if(webrender&&!gtkWidget,134-148,294-318) == simple-bidi-selection.svg simple-bidi-selection-ref.html
>  fuzzy-if(skiaContent,1,50) fuzzy-if(webrender,1-1,237-237) needs-focus fuzzy-if(webrender&&!gtkWidget,127-148,253-254) == simple-fill-color-selection.svg simple-fill-color-selection-ref.html
> +random-if(/^Windows\x20NT\x206\.1/.test(http.oscpu)) fuzzy-if(skiaContent,1,150) fuzzy-if(webrender,1-1,222-222) needs-focus fuzzy-if(webrender&&!gtkWidget,127-148,251-254) == simple-underline-selection.svg simple-underline-selection-ref.html # Bug 1392106
> +random-if(/^Windows\x20NT\x206\.1/.test(http.oscpu)) fuzzy-if(skiaContent,1,300) fuzzy-if(webrender,1-1,934-934) needs-focus fuzzy-if(webrender&&!gtkWidget,134-152,494-501) == multiple-text-selection.svg multiple-text-selection-ref.html # Bug 1392106

Same skiaContent:true comment as before.
Attachment #8987789 - Flags: review?(ryanvm) → review+
thanks for the tips- I just moved random-if to the end of the line to accommodate for these changes and fixed the bug number.
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a34b5d6774a9
random-if more win7 reftest font rendering failures. r=RyanVM
Duplicate of this bug: 1473886
Duplicate of this bug: 1474136
Duplicate of this bug: 1474144
Duplicate of this bug: 1474161
Duplicate of this bug: 1474220
First round of folded-up ESR60 backports:
https://hg.mozilla.org/releases/mozilla-esr60/rev/92812f2cb1b6
Target Milestone: mozilla62 → ---
Duplicate of this bug: 1476120
Duplicate of this bug: 1475923
Duplicate of this bug: 1462747
Duplicate of this bug: 1462750
Duplicate of this bug: 1476471
Duplicate of this bug: 1476618
Duplicate of this bug: 1470834
Duplicate of this bug: 1460150
Duplicate of this bug: 1479402
Duplicate of this bug: 1480629
There are 99 failures in the last 7 days.
They occur on windows7-32 (opt, pgo, debug), windows7-32-devedition (opt), windows7-32-msvc (opt, debug), windows7-32-nightly (opt).
Recent failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=193582623&repo=mozilla-beta&lineNumber=24187
Duplicate of this bug: 1462245
Duplicate of this bug: 1472245
Duplicate of this bug: 1484729
Duplicate of this bug: 1485165
random-if statements for the last 2 days worth of annotated failures
Comment on attachment 9003435 [details]
Bug 1392106 - random-if more test cases for windows 7 letter rendering failures. r=RyanVM

Ryan VanderMeulen [:RyanVM] has approved the revision.
Attachment #9003435 - Flags: review+
Pushed by rvandermeulen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b168deee0bcf
random-if more test cases for windows 7 letter rendering failures. r=RyanVM
Backed out changeset b168deee0bcf (bug 1392106) for failures on canvas/1304353-text-global-composite-op-1.html on a CLOSED TREE

Backout link: https://hg.mozilla.org/integration/autoland/rev/c847a2f8b428fa60e5d74bb0db824b88e17dff8b

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=b168deee0bcf1a33737941eb78acc065a1c9c053

Log link: https://treeherder.mozilla.org/logviewer.html#?job_id=195520676&repo=autoland&lineNumber=3667

Log snippet: 

13:54:19     INFO - REFTEST TEST-START | file:///Z:/task_1535031549/build/tests/reftest/tests/layout/reftests/canvas/1304353-text-global-composite-op-1.html == file:///Z:/task_1535031549/build/tests/reftest/tests/layout/reftests/canvas/1304353-text-global-composite-op-1-ref.html2
13:54:19     INFO - REFTEST TEST-LOAD | file:///Z:/task_1535031549/build/tests/reftest/tests/layout/reftests/canvas/1304353-text-global-composite-op-1.html | 68 / 74 (91%)
13:54:19     INFO - REFTEST TEST-LOAD | file:///Z:/task_1535031549/build/tests/reftest/tests/layout/reftests/canvas/1304353-text-global-composite-op-1-ref.html2 | 68 / 74 (91%)
13:59:19     INFO - REFTEST TEST-UNEXPECTED-FAIL | file:///Z:/task_1535031549/build/tests/reftest/tests/layout/reftests/canvas/1304353-text-global-composite-op-1.html == file:///Z:/task_1535031549/build/tests/reftest/tests/layout/reftests/canvas/1304353-text-global-composite-op-1-ref.html2 | load failed: timed out after 300000 ms waiting for 'load' event for file:///Z:/task_1535031549/build/tests/reftest/tests/layout/reftests/canvas/1304353-text-global-composite-op-1-ref.html2
13:59:19     INFO - REFTEST INFO | Saved log: START file:///Z:/task_1535031549/build/tests/reftest/tests/layout/reftests/canvas/1304353-text-global-composite-op-1.html
13:59:19     INFO - REFTEST INFO | Saved log: [CONTENT] OnDocumentLoad triggering AfterOnLoadScripts
13:59:19     INFO - REFTEST INFO | Saved log: Initializing canvas snapshot
13:59:19     INFO - REFTEST INFO | Saved log: DoDrawWindow 0,0,800,1000
13:59:19     INFO - REFTEST INFO | Saved log: [CONTENT] RecordResult fired
13:59:19     INFO - REFTEST INFO | Saved log: RecordResult fired
13:59:19     INFO - REFTEST INFO | Saved log: START file:///Z:/task_1535031549/build/tests/reftest/tests/layout/reftests/canvas/1304353-text-global-composite-op-1-ref.html2
Flags: needinfo?(jmaher)