<a class="header-button" href="https://bugzilla.mozilla.org/home" title="Go to home page"> Bugzilla

Comment 5

•

8 years ago

No, I didn't see anything obvious there, either. Given how sporadic this is, and that it seems to be able to hit any test, I'm guessing that maybe it happens when a particularly slow GC (or something like that) happens at a bad moment and causes excessive jank at exactly the wrong time, such that we paint before the font load and following restyle is finished. Perhaps the risk was always there, but some recent changes affecting timing, speed, parallelism, etc., mean it has gone from being a one-in-a-million chance to one-in-a-thousand. I wonder if we should be checking something like document.fonts.status != 'loading' before allowing the snapshot to be taken?

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Updated

•

8 years ago

Blocks: 1391492

Comment hidden (Intermittent Failures Robot)

Updated

•

8 years ago

Blocks: 1391762

Comment 7

•

8 years ago

Replying to bug 1391762 comment 7 (moving discussion here as it's the more general bug): > It's not clear to me why these font load issues weren't an issue before, but > presumably we need to teach the reftest harness to wait for downloadable > font loads to complete. That was my initial thought, but I don't think it's really an explanation. We have some tests that involve a -lot- of font loads that take multiple seconds (e.g. a bunch of tests in layout/reftests/font-matching/stretchmapping-* and weightmapping-*), and when running these tests locally you can watch as the various fonts get loaded and the rendering "evolves" as different elements of the testcase get painted. If the harness weren't waiting for font loads, ISTM this would always be failing - but it doesn't, the harness does wait until the fonts are loaded and then takes the snapshot it needs. AFAICS, the tests have never done anything special to cause the harness to wait for fonts, nor to ensure "early" loading of @font-face resources. I assume, therefore (but David, you may know better!) that what happens is something along the lines that any font loads triggered during initial reflow will block the load event, as they add to the page's dependent resources, and reftest is waiting for onload to fire before it proceeds. It's also worth noting bug 1391515 comment 5. The example there is different in that it doesn't involve any webfonts, only system-installed fonts, yet seems similar in that one font face apparently failed to be used in the testcase (although its bold face worked, and both regular and bold faces worked in the reference file immediately afterwards). So perhaps the (highly intermittent) failure we're seeing is not really specific to webfonts at all, but a more general failure in styling or in font instantiation. Or maybe that was an unrelated case.

Cameron McCormack (:heycam)

Comment 8

•

8 years ago

(In reply to Jonathan Kew (:jfkthame) from comment #7) > We have some tests that involve a -lot- of font loads that take multiple > seconds (e.g. a bunch of tests in > layout/reftests/font-matching/stretchmapping-* and weightmapping-*), and > when running these tests locally you can watch as the various fonts get > loaded and the rendering "evolves" as different elements of the testcase get > painted. FWIW, one of the best tests for watching this (in that it takes a long time to load its fonts) seems to be layout/reftests/font-matching/font-stretch-1.html, which loads a whole collection of differently-styled DejaVu faces, and -- at least when running my local opt+debug build -- can be seen gradually re-rendering various elements as the font faces come in. The onload event doesn't fire until all the loads have completed (and so multiple rounds of reflow and paint have happened), and only then does the reftest harness grab the result.

Comment hidden (Intermittent Failures Robot)

Comment 10

•

8 years ago

Has this only been since stylo was switched on by default?

Comment hidden (Intermittent Failures Robot)

Assignee

Comment 12

•

8 years ago

this started about 2 weeks after stylo was on by default and from the bugs I have seen this is windows 7 only (including default + stylo-disabled)

Assignee

Updated

•

8 years ago

Blocks: 1383422, 1393256, 1386941, 1383476, 1383577

Comment 13

•

8 years ago

(In reply to Cameron McCormack (:heycam) from comment #10) > Has this only been since stylo was switched on by default? I don't think so; AFAICS, that happened around 09/06, but this was filed a few days before that (and refers to "previous failures" that had already been happening). Or was stylo enabled significantly earlier than that? It does seem to have become a big problem in roughly the same timescale as stylo getting enabled, but some failures are showing up on non-stylo test runs as well, so I'm not sure whether there's a real connection or just an unfortunate correlation. It's also unclear to me at this point whether the problem here is really style- or reflow-related at all, or if it's a Graphics-level problem whereby we're getting occasional font instantiation failures (most commonly with a newly-loaded webfont, where timing might easily come into play somehow, but there are some examples that appear to show failure with a system font, too).

Assignee

Updated

•

8 years ago

Blocks: 1383572

Assignee

Updated

•

8 years ago

Blocks: 1389321

Assignee

Updated

•

8 years ago

Blocks: 1390621

Assignee

Updated

•

8 years ago

Blocks: 1390719

Assignee

Updated

•

8 years ago

Blocks: 1400993

Assignee

Comment 14

•

8 years ago

as a note, I am only looking at font related bugs that show up 30+ times/week, there could be many more.

Assignee

Updated

•

8 years ago

Blocks: 1406444

Assignee

Updated

•

8 years ago

Blocks: 1319064

Assignee

Comment 15

•

8 years ago

I did some experiments by running reftest in 32 chunks instead of 8: https://treeherder.mozilla.org/#/jobs?repo=try&author=jmaher@mozilla.com&fromchange=6ac496c794145ddeaa366bb20d767030bb2683e5&tochange=6fe4911192ec13f2a8541146cda206c70abb8c24 you can see many more failures with 8 chunks- should we consider doing this approach? This would be for windows 7.

Comment 16

•

8 years ago

(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #14) > as a note, I am only looking at font related bugs that show up 30+ > times/week, there could be many more. Indeed, there are lots on file that only show up a handful of times - perhaps only once - but look like they're the same underlying problem. In general, I think testcases that use more fonts, particularly @font-face resources, are much more likely to break, but pretty much any reftest with text in it -might- get unlucky. I'm not surprised that running in smaller chunks is helpful. It looks like with that change, src-list-local-full is the worst remaining offender (although it's still far from consistent).

Comment 17

•

8 years ago

Interesting that your try run shows inconsistency even within how src-list-local-full fails. The common(er) case is that the first chunk of the testcase fails to load the expected src:local font, and falls back to serif: https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://queue.taskcluster.net/v1/task/TFZ5vyScRBqj_9FzNWWQfw/runs/0/artifacts/public/logs/live_backing.log&only_show_unexpected=1 but in one run, the first -two- chunks of the test both failed: https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://queue.taskcluster.net/v1/task/BP0OcVXzSiW6cKmoLI0Y3Q/runs/0/artifacts/public/logs/live_backing.log&only_show_unexpected=1 Just an observation; I don't see that it particularly helps us figure out the nature of the failure.

Comment 18

•

8 years ago

Updating the title here to be more generic, as not all the failures relate to loading WOFF fonts (that's just a particularly common scenario where it shows up); the underlying issue seems to be wider than that.

Summary: Intermittent race between taking the reftest and loading the WOFF → Intermittent Win7 reftest failures due to failure to load/use the expected font [was: Intermittent race between taking the reftest and loading the WOFF]

Assignee

Comment 19

•

8 years ago

another technique would be to run the font-face tests in their own suite, then we could make it tier-2 until this is fixed. It is odd that if we split it into chunks and disable src-list-local-full that we would have a low intermittency rate. Maybe less time the browser is up and running before starting the font-face tests. Also why is this windows 7 only, not on windows7 debug either?

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 20

•

8 years ago

(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #19) > another technique would be to run the font-face tests in their own suite, > then we could make it tier-2 until this is fixed. Note that although this is most common among @font-face tests, there are also scattered occurrences all over the place in other reftest directories. So while demoting these tests would probably reduce the noise considerably, it wouldn't entirely solve it. > It is odd that if we split it into chunks and disable src-list-local-full > that we would have a low intermittency rate. I'm guessing that disabling that particular test wouldn't actually help much, as we'd tend to get the failure on the following test instead. I notice that src-list-local-full is the first test in your reftest-18 chunk that uses @font-face with src:local(), which means it probably triggers a bunch of work inside the gfxPlatformFontList code to retrieve font names for matching, and maybe that doesn't reliably complete in time. Disabling this test would just mean that the next test that uses src:local() will hit the same issue. > Also why is this windows 7 only, not on windows7 debug either? I don't know, but perhaps it's timing-related; e.g. because the opt builds run faster, they accumulate more cruft from earlier tests in between clean-up by periodic GCs, and so tend to have higher resource-usage peaks? Or if it really is a race condition of some kind (rather than a temporary resource-exhaustion problem), any kind of timing changes (whether faster or slower) could affect it.

Updated

•

8 years ago

Comment 22

•

8 years ago

as it currently stands reftest does not meet the requirements of tier-1 status, I think if we do not have any traction on this bug by end of the week we will mark windows7 reftests as tier-2 and ask the sheriffs to stop starring failures. I believe reftests are stable on other platforms- possibly we can just disable them on windows7 altogether and save some resources?

Comment 23

•

8 years ago

Disabling on Win7 is not an acceptable option. There are significant differences in config between Win7 and Win10 that affect how reftests run (D2D) and losing that coverage isn't OK.

Assignee

Comment 24

•

8 years ago

ok, but this doesn't meet our tree visibility requirements, we need to address this one way or another and I wanted to start that converation here. If the jobs are hidden and not visible by default, will we still get value from the D2D win7 tests running? I am not eager to turn them off, just to solve this problem.

Comment 25

•

8 years ago

How did we go from "let's make them tier-2" to "hidden and not visible by default" in the span of two comments?

Comment 26

•

8 years ago

ISTM a better way forward would be to run them in smaller chunks, as per comment 15; with 32 chunks the failure rate drops substantially, and maybe we'd then be able to selectively disable just the worst remaining offenders on Win7?

Assignee

Comment 27

•

8 years ago

I see a >50% failure rate on rg5: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-searchStr=windows%20reftest%20rg5&fromchange=e22de36807c6229b3710382f0094ffe5e31eaa26 I am happy to do the 32 chunks and then consider disabling some tests, I think we will need to disable the font-face tests for win7 given the fact that the first test is failing and we suspect the first test will often fail.

Comment 28

•

8 years ago

I tried to find a regression range for this, but couldn't find anything conclusive. See https://bugzilla.mozilla.org/show_bug.cgi?id=1386941#c12, for example.

Updated

•

8 years ago

Blocks: 1384502

Updated

•

8 years ago

Blocks: 1407210

Updated

•

8 years ago

Blocks: 1386210

Updated

•

8 years ago

Blocks: 1390900

Updated

•

8 years ago

Blocks: 1386506

Updated

•

8 years ago

Blocks: 1387599

Updated

•

8 years ago

Blocks: 1389357

Updated

•

8 years ago

Blocks: 1404026

Updated

•

8 years ago

Depends on: 1402899

Updated

•

8 years ago

Blocks: 1392144

Updated

•

8 years ago

Blocks: 1387928

Assignee

Comment 29

•

8 years ago

Attached patch run windows 7 reftests in 32 chunks and skip a most frequent failure (obsolete) — Details — Splinter Review

here is a try run, we see 1 failure in 50 on the font tests: https://treeherder.mozilla.org/#/jobs?repo=try&revision=504b68a70e5404c9b2137fef4189ccaad19209b3

Assignee: nobody → jmaher

Status: NEW → ASSIGNED

Attachment #8918401 - Flags: review?(jfkthame)

Comment 30

•

8 years ago

I notice the one failure on each of Opt and PGO occurred on different tests, . Given this, what I'd prefer us to try would be landing the patch to run in 32 chunks, but without the test-disabling line, and see how that goes. If we find we're still getting a significant number of failures on that particular test, we can do a followup to disable it, but I'd like a chance to see how consistent (or not) the failures are with the smaller chunks, before we do so. Note that the next test after the one you disabled (src-list-local-full-quotes.html) loads exactly the same fonts as src-list-local-full.html, and renders the same text; the only difference is the use of quote-marks on the font names in the second version. It's hard to see how one of these would be more reliable than the other, when running in the same environment. So I think we should give it a little time to see how things go in the new smaller-chunks world, before targeting specific test(s) to disable.

Assignee

Comment 31

•

8 years ago

Attached patch run windows 7 reftests in 32 chunks — Details — Splinter Review

here is the patch with just the larger chunks

Attachment #8918401 - Attachment is obsolete: true

Attachment #8918401 - Flags: review?(jfkthame)

Attachment #8918543 - Flags: review?(jfkthame)

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 32

•

8 years ago

Comment on attachment 8918543 [details] [diff] [review] run windows 7 reftests in 32 chunks Review of attachment 8918543 [details] [diff] [review]: ----------------------------------------------------------------- Thanks; let's see how this goes, and what the mix of residual failures looks like afterwards.

Attachment #8918543 - Flags: review?(jfkthame) → review+

Pulsebot

Comment 33

•

8 years ago

Pushed by jmaher@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/f68eef8bbd21 run reftest in smaller browser sessions on windows 7 to reduce font loading/rendering issues. r=jfkthame

Pulsebot

Comment 34

•

8 years ago

Pushed by archaeopteryx@coole-files.de: https://hg.mozilla.org/integration/mozilla-inbound/rev/039963b03830 run reftest in smaller browser sessions on windows 7 to reduce font loading/rendering issues: fix indentation. r=linting-fix

Comment 35

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/f68eef8bbd21 https://hg.mozilla.org/mozilla-central/rev/039963b03830

Status: ASSIGNED → RESOLVED

Closed: 8 years ago

status-firefox58: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla58

Updated

•

8 years ago

status-firefox57: --- → affected

https://hg.mozilla.org/releases/mozilla-beta/rev/c27727efc446

Comment 36

•

8 years ago

bugherder uplift

status-firefox57: affected → fixed

Comment hidden (Intermittent Failures Robot)

Updated

•

8 years ago

Blocks: 1424586

Comment hidden (Intermittent Failures Robot)

Gregory Szorc [:gps]

Updated

•

8 years ago

Blocks: 1431291

Gregory Szorc [:gps]

Comment 40

•

8 years ago

Sweeping this bug under the rug by increasing the chunk count increased our AWS costs by >$250,000 annually. More details in bug 1431291.

Assignee

Comment 41

•

8 years ago

possibly another related bug 1407210