Closed Bug 550163 Opened 14 years ago Closed 10 years ago

text/font-related reftest failures on Windows 7

Categories

(Core :: Layout: Text and Fonts, defect)

x86
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla2.0

People

(Reporter: jfkthame, Assigned: jfkthame)

Details

Attachments

(22 files, 4 obsolete files)

464.25 KB, application/octet-stream
Details
5.82 KB, patch
roc
: review+
Details | Diff | Splinter Review
1.55 KB, patch
smontagu
: review+
Details | Diff | Splinter Review
1.72 KB, patch
roc
: review+
Details | Diff | Splinter Review
1.19 KB, patch
roc
: review+
Details | Diff | Splinter Review
860 bytes, patch
roc
: review+
Details | Diff | Splinter Review
1.24 KB, patch
roc
: review+
Details | Diff | Splinter Review
1.12 KB, patch
roc
: review+
Details | Diff | Splinter Review
2.71 KB, patch
roc
: review+
Details | Diff | Splinter Review
1.05 KB, patch
roc
: review+
Details | Diff | Splinter Review
1009 bytes, patch
roc
: review+
Details | Diff | Splinter Review
1.10 KB, patch
roc
: review+
Details | Diff | Splinter Review
2.63 KB, patch
roc
: review+
Details | Diff | Splinter Review
1.34 KB, patch
roc
: review+
Details | Diff | Splinter Review
2.36 KB, patch
roc
: review+
Details | Diff | Splinter Review
4.02 KB, patch
roc
: review+
Details | Diff | Splinter Review
7.07 KB, patch
roc
: review+
Details | Diff | Splinter Review
3.56 KB, patch
roc
: review+
Details | Diff | Splinter Review
4.42 KB, patch
roc
: review+
Details | Diff | Splinter Review
1.75 KB, patch
Details | Diff | Splinter Review
2.87 KB, patch
roc
: review+
Details | Diff | Splinter Review
1.18 KB, patch
roc
: review+
Details | Diff | Splinter Review
Running a trunk build on Windows 7, without enabling the new dwrite/d2d features, I currently see 35 unexpected failures (and one unexpected pass), as shown in the attached log.

Most of these are spurious failures caused by subpixel-AA drawing/clipping issues, and might ultimately be resolved when we handle visual overflow areas better. Some are also related to Windows font updates (e.g., fonts that have had kerning added in Windows 7).

Where the failures are not related to the actual purpose of the test, but just artifacts of the updated font rendering, we should adjust them to make them more robust if possible.

Cleaning up these issues will make it easier to isolate and analyze the changes that result when we switch on dwrite and d2d rendering.
The Arabic text with LTR override (test 2) fails because Uniscribe does not shape overridden Arabic correctly; nothing we can do about that. Marking it random as it actually passes on current tinderboxes, but only because of inadequate font support, I think.
Assignee: nobody → jfkthame
Attachment #430565 - Flags: review?(roc)
The Hebrew text is problematic because there's no guarantee that a font will position separate diacritics identically to how they're placed in the precomposed "presentation form" characters.
Attachment #430566 - Flags: review?(roc)
The text change here is to avoid the word "Question", because even with the sans-serif font we get a problem due to *vertical* subpixel AA occurring at the tip of the stroke of the "Q".
Attachment #430581 - Flags: review?(roc)
I don't actually understand *why* this was failing with the original, small text, but using a bigger font makes the problem go away. (Hmm, maybe because it triggers our "quality" path?) We could raise a bug on the issue, though, if we want to investigate further.
Attachment #430583 - Flags: review?(roc)
Switching from serif to sans-serif makes sense in general, I think, but I'm not so keen on just changing the text or its size to avoid bugs. It smells like something that happens to work for the particular systems and fonts we have but won't necessarily be an improvement for others, might in fact be worse for others.

Are any of these other tests fixed by using the 'Ahem' font instead (without breaking the test)? Or 'monospace', which seems even less likely than 'sans-serif' to be affected by subpixel issues? Or can we use padding in some of them, which is really the best solution?
(In reply to comment #20)
> Are any of these other tests fixed by using the 'Ahem' font instead (without
> breaking the test)? Or 'monospace', which seems even less likely than
> 'sans-serif' to be affected by subpixel issues? Or can we use padding in some
> of them, which is really the best solution?

We can't use Ahem to reftest text changes like capitalization (part 17), as it will make them undetectable in the rendered output! I suppose it would be ok in theory for fallback past symbol fonts (18), but in general I'm reluctant to use it where it will make it difficult to see what's happening when actually viewing the test files.

Monospace is not particularly "safer" than sans; it does avoid kerning, which is why I used it for some of the bidi tests where Windows applies kerning to "natural" text but not to direction-override text. But as far as subpixel AA is concerned, it is just as susceptible to problems. I suppose if we used a monospace font, AND changed the text to be entirely "narrow" glyphs such as "i" and "l", it might be ok - though typical "typewriter" fonts give these very long serifs so that even they almost fill the glyph cell.

It's true that changing the text to avoid subpixel issues is not guaranteed to be a universal solution. But by understanding where the problems are occurring, I think it's possible to choose test characters that are much less likely to cause problems. In Part 17 (capitalization), the problem is the stroke of the Q, which descends slightly below the baseline and gets vertical AA on the bottom; this gets clipped inconsistently because the test and reference do their drawing in different chunks. So choosing text that doesn't have a downward-projecting point like this improves our chances. (Properly handling ink overflow areas should eliminate issues like this, as we'll be able to pad our drawing area to include the AA pixels without the risk of affecting layout.)

In part 12, the problem occurs where we have a "tt" in the text, and one of the test lines draws this in two separate operations, breaking between the t's. Because the "arms" of the letters come very close together in typical fonts, their subpixel AA areas end up overlapping, and we get a slightly different result from combining the two separate drawing operations than ClearType gives us for a continuous string. This isn't really a bug, IMO, it's just a fact of life. We can't use padding to give extra space here, as we're comparing to a continuous string of text, so the simplest solution is to use text where the glyphs at the "break" don't have elements that project so close to their neighbors.

So I think the textual changes in parts 12 and 17 are reasonable, and unlikely to make things worse anywhere. The most questionable one, IMO, is the size change in part 18. There's nothing "wrong" with this as such - the size of the text is completely irrelevant to the test - but I don't understand the original failure, which does bother me.
Comment on attachment 430566 [details] [diff] [review]
part 2 - resolve Win7 failure in bidi-004 reftest

Simon, could you see if you think the changes to the bidi-004 reftests are ok. I've changed the literal characters to entities, to make it easier to tell what's actually present, but more significantly, I've also removed the ALEPH + PATAH pair from the test.

The underlying issue here is that with the available fonts on Windows 7 (at least), the result of ALEPH + PATAH (as separate characters) positions the PATAH glyph slightly differently from how it appears in the presentation form glyph for U+FB2E, and so our comparison fails. The same discrepancy can be seen if I put the sequence <ALEPH, PATAH> and the presentation form <ALEPH WITH PATAH> into WordPad; it's a font issue, not a Gecko error.

In general, reftesting by comparing the rendering of a unicode combining sequence and a precomposed character is as much a test of the fonts and the OS text system as it is a test of Gecko; but it happens that this test works OK with the default Windows fonts if we remove that particular element from it.
Attachment #430566 - Flags: review?(roc) → review?(smontagu)
(In reply to comment #21)
> In part 12, the problem occurs where we have a "tt" in the text, and one of the
> test lines draws this in two separate operations, breaking between the t's.
> Because the "arms" of the letters come very close together in typical fonts,
> their subpixel AA areas end up overlapping, and we get a slightly different
> result from combining the two separate drawing operations than ClearType gives
> us for a continuous string. This isn't really a bug, IMO, it's just a fact of
> life. We can't use padding to give extra space here, as we're comparing to a
> continuous string of text, so the simplest solution is to use text where the
> glyphs at the "break" don't have elements that project so close to their
> neighbors.

What if we use letter-spacing?
Attachment #430566 - Flags: review?(smontagu) → review+
Comment on attachment 430566 [details] [diff] [review]
part 2 - resolve Win7 failure in bidi-004 reftest

The only problem with doing this is that we end up without sufficient testing for cases that we care about. I take your point that the test as it stands is as much a test of the font as the rendering system, but we need to think about a better way to test diacritic placement.
BTW, it might be worth experimenting with enabling bidi-004.html on other platforms. IIRC, the reason that I disabled it on Mac was that the ALEF WITH PATAH was being displayed in a different font, so maybe it will work there now.
(In reply to comment #25)
> BTW, it might be worth experimenting with enabling bidi-004.html on other
> platforms. IIRC, the reason that I disabled it on Mac was that the ALEF WITH
> PATAH was being displayed in a different font, so maybe it will work there now.

Right; with this change, it now "unexpectedly" passes on Mac, at least on tryserver. I realize removing the ALEF WITH PATAH weakens the test somewhat, but we do still have several other characters there, so it's better than nothing.
Yes, using letter-spacing is a neat alternative for part 12; by adding this, we avoid the antialiasing of the adjacent glyphs interfering with each other and thus becoming dependent on the exact details of the painting sequence.
Attachment #430576 - Attachment is obsolete: true
Attachment #430963 - Flags: review?(roc)
Attachment #430576 - Flags: review?(roc)
Letter-spacing can also be used in part 17, allowing us to stay with the default (serif) font and thus avoid the vertical antialiasing problem that occurs with the sans-serif capital Q.

(The fact that we don't get any vertical AA issues with the serif font and this particular text is of course just a lucky break. Until we really fix the paint overflow areas, this stuff will remain rather fragile and system-/font-dependent.)
Attachment #430581 - Attachment is obsolete: true
Attachment #430969 - Flags: review?(roc)
Attachment #430581 - Flags: review?(roc)
Oops, inadvertently included hacks to reftest.list in the previous version of the patch. Sorry for the spam.
Attachment #430969 - Attachment is obsolete: true
Attachment #430980 - Flags: review?(roc)
Attachment #430969 - Flags: review?(roc)
In part 18 (reftest bug-399636), the issue is that the reference files use the "fast" text path, without kerning, etc, but the testcases are hitting the "quality" path and on Win7, the font includes kerning that affects the glyph positions. That's why enlarging the text resolved the failure: at larger sizes, we always use the quality path.

So an alternative fix is to add text-rendering:optimizeLegibility to the styles, so that we get consistent kerning. I'm not actually sure why the testcases (where font fallback is occurring) are triggering the quality path even at smaller sizes - is that a bug, or expected behavior?
Attachment #430583 - Attachment is obsolete: true
Attachment #430988 - Flags: review?(roc)
Attachment #430583 - Flags: review?(roc)
Comment on attachment 430988 [details] [diff] [review]
part 18, v2 - force "quality" text mode to ensure consistent kerning in reftest 399636

I don't know why the testcases are taking the quality path.
Attachment #430988 - Flags: review?(roc) → review+
After pushing part 8, we're seeing an intermittent failure on tinderbox machines; the issue is clipping of a few antialiasing pixels on the top of the curve of the 'f'. This has only happened on Windows Debug machines; presumably has something to do with how the page gets updated when the font loads, and the Opt builds don't show it because they run faster.

As a possible fix, I suggest we add some padding to the testcase in the hope that this will prevent the clipping. (I can't reproduce it locally or on tryserver for testing.)
Attachment #431094 - Flags: review?(roc)
Log showing the failure on Windows debug reftest:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1268055646.1268058279.9999.gz
Hmm, also happened on Opt reftest, although we've had several successful runs:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1268061222.1268062200.24044.gz
Comment on attachment 431094 [details] [diff] [review]
part 8 followup - add padding to try and fix intermittent Rd failure in reftest 385569-1.

Landed this per ted's advice on irc, as we had several (intermittent) failures - hoping it will help.
http://hg.mozilla.org/mozilla-central/rev/1041bbdd97ea
Attachment #431094 - Flags: review?(roc)
This feels rather hacky, but ought to be ok as a workaround (until we fix ink overflow properly!) -- the reftest fails because of *vertical* antialiasing overflow at the tips of diagonal strokes. This isn't normally a problem, but the use of bidi override controls causes separate direction runs to be created, and the glyphs are then painted in isolation and clipped too tightly.

I've seen the same failure on Vista, too; it's not new with Win7. But it's likely to become an issue on tinderbox when we get Win7 test boxes.

This patch works around the problem by adding vertical bar (|) characters that do not affect the bidi behavior or mirroring (which is what's being tested here), but ensure that the tips of the < > glyphs are not the vertical extremes of the run that's being painted.

(Note that the issue doesn't occur if you view the reftest in a normal browser window; it only happens on the Cairo fallback path that gets used in the reftest framework. See bug 559879.)
Attachment #439591 - Flags: review?(roc)
The reftest text/cgj-01.html fails on Windows 7 because the default font used has a visible glyph for Combining Grapheme Joiner.

The test is currently mixing two issues: (1) CGJ should not be visible; (2) presence of CGJ should not affect the advance of the preceding cluster.

I suggest modifying the test to specify the Calibri font (if available - this will only be relevant on Windows), as this has an invisible glyph for CGJ and thus allows us to test only that CGJ does not affect the surrounding glyph placement.

We should then have a separate bug and testcase (which will currently fail on Win7) about ensuring that display of CGJ is always suppressed, even if the font has a visible glyph.
Attachment #453451 - Flags: review?(roc)
Yes, looks like we're done here.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.0
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: