<a class="header-button" href="https://bugzilla.mozilla.org/home" title="Go to home page"> Bugzilla

Comment 6

•

5 years ago

(In reply to Jamie Nicol [:jnicol] from comment #5)

So it looks like remote Direct2D is having issues with its glyph cache perhaps.

Or maybe the webfonts they're using are not being successfully sent across to the canvas process, or it's not waiting for them to fully load before drawing the text? Does the display fix itself if you force it to re-draw, e.g. by zooming?

(As an aside, I have to wonder whether the site is legitimately allowed to use those fonts. They're deploying copies of San Francisco faces that apparently come from an iOS distribution:

 Copyright © 2015-2017 Apple Inc. All rights reserved.  Your use of the San Francisco font is subject to the terms of the applicable iOS Software License Agreement.

I'd be surprised if that license allows extracting them to use as webfonts.)

I guess another possibility, given that these look like fonts from an Apple OS distribution, is that there's something about them that makes them fail in Direct2D because they were not engineered to be cross-platform-compatible, though from a brief look I don't see any obvious reason there'd be an issue.

Jamie Nicol [:jnicol]

Comment 7

•

5 years ago

Your hypothesis seems more likely than mine - it appears to be consistent which text is broken, and a cache issue would probably be more nondeterministic.

Assignee

Comment 8

•

5 years ago

Well ... at least it's easy to reproduce, I'll look into it tomorrow.
Thanks for the report.

Assignee: nobody → bobowencode

Status: NEW → ASSIGNED

Flags: needinfo?(bobowencode)

Assignee

Comment 9

•

5 years ago

(In reply to Jonathan Kew (:jfkthame) from comment #6)
...

I guess another possibility, given that these look like fonts from an Apple OS distribution, is that there's something about them that makes them fail in Direct2D because they were not engineered to be cross-platform-compatible, though from a brief look I don't see any obvious reason there'd be an issue.

Looks like this is the case, the name tables seem to be garbled and we're pulling out single spaces for the font names.
We use that as part of the key to decide if a font is the same during the recording, so if we get two fonts like this with the same length then we mistakenly use the wrong font.

I think the simplest thing is to check that the name isn't just blank, then we will fall back to hashing the whole font file for our key.

Assignee

Comment 10

•

5 years ago

Attached file Bug 1644575: Check for blank names in SFNTNameTable::ReadU16NameFrom*. r=jfkthame! (obsolete) — Details

Comment 11

•

5 years ago

Hmm, that's a bit curious. Adding some checks here is probably a good thing, but I'd also like to check exactly what name we're trying to read here, as I don't see any empty names when looking at a ttx dump. I wonder if our name-table reading is broken in some way. (Or maybe there's something anomalous about the font, but ttx is hiding it from me...) Do you know what specific name ID/platform/encoding we're trying to read (and from which resource) when this happens?

Flags: needinfo?(bobowencode)

Assignee

Comment 12

•

5 years ago

(In reply to Jonathan Kew (:jfkthame) from comment #11)

Hmm, that's a bit curious. Adding some checks here is probably a good thing, but I'd also like to check exactly what name we're trying to read here, as I don't see any empty names when looking at a ttx dump. I wonder if our name-table reading is broken in some way. (Or maybe there's something anomalous about the font, but ttx is hiding it from me...) Do you know what specific name ID/platform/encoding we're trying to read (and from which resource) when this happens?

I don't know specifically, I'd have to debug further to work out what it is trying to read.

Given your previous comment, my suspicion is that this is a Mac font and isn't really a valid web font.
Maybe the name table is encoded using Mac OS Roman for example.

Flags: needinfo?(bobowencode)

Comment 13

•

5 years ago

The name tables in the SFProDisplay fonts they're using include both MacRoman (1:0:0) and Windows Unicode (3:1:0x409) strings, so in principle they ought to be readable AFAICS. Something smells a bit fishy here, and I'd like to get to the bottom of it. Guess I'll kick off a Windows build....

Assignee

Comment 14

•

5 years ago

(In reply to Jonathan Kew (:jfkthame) from comment #13)

The name tables in the SFProDisplay fonts they're using include both MacRoman (1:0:0) and Windows Unicode (3:1:0x409) strings, so in principle they ought to be readable AFAICS. Something smells a bit fishy here, and I'd like to get to the bottom of it. Guess I'll kick off a Windows build....

Hmm, that does seem like we should pick them up then.
I'll try and do some more debugging as well, but it might have to wait until Monday.

Assignee

Comment 15

•

5 years ago

The Windows entries seem to be wrong to me.
As far as I can tell the code is reading them out correctly.

Assignee

Comment 16

•

5 years ago

We attempt to read the Full name entry first: 0003 0001 0409 0004 0002 0073
So length of 1 (2/2), which gives us a space at that offset.

When that fails we try to concatenate the Font family and subfamily names (with a space).
The font family name entry is: 0003 0001 0409 0001 0000 002d
The length is zero so we fail.

Flags: needinfo?(jfkthame)

Comment 17

•

5 years ago

Which specific font is that you're looking at?

Flags: needinfo?(jfkthame) → needinfo?(bobowencode)

Comment 18

•

5 years ago

Ah, looking at this a bit more, I don't think it's actually the Apple SFPro fonts they've taken from iOS that are causing the problem; it's some other font(s) that are embedded in the binary blobs of the pages and loaded by way of creating data: URLs that are handed to the Font Loading API. These don't show up in the Inspector (as they're not linked to DOM content the inspector can traverse, nor retrieved via individual network requests we can watch), so capturing and examining them becomes a bit trickier.

The font that is associated with the garbled text on p27 of https://www.healthcareglobal.com/issuu?id=255, for example, seems to be known as font-family: f27_2 to the document. This looks like the name of a subset font that the desktop publishing program generated to embed in the PDF; the DTP system involved seems to be producing individual per-page subset fonts, and probably munges their original names along the way.

Flags: needinfo?(bobowencode)

Assignee

Comment 19

•

5 years ago

(In reply to Jonathan Kew (:jfkthame) from comment #18)

Ah, looking at this a bit more, I don't think it's actually the Apple SFPro fonts they've taken from iOS that are causing the problem; it's some other font(s) that are embedded in the binary blobs of the pages and loaded by way of creating data: URLs that are handed to the Font Loading API. These don't show up in the Inspector (as they're not linked to DOM content the inspector can traverse, nor retrieved via individual network requests we can watch), so capturing and examining them becomes a bit trickier.

The font that is associated with the garbled text on p27 of https://www.healthcareglobal.com/issuu?id=255, for example, seems to be known as font-family: f27_2 to the document. This looks like the name of a subset font that the desktop publishing program generated to embed in the PDF; the DTP system involved seems to be producing individual per-page subset fonts, and probably munges their original names along the way.

Yes, I had another look this morning and initially couldn't find the font.
However when I look at it in Chrome I see the data: URI fonts.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=170443bdf4160ac191ba6edeae88aab035e45a2e

Assignee

Comment 20

•

5 years ago

Comment 21

•

5 years ago

OK, I extracted and decoded a couple of the embedded subset fonts, and confirmed the <name> table has been munged in the process of generating the PDF from which this canvas drawing is derived. Here's a ttx dump of the names from f27_2:

  <name>
    <namerecord nameID="1" platformID="1" platEncID="0" langID="0x0" unicode="True">
      
    </namerecord>
    <namerecord nameID="2" platformID="1" platEncID="0" langID="0x0" unicode="True">
      Medium
    </namerecord>
    <namerecord nameID="3" platformID="1" platEncID="0" langID="0x0" unicode="True">
      FontForge 2.0 :   : 30-3-2020
    </namerecord>
    <namerecord nameID="4" platformID="1" platEncID="0" langID="0x0" unicode="True">
       
    </namerecord>
    <namerecord nameID="5" platformID="1" platEncID="0" langID="0x0" unicode="True">
      Version  
    </namerecord>
    <namerecord nameID="6" platformID="1" platEncID="0" langID="0x0" unicode="True">
      
    </namerecord>
    <namerecord nameID="1" platformID="3" platEncID="1" langID="0x409">
      
    </namerecord>
    <namerecord nameID="2" platformID="3" platEncID="1" langID="0x409">
      Medium
    </namerecord>
    <namerecord nameID="3" platformID="3" platEncID="1" langID="0x409">
      FontForge 2.0 :   : 30-3-2020
    </namerecord>
    <namerecord nameID="4" platformID="3" platEncID="1" langID="0x409">
       
    </namerecord>
    <namerecord nameID="5" platformID="3" platEncID="1" langID="0x409">
      Version  
    </namerecord>
    <namerecord nameID="6" platformID="3" platEncID="1" langID="0x409">
      
    </namerecord>
  </name>

Whitespace is a bit hard to see in the dump, but what we see is that the Family name (ID 1) is empty; the Full name (ID 4) is just a space; and the PostScript name (ID 6) is also empty. They've also stripped identifying details out of the Unique name (ID 3), which clearly used to have a name in the middle field, and probably the Version (ID 5), as it's now kinda meaningless.

It seems a bit questionable to me whether it ought to be allowed for the PS name, and perhaps Family name, to be empty strings; but I guess the spec does not explicitly prohibit this (AFAIK), so we'd better make sure we handle it. And indeed, it doesn't seem to be breaking anything, except for:

We use that as part of the key to decide if a font is the same during the recording, so if we get two fonts like this with the same length then we mistakenly use the wrong font.

So that's the real issue: we're identifying fonts by a combination of name & length. While that will usually work, this example proves that it's not sufficiently unique, and an "unlucky" document will fail because we get fonts mixed up. In this case, the fact that the font names were blanked out by the generator is not the key issue. The issue is that there are multiple different subsetted versions of the same original font, and it happens that two of the subsets ended up with exactly the same length. So even if the names hadn't been blanked out, the same problem would be happening (and the patch to check for "blank" names wouldn't help in that case).

I think this is a real risk: while in this example the subsetter was apparently configured to blank out font names, we can't assume that will always be the case. I see lots of documents with subsetted fonts that haven't been treated in this way. Or another trick people sometimes try is to replace the font name not with blank but with a single "unusual" character such as a smiley-face emoji, presumably intended to make it more difficult for people to extract usable font resources. Again, multiple same-length subsets of such a font would hit this same issue.

We really need to add some more likely-unique identifying information to the key we're using to identify fonts here. Obviously hashing the whole font file should be pretty safe (hash collisions would be vanishingly unlikely, whereas same-length subsets are much less so), but that's a bit expensive to be doing all the time. Maybe a compromise such as hashing the <head> table would be workable. That includes the OpenType checksumAdjustment field, so in effect it'd give us a checksum (though not a cryptographic hash) of the entire file without having to scan the whole file again. That should reduce the risk of collisions by several orders of magnitude, at least.

Assignee

Comment 22

•

5 years ago

(In reply to Jonathan Kew (:jfkthame) from comment #21)
...

We really need to add some more likely-unique identifying information to the key we're using to identify fonts here. Obviously hashing the whole font file should be pretty safe (hash collisions would be vanishingly unlikely, whereas same-length subsets are much less so), but that's a bit expensive to be doing all the time. Maybe a compromise such as hashing the <head> table would be workable. That includes the OpenType checksumAdjustment field, so in effect it'd give us a checksum (though not a cryptographic hash) of the entire file without having to scan the whole file again. That should reduce the risk of collisions by several orders of magnitude, at least.

I was wondering if we could hash some portion of the file, I was thinking of the table records themselves as they also include checksums.
But as the head table contains a full checksum that sounds like a plan, we could keep the inclusion of the length to create the uint64_t key.
I'll get onto it on Monday, thanks.

Phabricator Automation

Updated

•

5 years ago

Attachment #9157935 - Attachment is obsolete: true

Assignee

Comment 23

•

5 years ago

Attached file Bug 1644575: Change SFNTData::GetUniqueKey to use a hash of head tables within the font data. r=jfkthame! — Details

This is used instead of hashing the first full name.

https://treeherder.mozilla.org/#/jobs?repo=try&selectedTaskRun=KboITLF_SC-oJ5YPMlL7yQ.0&revision=e658de59d9a01b0e53aec5d31abf0ed744462e52

Assignee

Comment 24

•

5 years ago

Assignee

Comment 25

•

5 years ago

Attached file Bug 1644575 part 2: Remove now unused SFNT* files and functions. r=jfkthame! — Details

Depends on D80412

https://treeherder.mozilla.org/#/jobs?repo=try&revision=683b90564dfcd226100c2e34edf063be9a3dfdc5

Assignee

Comment 26

•

5 years ago

Pulsebot

Comment 27

•

5 years ago

Pushed by bobowencode@gmail.com: https://hg.mozilla.org/integration/autoland/rev/81f749c7b1ad Change SFNTData::GetUniqueKey to use a hash of head tables within the font data. r=jfkthame https://hg.mozilla.org/integration/autoland/rev/ff1782eb34f0 part 2: Remove now unused SFNT* files and functions. r=jfkthame

Narcis Beleuzu [:NarcisB]

Comment 28

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/81f749c7b1ad
https://hg.mozilla.org/mozilla-central/rev/ff1782eb34f0

Status: ASSIGNED → RESOLVED

Closed: 5 years ago

status-firefox79: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla79

Julien Cristau [:jcristau]

Comment 29

•

5 years ago

Remove canvas is disabled for late beta and release.

status-firefox78: affected → disabled

Daniel Cicas [:dcicas], Release QA

Updated

•

5 years ago

status-firefox-esr78: --- → disabled

Updated

•

5 years ago

Flags: qe-verify+

Oana Botisan, Desktop QA

Comment 30

•

5 years ago

I managed to reproduce the issue using an older version of Nightly from 2020-06-09 on Windows 10 x64. I verified the fix using latest Nightly 80.a1 and Firefox 79.0b2 on Windows 10x64. The issue is not reproducing anymore.

Status: RESOLVED → VERIFIED

status-firefox79: fixed → verified

status-firefox80: --- → verified

Comment 32

•

5 years ago

Given that this also affects print output (see dup'd bug 1655350), I believe Firefox 78 is in fact affected even though remote canvas is disabled there. So we should probably consider uplifting to 78-esr.

Bob, would you agree? If so please nominate appropriately.

status-firefox78: disabled → affected

status-firefox-esr78: disabled → affected

Flags: needinfo?(bobowencode)

Summary: Text gets scrambled in remote Direct2D canvas → Text gets scrambled in remote Direct2D canvas and in printed/pdf output

Assignee

Comment 33

•

5 years ago

Comment on attachment 9158102 [details]
Bug 1644575: Change SFNTData::GetUniqueKey to use a hash of head tables within the font data. r=jfkthame!

ESR Uplift Approval Request

If this is not a sec:{high,crit} bug, please state case for ESR consideration: This code is also used in printing and webrender, so could cause similar issues where the incorrect font is used.
Given that we're quite early in the ESR lifecycle it seems like something we would want fixed.
User impact if declined: When multiple fonts are used on a page, there are cases where Moz2D recording mistakes a font for an earlier font and uses the earlier one in error.
This could affect printing and certain images when using webrender.
Fix Landed on Version: 79
Risk to taking this patch: Low
Why is the change risky/not risky? (and alternatives if risky): Relatively straightforward patch, which just changes the hash we use to uniquely identify the font to something more reliable.
The second patch is a bit bigger, but is just deleting code made obsolete by the first one.
The patches apply cleanly to ESR78.
String or UUID changes made by this patch: None

Flags: needinfo?(bobowencode)

Attachment #9158102 - Flags: approval-mozilla-esr78?

Assignee

Updated

•

5 years ago

Attachment #9158132 - Flags: approval-mozilla-esr78?

Updated

•

5 years ago

status-firefox78: affected → wontfix

tracking-firefox-esr78: --- → 80+

Comment 34

•

5 years ago

Comment on attachment 9158102 [details]
Bug 1644575: Change SFNTData::GetUniqueKey to use a hash of head tables within the font data. r=jfkthame!

Fixes garbled text when printing. Approved for 78.2esr.

Attachment #9158102 - Flags: approval-mozilla-esr78? → approval-mozilla-esr78+

Updated

•

5 years ago

Attachment #9158132 - Flags: approval-mozilla-esr78? → approval-mozilla-esr78+