Closed Bug 1823401 Opened 3 years ago Closed 3 years ago

Wrong values shown in PDF viewer

Categories

(Firefox :: PDF Viewer, defect, P1)

Firefox 111
defect

Tracking

()

VERIFIED FIXED
113 Branch
Tracking Status
firefox-esr102 112+ verified
firefox111 --- wontfix
firefox112 + verified
firefox113 + verified

People

(Reporter: filippoluchini, Assigned: bobowen)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: regression)

Attachments

(2 files, 1 obsolete file)

Attached file testcase.zip

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/111.0

Steps to reproduce:

Unzip the attached file, that contains two valid PDF/A-3B files and a png.
Open CASO_A1.pdf in firefox using its PDF viewer. Then open CASO_B1.pdf in another tab of firefox using its PDF viewer.

Actual results:

CASO_A1.pdf shows correct data. CASO_B1.pdf shows uncorrect data (it seems to change values of bold numeric characters, eg exams results and dates), though values are right when inspecting the page (see inspect.png).
When opening CASO_B1.pdf first and CASO_A1.pdf then, CASO_B1.pdf shows correct data and CASO_A1.pdf shows uncorrect data.

Expected results:

Both CASO_A1.pdf and CASO_B1.pdf show correct data.

The Bugbug bot thinks this bug should belong to the 'Firefox::PDF Viewer' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → PDF Viewer

I can reproduce the issue on Windows but not on mac.
I ran mozregression and I found the following pushlog:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=17e3c9ff2cad9711445a6c39df85925c8f347336&tochange=afa247928245d02b1305a1850eb39f19b84a2f35

and this patch looked like a good potential culprit:
Bob Owen — Bug 1547286: Enable remote Canvas 2D in Windows Nightly builds. r=jrmuizel

hence I set gfx.canvas.remote: false and the bug disappeared.

:jrmuizel, would you have any idea on what could be wrong ?

Flags: needinfo?(jmuizelaar)

moving NI to Bob

Flags: needinfo?(jmuizelaar) → needinfo?(bobowencode)

That is very weird.
Seems like if B is opened first bold numbers are translated like this:
0123456789
0738924561

If A is opened first it's the other way around.

Assignee: nobody → bobowencode
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Flags: needinfo?(bobowencode)

I tested those PDFs on a fresh Windows 11 install (VirtualBox) with Firefox 111 and bold numbers show up right.
On my Windows 10 laptop the issue can be reproduced.

OK my guess at the moment is that they are both using embedded fonts.
The key that we create to cache the font is the same, but for some reason the number glyphs are in different orders and so get translated.

The two pdfs contains two different fonts which are the two subsets of an identical font.
Each font is extracted from the pdf and then loaded in using a FontFace object.
For those pdfs the loaded fonts have the same name: g_d0_f2 but their content are different.
How is the key in the cache computed ?

(In reply to Calixte Denizet (:calixte) from comment #7)

The two pdfs contains two different fonts which are the two subsets of an identical font.
Each font is extracted from the pdf and then loaded in using a FontFace object.
For those pdfs the loaded fonts have the same name: g_d0_f2 but their content are different.
How is the key in the cache computed ?

It's a hash of the head table bytes.

jfkthame: hashing the head tables for the font cache key isn't working for these PDFs, any ideas for an improvement to fix this?

Flags: needinfo?(jfkthame)

Presumably hashing the entire font would substantially reduce the risk of a collision, but that could be significantly more expensive.

Maybe include the URL of the document to which the font belongs as part of the hash? That should avoid cross-document confusion.

Flags: needinfo?(jfkthame)

(In reply to Jonathan Kew [:jfkthame] from comment #10)

Presumably hashing the entire font would substantially reduce the risk of a collision, but that could be significantly more expensive.

Maybe include the URL of the document to which the font belongs as part of the hash? That should avoid cross-document confusion.

This isn't a hash collision the head table bytes and font data sizes are identical.
I'm not sure how easily we can get the URL.
Perhaps we could hash the entire font if is it under a certain size, to try and allow for these subsets.

Yes - sorry, I didn't mean we're seeing a "hash collision" in the usual sense; rather, a collision between the two subset fonts because they have the same 'head' table and the same length. I haven't checked, but my guess is they even contain the exact same subset of glyphs -- but in a different order, because they get added to the subset in the order in which they occur. So it's expected that the head-table hashes are identical. If we could include the "owning" document's URL in the hash, that would distinguish them.

If that's not feasible, maybe including the 'cmap' table in the hash would help; that should cause a difference if the glyphs are reordered. But I suppose we can't guarantee 100% reliability with anything less than the full font data.

Based on comment #2, this bug contains a bisection range found by mozregression. However, the Regressed by field is still not filled.

:bobowen, if possible, could you fill the Regressed by field and investigate this regression?

For more information, please visit auto_nag documentation.

Flags: needinfo?(bobowencode)
Keywords: regression

What is the cache for?

(In reply to Jeff Muizelaar [:jrmuizel] from comment #14)

What is the cache for?

So we don't send the font data over multiple times.
I originally added it for printing when we didn't have UnscaledFont only ScaledFont.
I was just thinking actually, do we need this since lsalzman (I think) changed this to serialize the UnscaledFont?
Meaning could we just cache using the underlying pointer like we do for other objects?

Flags: needinfo?(bobowencode) → needinfo?(jmuizelaar)
Regressed by: 1547286

I'm going to go with jfkthame's cmap plan (at least initially), it should fix issues like this one and is simple for uplifting.

Attachment #9324447 - Attachment is obsolete: true
Pushed by bobowencode@gmail.com: https://hg.mozilla.org/integration/autoland/rev/f60561f5faf4 Hash cmap as well as head tables for SFNTData::GetUniqueKey. r=jfkthame

Comment on attachment 9324361 [details]
Bug 1823401: Hash cmap as well as head tables for SFNTData::GetUniqueKey. r=jfkthame!

Beta/Release Uplift Approval Request

  • User impact if declined: If two PDF documents from the same process (domain, including file) have embedded subsets of fonts with the same name and exactly the same characters, but in a different order.
    The characters in the subsequently loaded document will get scrambled.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: See attached testcase and description.
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Fairly simple change that adds the cmap tables from the fonts into the existing hashing for the font data key.
  • String changes made/needed: None
  • Is Android affected?: No
Attachment #9324361 - Flags: approval-mozilla-beta?
Flags: qe-verify+
QA Whiteboard: [qa-triaged]
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 113 Branch

I have reproduced this issue using Firefox 113.0a1 (2023.03.20) on Windows 10 x64.
I can confirm this issue is fixed, I verified using Firefox 113.0a1 latest nightly (2023.03.23) on Windows 10, Ubuntu 22 and on macOS 12.
Tested with Windows 10 using Firefox 111.0.1, still affected.
Will have to verify on Fx 112.0b6 and on Fx 102.10.0esr official builds, once they're available.

Comment on attachment 9324361 [details]
Bug 1823401: Hash cmap as well as head tables for SFNTData::GetUniqueKey. r=jfkthame!

Approved for 112.0b6

Attachment #9324361 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Reproducible on a 2023-03-20 Nightly build on Windows 10.
Verified as fixed on Firefox 102.0b6(build ID: 20230323181038) on Windows 10, Ubuntu 22, macOS 12.

Based on comment 23 I changed the tracking flag for Firefox 111 to affected.

Pretty sure all these statuses got set to fixed by mistake early in the bug.
ESR102 is affected as far as I can tell from hg.mozilla.org

Comment on attachment 9324361 [details]
Bug 1823401: Hash cmap as well as head tables for SFNTData::GetUniqueKey. r=jfkthame!

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: Quite a serious bug that can result in the incorrect information to be displayed on PDFS.
  • User impact if declined: If two PDF documents from the same process (domain, including file) have embedded subsets of fonts with the same name and exactly the same characters, but in a different order.
    The characters in the subsequently loaded document will get scrambled.
  • Fix Landed on Version: 113
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Fairly simply change that just adds the cmap font table bytes into an existing hash.
Attachment #9324361 - Flags: approval-mozilla-esr102?
Flags: needinfo?(jmuizelaar)

Comment on attachment 9324361 [details]
Bug 1823401: Hash cmap as well as head tables for SFNTData::GetUniqueKey. r=jfkthame!

Approved for 102.10esr.

Attachment #9324361 - Flags: approval-mozilla-esr102? → approval-mozilla-esr102+

I'm going to mark this as S1. There may be a workaround (with prefs), but it's not guaranteed that you immediately notice this bug exists, and if you don't, you could have something that can be loosely described as "dataloss". It's definitely a nasty bug.

Severity: -- → S1
Priority: -- → P1

Reproducible on a 2023-03-20 Nightly build on Windows 10.
Verified as fixed on Firefox-esr 102.10.0(build ID: 20230403141754) on Windows 10, Ubuntu 22, macOS 12.

Status: RESOLVED → VERIFIED
Flags: qe-verify+
Blocks: 1829130
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: