Wrong values shown in PDF viewer
Categories
(Firefox :: PDF Viewer, defect, P1)
Tracking
()
People
(Reporter: filippoluchini, Assigned: bobowen)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: regression)
Attachments
(2 files, 1 obsolete file)
235.17 KB,
application/x-zip-compressed
|
Details | |
48 bytes,
text/x-phabricator-request
|
diannaS
:
approval-mozilla-beta+
RyanVM
:
approval-mozilla-esr102+
|
Details | Review |
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/111.0
Steps to reproduce:
Unzip the attached file, that contains two valid PDF/A-3B files and a png.
Open CASO_A1.pdf in firefox using its PDF viewer. Then open CASO_B1.pdf in another tab of firefox using its PDF viewer.
Actual results:
CASO_A1.pdf shows correct data. CASO_B1.pdf shows uncorrect data (it seems to change values of bold numeric characters, eg exams results and dates), though values are right when inspecting the page (see inspect.png).
When opening CASO_B1.pdf first and CASO_A1.pdf then, CASO_B1.pdf shows correct data and CASO_A1.pdf shows uncorrect data.
Expected results:
Both CASO_A1.pdf and CASO_B1.pdf show correct data.
Comment 1•3 years ago
|
||
The Bugbug bot thinks this bug should belong to the 'Firefox::PDF Viewer' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Comment 2•3 years ago
|
||
I can reproduce the issue on Windows but not on mac.
I ran mozregression and I found the following pushlog:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=17e3c9ff2cad9711445a6c39df85925c8f347336&tochange=afa247928245d02b1305a1850eb39f19b84a2f35
and this patch looked like a good potential culprit:
Bob Owen — Bug 1547286: Enable remote Canvas 2D in Windows Nightly builds. r=jrmuizel
hence I set gfx.canvas.remote: false
and the bug disappeared.
:jrmuizel, would you have any idea on what could be wrong ?
Comment 3•3 years ago
|
||
moving NI to Bob
Assignee | ||
Comment 4•3 years ago
|
||
That is very weird.
Seems like if B is opened first bold numbers are translated like this:
0123456789
0738924561
If A is opened first it's the other way around.
Comment 5•3 years ago
|
||
I tested those PDFs on a fresh Windows 11 install (VirtualBox) with Firefox 111 and bold numbers show up right.
On my Windows 10 laptop the issue can be reproduced.
Assignee | ||
Comment 6•3 years ago
|
||
OK my guess at the moment is that they are both using embedded fonts.
The key that we create to cache the font is the same, but for some reason the number glyphs are in different orders and so get translated.
Comment 7•3 years ago
|
||
The two pdfs contains two different fonts which are the two subsets of an identical font.
Each font is extracted from the pdf and then loaded in using a FontFace
object.
For those pdfs the loaded fonts have the same name: g_d0_f2
but their content are different.
How is the key in the cache computed ?
Assignee | ||
Comment 8•3 years ago
|
||
(In reply to Calixte Denizet (:calixte) from comment #7)
The two pdfs contains two different fonts which are the two subsets of an identical font.
Each font is extracted from the pdf and then loaded in using aFontFace
object.
For those pdfs the loaded fonts have the same name:g_d0_f2
but their content are different.
How is the key in the cache computed ?
It's a hash of the head table bytes.
Assignee | ||
Comment 9•3 years ago
|
||
jfkthame: hashing the head tables for the font cache key isn't working for these PDFs, any ideas for an improvement to fix this?
Comment 10•3 years ago
|
||
Presumably hashing the entire font would substantially reduce the risk of a collision, but that could be significantly more expensive.
Maybe include the URL of the document to which the font belongs as part of the hash? That should avoid cross-document confusion.
Assignee | ||
Comment 11•3 years ago
|
||
(In reply to Jonathan Kew [:jfkthame] from comment #10)
Presumably hashing the entire font would substantially reduce the risk of a collision, but that could be significantly more expensive.
Maybe include the URL of the document to which the font belongs as part of the hash? That should avoid cross-document confusion.
This isn't a hash collision the head table bytes and font data sizes are identical.
I'm not sure how easily we can get the URL.
Perhaps we could hash the entire font if is it under a certain size, to try and allow for these subsets.
Comment 12•3 years ago
|
||
Yes - sorry, I didn't mean we're seeing a "hash collision" in the usual sense; rather, a collision between the two subset fonts because they have the same 'head' table and the same length. I haven't checked, but my guess is they even contain the exact same subset of glyphs -- but in a different order, because they get added to the subset in the order in which they occur. So it's expected that the head-table hashes are identical. If we could include the "owning" document's URL in the hash, that would distinguish them.
If that's not feasible, maybe including the 'cmap' table in the hash would help; that should cause a difference if the glyphs are reordered. But I suppose we can't guarantee 100% reliability with anything less than the full font data.
Comment 13•3 years ago
|
||
Based on comment #2, this bug contains a bisection range found by mozregression. However, the Regressed by
field is still not filled.
:bobowen, if possible, could you fill the Regressed by
field and investigate this regression?
For more information, please visit auto_nag documentation.
Comment 14•3 years ago
|
||
What is the cache for?
Assignee | ||
Comment 15•3 years ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #14)
What is the cache for?
So we don't send the font data over multiple times.
I originally added it for printing when we didn't have UnscaledFont
only ScaledFont
.
I was just thinking actually, do we need this since lsalzman (I think) changed this to serialize the UnscaledFont
?
Meaning could we just cache using the underlying pointer like we do for other objects?
Assignee | ||
Comment 16•3 years ago
|
||
I'm going to go with jfkthame's cmap plan (at least initially), it should fix issues like this one and is simple for uplifting.
Assignee | ||
Comment 17•3 years ago
|
||
Assignee | ||
Comment 18•3 years ago
|
||
Assignee | ||
Comment 19•3 years ago
|
||
Original Revision: https://phabricator.services.mozilla.com/D173252
Updated•3 years ago
|
Comment 20•3 years ago
|
||
Assignee | ||
Comment 21•3 years ago
|
||
Comment on attachment 9324361 [details]
Bug 1823401: Hash cmap as well as head tables for SFNTData::GetUniqueKey. r=jfkthame!
Beta/Release Uplift Approval Request
- User impact if declined: If two PDF documents from the same process (domain, including file) have embedded subsets of fonts with the same name and exactly the same characters, but in a different order.
The characters in the subsequently loaded document will get scrambled. - Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: Yes
- If yes, steps to reproduce: See attached testcase and description.
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Fairly simple change that adds the cmap tables from the fonts into the existing hashing for the font data key.
- String changes made/needed: None
- Is Android affected?: No
Assignee | ||
Updated•3 years ago
|
Updated•3 years ago
|
![]() |
||
Comment 22•3 years ago
|
||
bugherder |
Comment 23•3 years ago
|
||
I have reproduced this issue using Firefox 113.0a1 (2023.03.20) on Windows 10 x64.
I can confirm this issue is fixed, I verified using Firefox 113.0a1 latest nightly (2023.03.23) on Windows 10, Ubuntu 22 and on macOS 12.
Tested with Windows 10 using Firefox 111.0.1, still affected.
Will have to verify on Fx 112.0b6 and on Fx 102.10.0esr official builds, once they're available.
Comment 24•3 years ago
|
||
Comment on attachment 9324361 [details]
Bug 1823401: Hash cmap as well as head tables for SFNTData::GetUniqueKey. r=jfkthame!
Approved for 112.0b6
Comment 25•3 years ago
|
||
bugherder uplift |
Reproducible on a 2023-03-20 Nightly build on Windows 10.
Verified as fixed on Firefox 102.0b6(build ID: 20230323181038) on Windows 10, Ubuntu 22, macOS 12.
Comment 27•3 years ago
|
||
Based on comment 23 I changed the tracking flag for Firefox 111 to affected.
Assignee | ||
Comment 28•3 years ago
|
||
Pretty sure all these statuses got set to fixed by mistake early in the bug.
ESR102 is affected as far as I can tell from hg.mozilla.org
Assignee | ||
Comment 29•3 years ago
|
||
Comment on attachment 9324361 [details]
Bug 1823401: Hash cmap as well as head tables for SFNTData::GetUniqueKey. r=jfkthame!
ESR Uplift Approval Request
- If this is not a sec:{high,crit} bug, please state case for ESR consideration: Quite a serious bug that can result in the incorrect information to be displayed on PDFS.
- User impact if declined: If two PDF documents from the same process (domain, including file) have embedded subsets of fonts with the same name and exactly the same characters, but in a different order.
The characters in the subsequently loaded document will get scrambled. - Fix Landed on Version: 113
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Fairly simply change that just adds the cmap font table bytes into an existing hash.
Updated•3 years ago
|
Comment 30•3 years ago
|
||
Comment on attachment 9324361 [details]
Bug 1823401: Hash cmap as well as head tables for SFNTData::GetUniqueKey. r=jfkthame!
Approved for 102.10esr.
Comment 31•3 years ago
|
||
bugherder uplift |
Comment 32•3 years ago
|
||
I'm going to mark this as S1. There may be a workaround (with prefs), but it's not guaranteed that you immediately notice this bug exists, and if you don't, you could have something that can be loosely described as "dataloss". It's definitely a nasty bug.
Updated•3 years ago
|
Updated•3 years ago
|
Reproducible on a 2023-03-20 Nightly build on Windows 10.
Verified as fixed on Firefox-esr 102.10.0(build ID: 20230403141754) on Windows 10, Ubuntu 22, macOS 12.
Description
•