Text layer and canvas don't always match in the pdf viewer on Windows
Categories
(Firefox :: PDF Viewer, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr128 | --- | unaffected |
firefox131 | --- | unaffected |
firefox132 | --- | unaffected |
firefox133 | --- | fixed |
People
(Reporter: calixte, Assigned: jfkthame)
References
(Regression)
Details
(Keywords: regression)
Attachments
(3 files)
Today the pdf.js windows tests shew some failures (see [1]) because of some mismatches between the text layer and the canvas where a page is rendered.
This text layer, on top of the canvas, is here for the text selection and for highlighting search results, so it's really a bad bug.
In using mozregression on the bot, I found that:
https://bugzilla.mozilla.org/show_bug.cgi?id=1919906
caused this regression.
The bot is running on Windows Server 2016 Datacenter
.
The text layer is composed of spans and they're rescaled in order to fit the corresponding text bounding box. In order to compute the scale factor, the text is measured thanks to canvas::measureText:
https://github.com/mozilla/pdf.js/blob/3cdc3258d94a06af658e50290bf709426adf44b8/src/display/text_layer.js#L403-L420
[1] http://54.193.163.58:8877/5e6e7c3a9150063/reftest-analyzer.html#web=eq.log
Comment 1•2 months ago
|
||
Set release status flags based on info from the regressing bug 1919906
Reporter | ||
Comment 2•2 months ago
|
||
Finally I can reproduce the issue locally on Windows 11: it depends on the zoom level.
It's reproducible in selecting some text https://github.com/mozilla/pdf.js/blob/master/test/pdfs/tracemonkey.pdf and with a zoom level set at 70% (my devicePixelRatio is 1.5).
Would it be possible to enforce allowForceGDIClassic
set to true
in the pdf.js case ?
Assignee | ||
Comment 3•2 months ago
|
||
(In reply to Calixte Denizet (:calixte) from comment #2)
Finally I can reproduce the issue locally on Windows 11: it depends on the zoom level.
It's reproducible in selecting some text https://github.com/mozilla/pdf.js/blob/master/test/pdfs/tracemonkey.pdf and with a zoom level set at 70% (my devicePixelRatio is 1.5).
That's as expected, because the force-gdi pref has a size threshold, so at larger (device-pixel) sizes it has never been involved.
Would it be possible to enforce
allowForceGDIClassic
set totrue
in the pdf.js case ?
Yeah, I guess we could do that; we have code to check whether we're dealing with a pdf.js document. This whole thing is kinda hackish anyhow, but we can have the canvas context check whether it's being used for pdf.js, and if so, continue to allow the force-gdi setting. That should restore the previous behavior; and pdf.js doesn't care about the potential canvas2d measurement vs rendering-time glyph spacing mismatch (as seen in bug 1919906) because it places each glyph individually anyhow.
Assignee | ||
Comment 4•2 months ago
|
||
Updated•2 months ago
|
Comment 5•2 months ago
|
||
(In reply to Jonathan Kew [:jfkthame] from comment #3)
Would it be possible to enforce
allowForceGDIClassic
set totrue
in the pdf.js case ?Yeah, I guess we could do that; we have code to check whether we're dealing with a pdf.js document.
It seems to me that that won't actually help when PDF.js is used "outside" of the Firefox PDF Viewer though (which applies even to the PDF.js test-suite that we use at GitHub)?
Comment 6•2 months ago
|
||
Yes, there are many websites, prominent ones too, using pdf.js as a library.
Assignee | ||
Comment 7•2 months ago
|
||
Hmm -- do we have any way to detect within Gecko when it's being used like that?
Comment 8•2 months ago
|
||
Could we disable GDI Classic on transparent HTML text (like the text layers in pdf.js)?
Assignee | ||
Comment 9•2 months ago
|
||
(In reply to Gregory Pappas [:gregp] from comment #8)
Could we disable GDI Classic on transparent HTML text (like the text layers in pdf.js)?
That would break the (reasonable) assumption that simply modifying color does not affect layout.
Assignee | ||
Comment 10•2 months ago
|
||
(In reply to Jonathan Kew [:jfkthame] from comment #7)
Hmm -- do we have any way to detect within Gecko when it's being used like that?
And if not, what other options do we have? A few possibilities come to mind:
(a) We could "fix" bug 1919906 in the opposite way: instead of preventing force-gdi-classic being applied to canvas text, we could allow it to take effect for both measurement and rendering. (The issue in bug 1919906 was that it was being applied during measurement, but not during rendering, resulting in glyph spacing that doesn't work for the actual rendered glyphs.)
The risk here is that we may regress the Google Docs issues that originally caused us to disable forced-gdi-classic glyph rendering in canvas (bug 1730772).
(b) Eventually, we'd like to completely remove the force-gdi-classic "hack" for certain legacy Windows fonts; see bug 1920733. That should resolve the problem here, because then neither the canvas measureText operation nor the HTML text rendering will be forcing a rendering mode, so they should both behave the same.
The risk here is that we're not sure how quickly we'll be able to deploy such a removal; the change to the rendering of certain (common) legacy fonts may generate a significant backlash.
(c) Here's an idea: I see that the (invisible) text layer in pdf.js is using font-family: sans-serif
. That explains why it is affected by the bug 1730772 change: sans-serif
on Windows resolves to Arial, and Arial is one of the "classic" Windows fonts listed in gfx.font_rendering.cleartype_params.force_gdi_classic_for_families
, so that's why its metrics are affected by the change, and there is now a canvas vs HTML text mismatch. Maybe pdf.js could prefer a different Windows font for the text layer (with sans-serif
remaining as fallback, or for other platforms). If it were to use, for instance, font-family: Calibri, sans-serif
, then any modern Windows system should use Calibri, which isn't affected by the force-gdi-classic setting, and so the metrics and scaling computation ought to work as intended.
Calixte, WDYT about option (c) here -- would that be a reasonable workaround, at least until such time as we're able to drop the force-gdi-classic behavior altogether? Given that the text layer is invisible, it shouldn't really matter exactly which font is used, right?
Comment 11•2 months ago
•
|
||
(In reply to Jonathan Kew [:jfkthame] from comment #10)
(c) Here's an idea: I see that the (invisible) text layer in pdf.js is using
font-family: sans-serif
.
Please note that it's just one of the possible font-families that can be used in the textLayer: https://github.com/mozilla/pdf.js/blob/567df4214983a12da9f3d02a2ab1ea646f0dfcf1/src/core/fonts.js#L996-L1002
Assignee | ||
Comment 12•2 months ago
|
||
(In reply to Jonas Jenwald [:Snuffleupagus] from comment #11)
(In reply to Jonathan Kew [:jfkthame] from comment #10)
(c) Here's an idea: I see that the (invisible) text layer in pdf.js is using
font-family: sans-serif
.Please note that it's just one of the possible font-families that can be used in the textLayer: https://github.com/mozilla/pdf.js/blob/567df4214983a12da9f3d02a2ab1ea646f0dfcf1/src/core/fonts.js#L996-L1002
There shouldn't be an issue with serif
, as Times isn't affected by the gdi-classic pref. Presumably monospace
could be an issue, though, as both Courier New and Consolas are listed in the pref.
Reporter | ||
Comment 13•2 months ago
|
||
Calixte, WDYT about option (c) here -- would that be a reasonable workaround, at least until such time as we're able to drop the force-gdi-classic behavior altogether? Given that the text layer is invisible, it shouldn't really matter exactly which font is used, right?
What could be a replacement for monospace
?
Do we know how many users have a Windows version without Calibri ?
Assignee | ||
Comment 14•2 months ago
|
||
(In reply to Calixte Denizet (:calixte) from comment #13)
Calixte, WDYT about option (c) here -- would that be a reasonable workaround, at least until such time as we're able to drop the force-gdi-classic behavior altogether? Given that the text layer is invisible, it shouldn't really matter exactly which font is used, right?
What could be a replacement for
monospace
?
"Lucida Console" should be OK.
Do we know how many users have a Windows version without Calibri ?
Virtually none, I suspect; I believe Calibri has been a standard font since Windows 7 (or Vista?).
Comment 15•2 months ago
|
||
Comment 16•2 months ago
|
||
Note if we work around this in pdf.js, we'd still see the bug in external users until they upgrade to the version with the workaround (and they're are often slow in upgrading).
Reporter | ||
Comment 17•2 months ago
|
||
We don't have any exact idea about how many people are using pdf.js in an other context that the built-in viewer, but it's probably a lot.
Reporter | ||
Comment 18•2 months ago
•
|
||
Maybe it's a stupid idea, but on Windows, when the font is sans-serif
, why not choosing Calibri
before Arial
? Are we "obliged" to select Arial
?
I mean in general and not only in pdf.js.
Assignee | ||
Comment 19•2 months ago
|
||
(In reply to Calixte Denizet (:calixte) from comment #18)
Maybe it's a stupid idea, but on Windows, when the font is
sans-serif
, why not choosingCalibri
beforeArial
? Are we "obliged" to selectArial
?
I mean in general and not only in pdf.js.
I don't think we're obliged to use Arial, it's just that it has been the default for a very long time, and changing a default like that will no doubt cause a certain amount of disruption. Also I think Edge and Chrome still use Arial (I'm not on Windows right now to check...), in which case changing our sans-serif
to a different font might add some minor interop pain.
(Personally, I like Calibri better than Arial for general use. Still, I'd want to tread carefully if we consider changing such a fundamental default.)
Reporter | ||
Comment 20•2 months ago
|
||
On Windows 11, Chrome and Edge are using Arial when the font is sans-serif
.
Updated•2 months ago
|
Description
•