pdf.js doesn't display and print some characters (e.g. German umlauts)

VERIFIED FIXED in Firefox 20

Status


--
major
VERIFIED FIXED
6 years ago
4 years ago

People

(Reporter: whimboo, Assigned: bdahl)

Tracking

({reproducible})

Trunk
Firefox 21
reproducible
Points:
---

Firefox Tracking Flags

(firefox19- affected, firefox20- verified, firefox21- verified)

Details

(Whiteboard: [pdfjs-c-rendering][pdfjs-f-fixed-upstream], URL)

Attachments

(2 attachments)

(Reporter)

Description

6 years ago
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:15.0) Gecko/15.0 Firefox/15.0a1

Opening the following PDF doesn't display any German umlauts:
https://dok.dkb.de/pdf/aenderung_l_sepa.pdf
File generated by 'iText 5.0.1'.
OS: Mac OS X → All
Hardware: x86 → All
Version: 12 Branch → Trunk
(Assignee)

Updated

6 years ago
Whiteboard: [pdfjs-c-rendering][pdfjs-d-font-conversion]

Comment 2

6 years ago
The same happens with http://www.intraprenör.de/start/assets/expose.pdf and http://www.intraprenör.de/start/assets/expose-old.pdf. In addition, on page 2 of the second PDF, for example, the opening quotes are not displayed either. Mac Preview displays both PDFs correctly.

(Another thing: e. g. the lettering "intraprenör" on every page of the second PDF is rather edge-y and does not have a smooth outline. Again, Mac Preview displays this well.)

A warning is displayed above the reader that indicates the PDF may look broken, which is correct and good, but I'd rather see this fixed *before* releasing this feature always-on!
(Reporter)

Comment 3

6 years ago
This issue seriously affects the readability and printing of PDF files for German-speaking users. I'm not sure what we can block on, but I'm requesting it for Firefox 19; release drivers can make the decision. Personally, I don't think it's good to promote a new feature that is broken in a country with one of the highest percentages of Firefox users. Other European countries may be affected too.
Severity: normal → major
tracking-firefox19: --- → ?
tracking-firefox20: --- → ?
tracking-firefox21: --- → ?
Summary: pdf.js doesn't display German umlauts → pdf.js doesn't display and print German umlauts

Comment 4

6 years ago
Seconding: Remember that this affects millions of users, in at least the DACH region. Plus, these issues may point to an underlying problem with non-ASCII (or non-English) characters which will *affect maybe >90% of Firefox users*!
(Reporter)

Comment 5

6 years ago
Created attachment 704807 [details]
working example

It may depend on how the PDF document has been created. I printed the extreme UTF-8 table as PDF on OS X and umlauts are displayed correctly.

Comment 6

6 years ago
It may depend on the fonts used, too (i. e. custom fonts vs. built-in fonts, and certain specifics of some fonts).

I'm already trying to get technical infos about the documents I mentioned. I will update this bug if I learn something new. I /guess/ the software used for creating these PDFs (the software the person who gave me the PDFs usually uses) was either an Adobe product (e. g. Illustrator) or something like QuarkXPress (unlikely), quite possibly on OS X.

Comment 7

6 years ago
OK, at least some info: The used fonts should be (for the most part) PT SANS and LOBSTER TWO (I think the latter one is used for "intraprenör", i. e. the title). Software used was InDesign CS6 (should be InDesign version 8).

Comment 8

6 years ago
(In reply to Florian Bender from comment #2)
> http://www.intraprenör.de/start/assets/expose-old.pdf.

Interestingly, on page 3, first paragraph, a single Umlaut is displayed correctly, it reads "Universit t der Künste Berlin".

Comment 9

6 years ago
We have a few options here:

1) Determine that this isn't impacting a large portion of DACH PDFs or isn't user critical for some other reason, and leave this unfixed
2) Resolve in time for Beta 4, going to build 1/29
3) Disable PDF.js for DACH to start

Brendan - can you take a look?

Updated

6 years ago
Assignee: nobody → bdahl
tracking-firefox19: ? → +
tracking-firefox20: ? → +
tracking-firefox21: ? → +

Updated

6 years ago
Keywords: reproducible

Comment 10

6 years ago
I'm seeing this with quotes in the aforementioned doc, too. Notably, '„' doesn't display right.

A suspicion without evidence: the content is in deflated streams in those documents, so maybe there's an encoding mismatch between what pdf.js assumes and what's encoded in the stream?

If so, it may affect any script that isn't pure ASCII.
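
To make the guess above concrete, here is a minimal Python sketch of how a mismatch between the assumed and the actual encoding could drop exactly these characters. Everything in it is an assumption for illustration: the sample string, the choice of Windows-1252 as the producer's encoding, and the ASCII-only reader. It is not taken from pdf.js or from any of the PDFs in this bug, and it does not confirm the cause.

```python
import zlib

# Hypothetical text; not taken from any PDF mentioned in this bug.
original = "Änderung für Künste „Zitat“"

# Assumption: the producer encoded the text as Windows-1252, then deflated it.
compressed = zlib.compress(original.encode("cp1252"))

# Inflating recovers the exact bytes, so the compression step is lossless.
raw = zlib.decompress(compressed)

# A reader that assumes pure ASCII silently loses every byte above 0x7F.
as_ascii = raw.decode("ascii", errors="ignore")

# A reader that assumes the matching encoding recovers the full text.
as_cp1252 = raw.decode("cp1252")

print(as_ascii)
print(as_cp1252)
```

Under these assumptions the deflate step is harmless; only the decoding step after it decides whether umlauts and quotes survive.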
(Assignee)

Comment 11

6 years ago
(In reply to Alex Keybl [:akeybl] from comment #9)
> We have a few options here:
> 
> 1) Determine that this isn't impacting a large portion of DACH PDFs or isn't
> user critical for some other reason, and leave this unfixed
> 2) Resolve in time for Beta 4, going to build 1/29
> 3) Disable PDF.js for DACH to start
> 
> Brendan - can you take a look?

We have a fix for this, but unfortunately the fix touches quite a bit of pdf.js code and we don't feel comfortable sending it directly into beta. I'm still unsure how widespread this problem is, since it doesn't affect all DACH documents, only ones with specific types of fonts and encodings.

Could we maybe get someone in Germany to come up with a list of some important PDFs, such as government PDFs, popular bank PDFs, and any other PDFs of note? If this is too late, I'm fine with disabling for DACH.

Comment 12

6 years ago
(In reply to Brendan Dahl from comment #11)
> (In reply to Alex Keybl [:akeybl] from comment #9)
> > We have a few options here:
> > 
> > 1) Determine that this isn't impacting a large portion of DACH PDFs or isn't
> > user critical for some other reason, and leave this unfixed
> > 2) Resolve in time for Beta 4, going to build 1/29
> > 3) Disable PDF.js for DACH to start
> > 
> > Brendan - can you take a look?
> 
> We have a fix for this, but unfortunately the fix touches quite a bit of
> pdf.js code and we don't feel comfortable with sending this directly into
> beta.  I'm still unsure how widespread this problem is, since this doesn't
> affect all DACH documents, only ones with specific types of fonts and
> encodings.
> 
> Could we maybe get someone in Germany to come up with a list of some
> important PDFs, such as government PDFs, popular bank PDFs, and any other
> PDFs of note?  If this is too late, I'm fine with disabling for DACH.

Let's prepare the disable and get it into beta 4 (needs to land on mozilla-beta before the end of today). If we find a way to more scientifically determine that #1 holds true, we can back it out.
(Reporter)

Comment 13

6 years ago
I think that I can trigger a test activity in our German forums for that topic. Not sure how many would participate but most of them are really active. So once it is in the code base and shipped please let me know.

Comment 14

6 years ago
(In reply to Henrik Skupin (:whimboo) from comment #13)
> I think that I can trigger a test activity in our German forums for that
> topic. Not sure how many would participate but most of them are really
> active. So once it is in the code base and shipped please let me know.

PDF.js is already in FF19 on beta. Would you mind sending a note asking whether the way these characters are being displayed is confusing German users?

Comment 15

6 years ago
I somehow suspect that German umlauts aren't the only characters affected by this. I'd assume that some other locales with more non-English characters may be affected even worse, so IMHO we need testing on a larger collection of documents across languages to make sure of that.

I also think that shipping with different feature sets on different locales is bad for PR as people will miss heavily announced features in their Firefox.

Comment 16

6 years ago
Spoke with bdahl

"akeybl: for bug 761539
akeybl: do we have an option 4? I remember a string/infobar that said something like
"You can open this PDF in an external viewer if it's not loading properly"
bdahl: we have that bar, i'm asking yury if we can detect this issue and trigger it
bdahl: that bar is actually shown already on umlaut pdf"

given that, I'm of the opinion that nothing needs to be done here for FF19.

Updated

6 years ago
tracking-firefox19: + → -
tracking-firefox20: + → -
tracking-firefox21: + → -

Comment 17

6 years ago
I'll look into some more PDFs, but it will take a couple of days. 

(In reply to Alex Keybl [:akeybl] from comment #14)
> the way these characters are being displayed is confusing German users?
Well, the characters are simply missing. 

Please remember that the PDFs I mentioned include at least one other confirmed problem: The opening quote character is (sometimes) missing, too. Thus, this problem is probably more widespread than "just" missing Umlauts.

Additionally, the PDFs I mentioned were created by Adobe InDesign, one of the industry-leading programs for creating PDFs. This has a bigger impact than if the "offending" program was a seldom used one. 


When will the fix be shipped? If Beta (19) is not feasible, let's try to bump it up to Aurora.
(Reporter)

Updated

6 years ago
Summary: pdf.js doesn't display and print German umlauts → pdf.js doesn't display and print some characters (e.g. German umlauts)

Comment 18

6 years ago
(In reply to Florian Bender from comment #17)
> When will the fix be shipped? 

I don't think a fix has even been published for review yet (bdahl, is it out somewhere already?) and it's not even in Nightly at this time, so pushing it to any other channel can't be decided yet, this needs some testing on Nightly first in any case - and then, it all depends on how risky a patch is.
From what comment #11 says and given that we're already pretty late on Beta 19 (b4 is being created right now), there's probably no chance at all for taking it there, which is the real concern at this time as it's where we plan to ship PDF.js on by default the first time.
We have more time for 20, which is on Aurora now, but a risk assessment will be needed there as well.

Comment 19

6 years ago
Sorry, I thought it had already landed. Can you mark the related bug as a dependency, or make this bug the "fixing" bug (if this makes sense at all)?
(Assignee)

Comment 20

6 years ago
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #18)
> (In reply to Florian Bender from comment #17)
> > When will the fix be shipped? 
> 
> I don't think a fix has even been published for review yet (bdahl, is it out
> somewhere already?) and it's not even in Nightly at this time, so pushing it
> to any other channel can't be decided yet, this needs some testing on
> Nightly first in any case - and then, it all depends on how risky a patch is.
> From what comment #11 says and given that we're already pretty late on Beta
> 19 (b4 is being created right now), there's probably no chance at all for
> taking it there, which is the real concern at this time as it's where we
> plan to ship PDF.js on by default the first time.
> We have more time for 20, which is on Aurora now, but a risk assessment will
> be needed there as well.

It hasn't landed yet, but it should be fixed today in the pdf.js repo. The work is happening in https://github.com/mozilla/pdf.js/pull/2606

Updated

6 years ago
Whiteboard: [pdfjs-c-rendering][pdfjs-d-font-conversion] → [pdfjs-c-rendering][pdfjs-f-fixed-upstream] https://github.com/mozilla/pdf.js/pull/2606
(Reporter)

Updated

6 years ago
Whiteboard: [pdfjs-c-rendering][pdfjs-f-fixed-upstream] https://github.com/mozilla/pdf.js/pull/2606 → [pdfjs-c-rendering][pdfjs-f-fixed-upstream]
(Reporter)

Comment 21

6 years ago
Boarding passes for Lufthansa have the same problem and we do not show the notification to open the PDF in another viewer.

Comment 22

6 years ago
I checked PDFs issued by various (German) governmental institutions, they all seemed fine (the infobar is displayed nonetheless). I did not check bank / insurance / large companies (but see Comment 21 and bank document at https://dok.dkb.de/pdf/aenderung_l_sepa.pdf). 


Two options IMO: 
A) Ship the PDF viewer as-is, PR should clearly state that there may sometimes be issues with certain special characters which will be fixed in the upcoming (!) release, show the infobar more aggressively (e. g. see Comment 21). Get the fix landed and uplift to Aurora ASAP (if Aurora 20 is not feasible, try it in Aurora 21 first and uplift to Beta 20 in time). 

B) Do not ship, PR should state that PDF viewer had issues with special characters and will finally be included in Firefox 20. Land fix and uplift to Aurora ASAP (if Aurora 20 is not feasible, try it in Aurora 21 first and uplift to Beta 20 in time).


I think it's crucial to issue a PR indicating issues with PDF viewer for Release 19 in either case! This feature has been announced widely, and postponing it yet again or shipping a flawed viewer without a notice in the Release 19(!) PR will lead to bad press and user disappointment. On the contrary, with a proper PR statement you may excite current and new Fx users for the big new feature coming up in Fx 20!

Comment 23

6 years ago
From what I see here and elsewhere currently, IMHO we should 1) just ship it in 19 but 2) show the infobar more aggressively (maybe even always) and 3) include comments in the release notes that some PDF documents may not be displayed correctly but people can use an external reader for those, and 4) have a SUMO article ready for this. Oh, and 5) try to get the fix ASAP so we can discuss uplift to 20.

Comment 24

6 years ago
Duplicate of this bug: 828256

Comment 25

6 years ago
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #23)
> From what I see here and elsewhere currently, IMHO we should 1) just ship it
> in 19 but 2) show the infobar more aggressively (maybe even always) and 3)
> include comments in the release notes that some PDF documents may not be
> displayed correctly but people can use an external reader for those, and 4)
> have a SUMO article ready for this. Oh, and 5) try to get the fix ASAP so we
> can discuss uplift to 20.

I agree with all of these initiatives but would add one:
6) Be prepared to hotfix disable PDF.js in Firefox 19 if negative feedback is explosive.

Comment 26

6 years ago
As Bug 738952 and Bug 806057 (and this one) show, PDF.js is not ready for prime time yet. Disabling it with a hotfix if there is negative feedback, is IMO a flawed solution. Just do it well, or not at all. I'd suggest pref'ing it off for release, work out the last kinks (two bugs are already fixed upstream), and release a robust version of PDF.js with Fx 20. Otherwise this will be a PR disaster (maybe not any of those bugs on their own but certainly all of them combined!). Please don't take that risk!
(Assignee)

Comment 27

6 years ago
Created attachment 713185 [details] [diff] [review]
cmap fix v1

[Approval Request Comment]
Bug caused by (feature/regressing bug #): n/a
User impact if declined: Certain letters with specific fonts will not display correctly in PDFs
Testing completed (on m-c, etc.): already in m-c
Risk to taking this patch (and alternatives if risky): low - we've had this upstream fix in for over a week and have found no new regressions yet 
String or UUID changes made by this patch: none

Try Builds for testers:
https://tbpl.mozilla.org/?tree=Try&rev=862f1f8820a6
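
For readers unfamiliar with character maps, the following is a rough Python sketch of the kind of failure the user-impact line describes. It assumes, for illustration only, that a character goes missing when its code has no entry in the font's character map. The helper names `build_cmap` and `render` and the glyph ids are hypothetical, real cmap tables are binary structures, and this is not the logic of the attached patch.

```python
import string


def build_cmap(chars):
    """Map each character code to a made-up glyph id (1-based)."""
    return {ord(ch): glyph_id for glyph_id, ch in enumerate(chars, start=1)}


def render(text, cmap):
    """Keep only characters whose code has a glyph; all others are dropped."""
    return "".join(ch for ch in text if ord(ch) in cmap)


# Hypothetical maps: one covers ASCII letters only, one also covers umlauts.
ascii_only = build_cmap(string.ascii_letters + " ")
with_umlauts = build_cmap(string.ascii_letters + " äöüÄÖÜß")

text = "Universität der Künste Berlin"
dropped = render(text, ascii_only)
complete = render(text, with_umlauts)

print(dropped)
print(complete)
```

With the incomplete map only the non-ASCII letters vanish while the rest of the text renders, which matches the symptom reported in this bug.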
Attachment #713185 - Flags: review?(ydelendik)
Attachment #713185 - Flags: approval-mozilla-aurora?

Comment 28

6 years ago
Comment on attachment 713185 [details] [diff] [review]
cmap fix v1

Review of attachment 713185 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good
Attachment #713185 - Flags: review?(ydelendik) → review+

Comment 29

6 years ago
Comment on attachment 713185 [details] [diff] [review]
cmap fix v1

Happy to see this'll be fixed in the next release!
Attachment #713185 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
(Assignee)

Updated

6 years ago
Keywords: checkin-needed

Comment 30

6 years ago
https://hg.mozilla.org/releases/mozilla-aurora/rev/507588b3d04c

Fixed on trunk by bug 835954.
Status: NEW → RESOLVED
Last Resolved: 6 years ago
status-firefox20: --- → fixed
status-firefox21: --- → fixed
Depends on: 835954
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 21
(Reporter)

Updated

6 years ago
status-firefox19: --- → affected
(Reporter)

Comment 31

6 years ago
This looks great now with Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:20.0) Gecko/20130219 Firefox/20.0 ID:20130219042021.
status-firefox20: fixed → verified

Comment 32

6 years ago
Thanks a lot Henrik. If you have the time, can you please verify this against Firefox 21.0a1 builds as well?
(Reporter)

Comment 33

6 years ago
Looks fine with Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:21.0) Gecko/20130220 Firefox/21.0 ID:20130220042017
Status: RESOLVED → VERIFIED
status-firefox21: fixed → verified

Comment 34

4 years ago
I suggest to reopen.
The bug is still present with Firefox 35.0 / Windows 2008 R2
(Sample docs are meanwhile "404 File not Found", so have a try with 
http://www.redeker.de/makepdf.php/main-V2.php/de/anwalt/vita.php?anw=182&pdfdesign=xvita&pdftype=.pdf 
)
(Reporter)

Comment 35

4 years ago
(In reply to Hagen von Eitzen from comment #34)
> The bug is still present with Firefox 35.0 / Windows 2008 R2
> (Sample docs are meanwhile "404 File not Found", so have a try with 
> http://www.redeker.de/makepdf.php/main-V2.php/de/anwalt/vita.
> php?anw=182&pdfdesign=xvita&pdftype=.pdf 
> )

If that is really a problem please file a new bug.

Comment 36

4 years ago
(In reply to Hagen von Eitzen from comment #34)
> I suggest to reopen.
> The bug is still present with Firefox 35.0 / Windows 2008 R2
> (Sample docs are meanwhile "404 File not Found", so have a try with 
> http://www.redeker.de/makepdf.php/main-V2.php/de/anwalt/vita.
> php?anw=182&pdfdesign=xvita&pdftype=.pdf 
> )

This issue does not reproduce for me with Firefox 35 using that PDF on Windows 7, Windows 8.1, Mac OS 10.10, nor Ubuntu 14.04. I suggest you start with posting your issue to https://support.mozilla.org so they can help you troubleshoot what's happening. If it's discovered through troubleshooting that a bug exists then we'll need a new bug report to fix it.

Thanks