Closed Bug 1605526 Opened 5 years ago Closed 5 years ago

Incorrect rendering of accented characters in tab title for PDF hotlinked from Dropbox

Categories

(Firefox :: PDF Viewer, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

RESOLVED FIXED
Firefox 73
Tracking Status
firefox71 --- wontfix
firefox72 --- wontfix
firefox73 --- fixed
firefox74 --- fixed
firefox75 --- fixed

People

(Reporter: microtherion, Unassigned)

Attachments

(1 file)

Attached image Tab Title

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:71.0) Gecko/20100101 Firefox/71.0

Steps to reproduce:

Load https://www.dropbox.com/s/iu30xpltgqw7mdo/Bomba%20de%20coraz%C3%B3n.pdf?raw=1

Actual results:

The accented ó in the document title is rendered as a missing-character glyph in the tab title (see attached image).

Expected results:

Expected "corazón" to be displayed correctly.

Hi Matthias,

Thanks for the details. I was able to reproduce this on macOS 10.14.5 with Firefox Nightly 73.0a1 (2019-12-27, 64-bit), Release 71.0, and Beta 72.0b5.

I've assigned a component so that the issue gets reviewed.

Best regards, Clara.

Status: UNCONFIRMED → NEW
Component: Untriaged → Layout: Text and Fonts
Ever confirmed: true
Product: Firefox → Core
Version: 71 Branch → Trunk

This is a text-encoding error in the PDF file (presumably caused by a bug in LilyPond or Ghostscript, which generated it), not a Firefox bug.

Looking into the PDF, there is a block of XMP metadata that includes the document title. An extract from a hex dump shows:

00022960  2f 70 64 66 27 3e 3c 64  63 3a 74 69 74 6c 65 3e  |/pdf'><dc:title>|
00022970  3c 72 64 66 3a 41 6c 74  3e 3c 72 64 66 3a 6c 69  |<rdf:Alt><rdf:li|
00022980  20 78 6d 6c 3a 6c 61 6e  67 3d 27 78 2d 64 65 66  | xml:lang='x-def|
00022990  61 75 6c 74 27 3e 42 6f  6d 62 61 20 64 65 20 63  |ault'>Bomba de c|
000229a0  6f 72 61 7a ef bf bd 6e  3c 2f 72 64 66 3a 6c 69  |oraz...n</rdf:li|
000229b0  3e 3c 2f 72 64 66 3a 41  6c 74 3e 3c 2f 64 63 3a  |></rdf:Alt></dc:|
000229c0  74 69 74 6c 65 3e 3c 2f  72 64 66 3a 44 65 73 63  |title></rdf:Desc|

Note that where the ó in "corazón" should be, we have the three bytes ef bf bd, which are the UTF-8 encoding of the Unicode code point U+FFFD REPLACEMENT CHARACTER. And that's what shows up in the Firefox tab title.
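
To double-check the reading of those bytes, here is a minimal Python sketch. It only decodes the bytes copied from the row at offset 0x229a0 above; the variable names are mine, and nothing here reflects how Firefox or pdf.js actually parses the file.

```python
import unicodedata

# Bytes copied from the hex dump row at offset 0x229a0: "oraz", three bytes, "n".
raw = bytes.fromhex("6f 72 61 7a ef bf bd 6e")

# The sequence is valid UTF-8, so decoding succeeds without errors.
text = raw.decode("utf-8")

# List each decoded character with its code point and Unicode name.
for ch in text:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# For comparison, the bytes a correct encoder would have produced for "orazón".
expected = "orazón".encode("utf-8")
print(expected.hex())
```

The point of the comparison is that the file contains a well-formed encoding of the wrong character, so a reader of the XMP block has no way to recover the original ó from it.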

A bit later in the file, we find the title in a different form:

00022b10  33 30 5a 30 30 27 30 30  27 29 0a 2f 43 72 65 61  |30Z00'00')./Crea|
00022b20  74 6f 72 28 4c 69 6c 79  50 6f 6e 64 20 32 2e 31  |tor(LilyPond 2.1|
00022b30  39 2e 38 33 29 0a 2f 54  69 74 6c 65 28 5c 33 37  |9.83)./Title(\37|
00022b40  36 5c 33 37 37 5c 30 30  30 42 5c 30 30 30 6f 5c  |6\377\000B\000o\|
00022b50  30 30 30 6d 5c 30 30 30  62 5c 30 30 30 61 5c 30  |000m\000b\000a\0|
00022b60  30 30 20 5c 30 30 30 64  5c 30 30 30 65 5c 30 30  |00 \000d\000e\00|
00022b70  30 20 5c 30 30 30 63 5c  30 30 30 6f 5c 30 30 30  |0 \000c\000o\000|
00022b80  72 5c 30 30 30 61 5c 30  30 30 7a 5c 30 30 30 5c  |r\000a\000z\000\|
00022b90  33 36 33 5c 30 30 30 6e  29 0a 2f 53 75 62 74 69  |363\000n)./Subti|
00022ba0  74 6c 65 28 29 0a 2f 43  6f 6d 70 6f 73 65 72 28  |tle()./Composer(|
00022bb0  45 64 64 69 65 20 50 61  6c 6d 69 65 72 69 29 0a  |Eddie Palmieri).|

Here, the /Title entry in the PDF dictionary is encoded as UTF-16BE, indicated by the \376\377 prefix (octal escapes for the bytes FE FF, the byte order mark), and the ó is (correctly) encoded as \000\363, or U+00F3. But apparently Firefox is relying on the XMP metadata block to provide the tab title, and there the ó has been replaced by U+FFFD.
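
The same can be checked for the /Title entry. The following is a rough Python sketch, not a real PDF string parser: the helper names unescape_octal and decode_text_string are hypothetical, only \ddd octal escapes are handled (other escapes such as \\ or \( are ignored), and Latin-1 is used as a stand-in for PDFDocEncoding when no byte order mark is present.

```python
import re

# Matches a backslash followed by one to three octal digits, as in \376.
_OCTAL_ESCAPE = re.compile(rb"\\([0-7]{1,3})")


def unescape_octal(body: bytes) -> bytes:
    """Replace \\ddd octal escapes with the single byte they denote."""
    return _OCTAL_ESCAPE.sub(lambda m: bytes([int(m.group(1), 8) & 0xFF]), body)


def decode_text_string(raw: bytes) -> str:
    """Decode as UTF-16BE when the FE FF byte order mark is present."""
    if raw.startswith(b"\xfe\xff"):
        return raw[2:].decode("utf-16-be")
    # Assumption: Latin-1 as an approximation of PDFDocEncoding.
    return raw.decode("latin-1")


# Contents of /Title(...) as they appear in the hex dump above.
literal = (
    rb"\376\377\000B\000o\000m\000b\000a\000 \000d\000e\000 "
    rb"\000c\000o\000r\000a\000z\000\363\000n"
)

title = decode_text_string(unescape_octal(literal))
print(title)
```

Decoded this way, the Info dictionary title keeps the ó intact, which is consistent with the damage being confined to the XMP copy of the title.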

Besides hex-dumping the PDF file, this can also be confirmed using an XMP metadata-viewing tool such as https://www.get-metadata.com; uploading the file there shows the incorrect "Title: Bomba de corazn" field among the displayed metadata.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → INVALID
Component: Layout: Text and Fonts → PDF Viewer
Depends on: 1606634
Product: Core → Firefox
Resolution: INVALID → FIXED
Target Milestone: --- → Firefox 73