Closed Bug 1122280 Opened 5 years ago Closed 5 years ago

PDF.js parses links within PDF file incorrectly, resulting in file not found errors


(Firefox :: PDF Viewer, defect, P3)

Windows 7



Firefox 41


(Reporter: alex_mayorga, Assigned: hellemar)



(Whiteboard: [pdfjs-c-ux][pdfjs-d-annotations][pdfjs-f-fixed-upstream])

- Load
- Click any of the links within the PDF file

All the links are broken on Nightly.

Expected result:
All the links work as they do in IE.
It looks like something is failing with certain Unicode characters.
Priority: -- → P3
Whiteboard: [pdfjs-c-ux][pdfjs-d-annotations][good first bug]
This is a Windows-only issue, as Nightly and the current release on OS X work fine.
This is not a Windows-only issue. It happens with both nightly (39.0a1) and stable (36.0) on Ubuntu 14.04 too.

Apparently the example PDF file is malformed. The 'ó' character present in the action URI of most of the links is UTF-8 encoded, while the PDF standard states that a URI needs to be encoded in 7-bit ASCII.

It appears that PDF.js assumes the URI to be WinAnsi (CP-1252) encoded, thus producing the two characters Ã³ from the UTF-8 byte sequence C3 B3.
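
The mis-decoding can be reproduced outside the viewer. The following is a minimal Python sketch, not PDF.js code; the helper name is_seven_bit_ascii is made up for illustration. It shows that the UTF-8 bytes C3 B3 turn into two characters when read as CP-1252, and that such a URI is not 7-bit ASCII in the first place.

```python
# The two raw bytes that UTF-8 uses for 'ó' (U+00F3).
raw = "ó".encode("utf-8")
assert raw == b"\xc3\xb3"

# Reading those bytes as CP-1252 maps each byte to its own character,
# which is the garbling described in this bug.
garbled = raw.decode("cp1252")

# Reading the same bytes as UTF-8 gives the character the author intended.
intended = raw.decode("utf-8")


def is_seven_bit_ascii(data: bytes) -> bool:
    """Hypothetical check: True if every byte fits in 7 bits."""
    return all(byte < 0x80 for byte in data)


print(repr(garbled), repr(intended), is_seven_bit_ascii(raw))
```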

Other viewers (Evince 3.10.3 and Chromium 40.0.2214.111) are able to produce the intended links.

Is the above described behaviour intentional? If not, I should be able to provide a patch to fix this.
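
One possible way to tolerate such malformed URIs is sketched below in Python. This is an assumption about how a lenient viewer could behave, not a description of what Evince, Chromium, or any PDF.js patch actually does; decode_uri_bytes and to_ascii_uri are hypothetical helpers. The idea is to try UTF-8 first (7-bit ASCII is a subset of it), fall back to CP-1252 only when the bytes are not valid UTF-8, and then percent-encode the result so the final URI is plain ASCII.

```python
from urllib.parse import quote


def decode_uri_bytes(data: bytes) -> str:
    """Hypothetical helper: prefer UTF-8, fall back to CP-1252."""
    try:
        # Valid 7-bit ASCII input also passes through here unchanged.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so assume the single-byte WinAnsi code page.
        return data.decode("cp1252", errors="replace")


def to_ascii_uri(text: str) -> str:
    """Hypothetical helper: percent-encode non-ASCII characters as UTF-8."""
    # Characters listed in `safe` are left as-is; '%' is included so that
    # escapes already present in the URI are not encoded a second time.
    return quote(text, safe="/:?#&=%")


utf8_bytes = "Licitación".encode("utf-8")     # as in the malformed PDF
cp1252_bytes = "Licitación".encode("cp1252")  # single-byte encoding

print(to_ascii_uri(decode_uri_bytes(utf8_bytes)))
print(to_ascii_uri(decode_uri_bytes(cp1252_bytes)))
```

Trying UTF-8 first is a heuristic: a CP-1252 string whose bytes happen to form valid UTF-8 would be decoded differently from what its author intended, although such sequences are uncommon in ordinary text.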
I have created a pull request to fix this issue.

It seems that by now, the linked PDF has been changed and most of the problematic links have been fixed (those with the word Licitación causing trouble).
There are still a few others remaining, e.g. Actos de la Licitación\3.2.-ANEXO 2 GUÍA DE DOCUMENTOS ANEXOS in the Abril 14 row.
Assignee: nobody → hellemar
Whiteboard: [pdfjs-c-ux][pdfjs-d-annotations][good first bug] → [pdfjs-c-ux][pdfjs-d-annotations][pdfjs-f-fixed-upstream]
Duplicate of this bug: 1023808
Depends on: 1168547
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 41
Depends on: 1308362