Closed Bug 1031612 Opened 10 years ago Closed 10 years ago

In PDF Viewer, the buggy XMP title "Untitled" overrides the document info title

Categories

(Firefox :: PDF Viewer, defect, P4)

30 Branch
x86_64
Linux
defect

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: vincent-moz, Assigned: sang.mercado)

Details

(Whiteboard: [pdfjs-c-ux][good first bug][pdfjs-f-fixed-upstream] https://github.com/mozilla/pdf.js/pull/5287)

Attachments

(1 file)

43 bytes, text/x-github-pull-request
yury: review+
When I open some PDF files in the builtin PDF viewer, their title is not always shown in the window name (titlebar). For instance:

  https://www.vinc17.net/research/slides/gdt2014-04.pdf

is shown as "Untitled", though both evince and pdfinfo say that the title is:

  Introduction to the GNU MPFR Library

Some PDF files do not have this problem, e.g.

  https://www.vinc17.net/research/slides/tamadi2013-10.pdf

It seems that the problem occurs on files for which I used ps2pdf then re-added the metadata with pdftk.
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0

I opened the PDF using the latest Nightly (2014-07-27), opened the Inspector, and searched for the title:
<title>Untitled - gdt2014-04.pdf</title>

I see the same title in the titlebar, so the name could come from here.
After inspecting several PDF files, it seems that the problem occurs when in the PDF file, /Title is followed by a space, e.g.:

/Title (Introduction to the GNU MPFR Library)

The working PDF file mentioned above has:

/Title(Hardest-to-Round Cases \205 Part 2)
Summary: the title of some PDF files is not shown in the window name → PDF files are regarded as Untitled when /Title is followed by a space before the opening parenthesis
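For reference, PDF syntax treats "(" as a delimiter, so whitespace between the /Title name and the literal string is optional and both spellings denote the same entry. The following Python sketch illustrates this on two hand-written dictionaries. The helper info_title and its regular expression are hypothetical, written only for this illustration; they ignore escape sequences and nested parentheses and are not how PDF.js parses files.

```python
import re

# Hypothetical helper for illustration only: matches a /Title literal string
# with or without whitespace before the opening parenthesis.
# Escapes (such as \205) and nested parentheses are deliberately not handled.
TITLE_RE = re.compile(rb"/Title\s*\(([^()\\]*)\)")


def info_title(raw: bytes):
    """Return the /Title value found in a raw dictionary, or None."""
    match = TITLE_RE.search(raw)
    return match.group(1).decode("latin-1") if match else None


with_space = b"<< /Title (Introduction to the GNU MPFR Library) >>"
without_space = b"<< /Title(Introduction to the GNU MPFR Library) >>"

# Both spellings yield the same title string.
print(info_title(with_space) == info_title(without_space))
```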
The file contains XMP metadata, which overrides the original title with the value "Untitled", and that is what is used for the web page/tab title. Please don't specify XMP metadata if you don't wish the title to be pulled from it. Not sure how you want to resolve it, but the space after /Title is not the reason. Do you want to close this as invalid, or to request the removal of XMP metadata support in the PDF viewer?
Flags: needinfo?(vincent-moz)
Priority: -- → P4
Whiteboard: [pdfjs-c-ux]
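To make the XMP side concrete, here is a minimal Python sketch of how a title is typically stored in an XMP packet (a dc:title element holding an rdf:Alt list) and how it can be read with the standard library. The SAMPLE_XMP packet is hand-written for illustration and is not copied from the reported file; xmp_title is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Standard RDF and Dublin Core namespaces used by XMP.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Illustrative packet only (assumption): not taken from the reported PDF.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>
        <rdf:Alt>
          <rdf:li xml:lang="x-default">Untitled</rdf:li>
        </rdf:Alt>
      </dc:title>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""


def xmp_title(packet: str):
    """Return the first dc:title alternative in an XMP packet, or None."""
    root = ET.fromstring(packet)
    item = root.find(".//dc:title/rdf:Alt/rdf:li", NS)
    return item.text if item is not None else None


print(xmp_title(SAMPLE_XMP))
```

A viewer that prefers XMP over the document info dictionary would display this value even when /Title in the info dictionary is correct.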
OK, updated the title according to the real problem.

That's annoying because it is Ghostscript (via ps2pdf) that automatically adds this XMP metadata behind the user's back. This is a serious bug in Ghostscript, since according to http://www.pdflib.com/knowledge-base/xmp-metadata/ XMP metadata tends to be preferred over the conventional document info entries. I'll report a bug against Ghostscript, but unfortunately there are already many PDF files on the web with this "Untitled" title in the XMP metadata added by Ghostscript. IMHO, as a workaround to the Ghostscript bug, if there is a document info title and an XMP title "Untitled" (which will never be a correct title in practice), PDF Viewer should prefer the document info one.
Flags: needinfo?(vincent-moz)
Summary: PDF files are regarded as Untitled when /Title is followed by a space before the opening parenthesis → In PDF Viewer, the buggy XMP title "Untitled" overrides the document info title
The PDF.js logic can be adjusted to ignore an "Untitled" entry. See https://github.com/mozilla/pdf.js/blob/master/web/viewer.js#L1153 .
Whiteboard: [pdfjs-c-ux] → [pdfjs-c-ux][good first bug]
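The adjustment discussed here can be sketched as follows. This is a rough Python illustration of the selection rule proposed in this bug, not the PDF.js code and not the patch that was eventually merged; choose_title and its parameters are hypothetical names.

```python
def choose_title(info_title, xmp_title):
    """Pick a display title from the document info and XMP titles.

    Assumed rule (from the proposal in this bug): XMP wins as before,
    unless it is missing or is the placeholder "Untitled".
    """
    if xmp_title and xmp_title != "Untitled":
        return xmp_title
    # Fall back to the document info title; if that is missing too,
    # return whatever XMP provided (possibly "Untitled" or None).
    return info_title or xmp_title


# The reported file: correct info title, placeholder XMP title.
print(choose_title("Introduction to the GNU MPFR Library", "Untitled"))
```

With this rule, a file whose only title is the XMP "Untitled" still shows "Untitled", so behavior changes only when a document info title is available to fall back on.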
I would like to work on this...
Hi Sushrut! You can submit a pull request at https://github.com/mozilla/pdf.js (paste the pull request link here).
Sushrut, are you still working on this bug?

If not, can I be assigned to this bug?
Flags: needinfo?(ydelendik)
This bug is still unassigned (In reply to Yury Delendik (:yury) from comment #7)
> Hi Sushrut! You can submit a pull request at
> https://github.com/mozilla/pdf.js (paste the pull request link here).

Hey, after reading the bug info and comments I'd really like to work on this as my first bug. Not sure however if a pull request was submitted, but since I don't see one on the github or a link posted here I'm going to assume it's still in limbo. Going to grab the pdf.js source and start testing :)
(In reply to Sang Mercado from comment #8)
> Sushrut, are you still working on this bug?
> 
> If not, Can I be assigned to this bug?

So far nobody has submitted a pull request. I estimate the patch to be an additional 3-5 lines of code.

Once a GitHub pull request is opened, it will be referenced from here (to avoid future duplicate work).
Flags: needinfo?(ydelendik)
Attached file Requesting code review
Hopefully I did this right
Attachment #8487626 - Flags: review?(mak77)
Attachment #8487626 - Flags: review?(mak77) → review?(ydelendik)
Whiteboard: [pdfjs-c-ux][good first bug] → [pdfjs-c-ux][good first bug] https://github.com/mozilla/pdf.js/pull/5287
Assignee: nobody → sang.mercado
Whiteboard: [pdfjs-c-ux][good first bug] https://github.com/mozilla/pdf.js/pull/5287 → [pdfjs-c-ux][good first bug][pdfjs-f-fixed-upstream] https://github.com/mozilla/pdf.js/pull/5287
Attachment #8487626 - Flags: review?(ydelendik) → review+
The pull request has been merged, so is this bug fixed?
Looks good on current nightly
Status: UNCONFIRMED → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME
I confirm that the bug no longer occurs in nightly.