Closed Bug 344620 Opened 18 years ago Closed 12 years ago

[RFE] Native (inline) PDF rendering

Categories

(Core :: General, enhancement)

enhancement
Not set
normal

RESOLVED DUPLICATE of bug 714712

People

(Reporter: reuben.m, Unassigned)

Details

(Whiteboard: [parity-webkit] [parity-safari] [parity-chrome])

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.1) Gecko/20060211 Firefox/1.5.0.1
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.1) Gecko/20060211 Firefox/1.5.0.1

Request to add native PDF rendering in Firefox (and/or other mozilla based apps) using the poppler library.

Reproducible: Always
I see no mention of license(s) this is offered under.  I would think it would be offered under some type of free software license.  

Usually Mozilla strives for a tri-license MPL/GPL/LGPL.  But Cairo only says it's offered under LGPL and MPL.

And would this work on Mac? Mac OS X Tiger does have its own native PDF rendering.  But mozilla still supports older versions of Mac OS X so that won't work.  
I don't think license is a problem. KDE and Gnome both use it.

If implemented, this would probably be a statically linked lib. It should compile and work just fine on any system. (Some modifications may need to be made for handling fonts on some platforms...)

Cairo is not an issue either. Firefox already uses it for rendering SVG and Canvas. Furthermore, Gecko 1.9 is supposed to use Cairo to render everything.
See bug 162659, which seems sort of similar.

When we end up with a PDF output option (likely to happen for Gecko 1.9/FF 3.0), it will be through cairo.  Nobody is interested in implementing support for any other rendering library.

Also, poppler is GPL; all code in the Mozilla codebase is required to be licensed under the MPL/LGPL/GPL tri-license (see http://www.mozilla.org/MPL/license-policy.html).
Status: UNCONFIRMED → RESOLVED
Closed: 18 years ago
Resolution: --- → WONTFIX
By "through cairo", I mean calling into cairo directly.  I didn't realize poppler wrapped cairo (although it doesn't matter for my point).
> Also, poppler is GPL; all code in the Mozilla codebase is required to be
> licensed under the MPL/LGPL/GPL tri-license (see
> http://www.mozilla.org/MPL/license-policy.html).

I don't think it would be hard to convince the people holding the copyrights to add the LGPL licensing.

It is under GPL because it is derived from the Xpdf application which is under GPL. I don't think the folks developing Xpdf ever considered any need to license it under LGPL because it is an app, not a library.

I'd be willing to bet that the copyright holders for Xpdf would give the blessing to LGPL. Read this page and you'll see that the guy who wrote it seems pretty open to licensing: http://www.foolabs.com/xpdf/about.html

If Xpdf could be licensed as LGPL, then the developers of Poppler would do the same. (I think they would be thrilled at the idea of a project as big as mozilla using poppler. That means more people contributing fixes to their code base)
Sorry, I wasn't really making myself clear.  The infrastructure for making PDFs through cairo already exists in the Mozilla codebase.  Actually, PDF output is already mostly working (although not used at the moment).  There isn't any need for more PDF code.
I wasn't talking about PDF output. This is more concerned with the viewing of PDF files within the browser. Adobe's PDF viewer plugin doesn't work on all platforms supported by Firefox, and so many documents on the internet are offered in PDF format that I thought it would be advantageous to offer the ability to open PDF files directly for viewing within the browser without need for a plugin. Sorry I wasn't clear about what I meant by "rendering PDFs".

Unless I misunderstood and this feature is included in the PDF export functionality you are referring to.
Oh, sorry; my bad.  I feel silly now.  It's a legitimate request, although the relicensing seems unlikely (see http://www.glyphandcog.com/).
Status: RESOLVED → UNCONFIRMED
Resolution: WONTFIX → ---
Well, I've written the author of Xpdf to see if LGPL would be possible. If I hear anything back positive about LGPL then I will post it here.

If it turns out that relicensing won't happen, I may try to petition the poppler project to develop an open-source PDF viewer plugin for Firefox. I think that would achieve the same purpose.
Ok, I've heard back from the copyright holder of Xpdf and he's not going to release it under the LGPL anytime soon since it would undermine his business model.

So I guess this request still stands, but without the poppler library, which would probably mean building a PDF rendering library from scratch. (fun) However, it might be possible to reuse some of the code resulting from bug 162659
confirming

changed summary from "Native PDF rendering via poppler" to "[RFE] Native (inline) PDF rendering (via poppler?)"
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Native PDF rendering via poppler → [RFE] Native (inline) PDF rendering (via poppler?)
Blocks: 91559
what about using an alternative library such as podofo (http://podofo.sourceforge.net/) or mupdf (http://ccxvii.net/apparition/)?
the first is LGPL, the second GPL
Blocks: 455917
No longer blocks: 91559
I'd love to have native inline PDF handling in FF, still waiting as of 3.6 :)
This is such an old feature request that shows up randomly in my email every once in a long while.

A lot has changed since this feature request was first filed.

I thought I might suggest, for anybody interested, that perhaps this feature can be implemented by translating PDF files into canvas elements. Work similar to this has been done elsewhere. A free (unfortunately closed-source) plugin for Adobe Illustrator was released just recently that translates .ai files into canvas HTML. (For those who are not aware, .ai files are simply PDF 1.4 files with a different file extension.) So it is possible to do. Furthermore, it would allow using JavaScript within PDFs for interactive features like forms. Perhaps there could even be a custom namespace for it so that the PDF JavaScript can run directly without any modifications.

The big advantage to doing it this way would be that you would not have to write a separate rendering engine.
This has come up a couple of times, and roc has suggested supporting PDF with a viewer implemented in JS.

JS means rendering using Canvas 2D or using SVG. SVG is a more natural fit, since it's retained mode, so it would be feasible to convert PDF into an SVG DOM and then use the existing SVG repaint machinery for repainting.
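
To make the SVG idea a bit more concrete, here is a minimal sketch (in Python for brevity, not the JS a real viewer would use) of translating a few PDF path-construction operators into SVG path data. The names `pdf_path_to_svg` and `page_height` are hypothetical. Only the operators `m`, `l`, `c`, `h` and `re` are handled; painting operators, transforms and graphics state are ignored.

```python
def pdf_path_to_svg(ops: str, page_height: float) -> str:
    """Translate a small subset of PDF path operators into SVG path data.

    Illustrative only: handles m, l, c, h and re, nothing else.
    """
    out = []
    stack = []  # operands accumulate until an operator consumes them

    def pt(x: float, y: float) -> str:
        # PDF's default user space has y pointing up; SVG's y points down.
        return f"{x:g} {page_height - y:g}"

    for tok in ops.split():
        if tok == "m":  # moveto
            x, y = stack
            out.append(f"M {pt(x, y)}")
        elif tok == "l":  # lineto
            x, y = stack
            out.append(f"L {pt(x, y)}")
        elif tok == "c":  # cubic Bezier curveto
            x1, y1, x2, y2, x3, y3 = stack
            out.append(f"C {pt(x1, y1)}, {pt(x2, y2)}, {pt(x3, y3)}")
        elif tok == "h":  # closepath
            out.append("Z")
        elif tok == "re":  # rectangle: x y width height
            x, y, w, h = stack
            out.append(
                f"M {pt(x, y)} L {pt(x + w, y)} "
                f"L {pt(x + w, y + h)} L {pt(x, y + h)} Z"
            )
        else:
            # Anything else is treated as a numeric operand; an unsupported
            # operator raises ValueError here.
            stack.append(float(tok))
            continue
        stack.clear()
    return " ".join(out)
```

The returned string could be placed in the `d` attribute of an SVG `<path>` element, after which repainting would be the SVG machinery's job rather than the converter's.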

Here's a quick brain dump of the most obvious platform extensions (probably not exposed to random Web-originating JS) that would be appropriate to make a conversion into SVG feasible:
 * Ability to render text by giving font glyph indices (rather than giving Unicode code points that undergo all kinds of transformations that result in glyph indices).
 * Glyph index-based text should have invisible Unicode text alternatives for accessibility, clipboard export and search.
 * Bitmap image support for TIFFs incl. LZW and CCITT group 3 and 4. (Note that it probably doesn't make sense to support JPEG 2000 or JBIG.)
 * Native code-backed deflate and LZW decompressors callable from JS.
 * Support for CMYK JPEGs (do we have this already?)
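
As an illustration of the deflate item in the list above, here is a minimal sketch of decoding one Flate-compressed PDF stream. Python's `zlib` module is itself backed by native code, which is roughly the arrangement the list asks for on the JS side. The function name `decode_flate_stream` is hypothetical, and finding the data by searching for the `stream` and `endstream` keywords is a simplifying assumption; a real parser would use the stream dictionary's `/Length` entry.

```python
import zlib


def decode_flate_stream(obj: bytes) -> bytes:
    """Decompress the data of a single FlateDecode stream object.

    Sketch only: assumes exactly one stream and no predictor parameters.
    """
    start = obj.index(b"stream") + len(b"stream")
    # The keyword is followed by CRLF or LF before the data begins.
    if obj[start:start + 2] == b"\r\n":
        start += 2
    elif obj[start:start + 1] == b"\n":
        start += 1
    end = obj.index(b"endstream", start)
    # decompressobj stops at the end of the zlib stream and leaves any
    # trailing end-of-line bytes in .unused_data instead of failing.
    return zlib.decompressobj().decompress(obj[start:end])
```

LZW and the CCITT fax encodings would need their own decoders; nothing here covers them.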
SVG is more heavyweight and less flexible in some ways. A canvas solution might be better. It's hard to tell without prototyping.
... then again, SVG would work much better with selection, find, accessibility, etc.
Can't we simply generate an HTML tree from the PDF file and show it?
(In reply to comment #22)
> Can't we simply generate an HTML tree from the PDF file and show it?

The level of abstraction that PDF encodes doesn't match the level of abstraction of HTML. Of things Gecko already supports, SVG is closer to encoding what PDF encodes except SVG has a higher level of abstraction for representing text, which is likely a problem in the absence of lower-level extensions (mentioned in comment 18).
Summary: [RFE] Native (inline) PDF rendering (via poppler?) → [RFE] Native (inline) PDF rendering
Product: Firefox → Core
QA Contact: general → general
Version: unspecified → Trunk
Attached file PoC
(In reply to comment #23)
> (In reply to comment #22)
> > Can't we simply generate an HTML tree from the PDF file and show it?
> 
> The level of abstraction that PDF encodes doesn't match the level of
> abstraction of HTML. Of things Gecko already supports, SVG is closer to
> encoding what PDF encodes except SVG has a higher level of abstraction for
> representing text, which is likely a problem in the absence of lower-level
> extensions (mentioned in comment 18).

What do you mean exactly by the level of abstraction of PDF doesn't match the one of HTML?
The attachment I've submitted is a PoC extension (it decodes the PDF only in a few cases) which reads a PDF in JS and renders its content as HTML, and it seems to match well. This would allow extension authors to read/manipulate the PDF as if it were HTML, and could be a great improvement over SVG or Canvas.

If you want to try the extension you have to:
 * untar it somewhere
 * shut down FF
 * add a file called js-reader@vingtetun.org in your extension folder and enter the path to the extension in it (with the trailing / depending on your system)
 * restart FF
 * go to the url: chrome://js-reader/content/main.html?PATH_TO_THE_readme.pdf_FILE_FROM_THE_EXTENSION
 

So if you install the extension into /home/name/js-reader you have to go to chrome://js-reader/content/main.html?/home/name/js-reader/chrome/content/readme.pdf (Sorry for the complicated way to install/view the extension rendering)
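
For anyone who would rather script the steps above, here is a rough sketch. The extension ID and the chrome URL come from the instructions; the function names are hypothetical, and the location of your profile's extension folder is deliberately left as a parameter because it depends on the system.

```python
import os
from pathlib import Path

EXTENSION_ID = "js-reader@vingtetun.org"  # pointer file name from the steps
VIEWER_URL = "chrome://js-reader/content/main.html"


def write_pointer_file(extensions_dir, extension_path) -> Path:
    """Create the pointer file that tells Firefox where the extension lives."""
    pointer = Path(extensions_dir) / EXTENSION_ID
    # os.path.join(path, "") appends the platform's trailing separator.
    pointer.write_text(os.path.join(str(extension_path), ""))
    return pointer


def viewer_url(pdf_path: str) -> str:
    """Build the URL to open once Firefox has been restarted."""
    return f"{VIEWER_URL}?{pdf_path}"
```

Calling `viewer_url("/home/name/js-reader/chrome/content/readme.pdf")` yields the same URL as in the example above. Firefox still has to be shut down before the pointer file is written and restarted afterwards.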

You should see a (dirty) HTML page with the PDF rendered into it.
I would like to understand why HTML is not a good solution for that?
(In reply to comment #24)
> What do you mean exactly by the level of abstraction of PDF doesn't match the
> one of HTML?

PDF encodes vector shapes using the PostScript imaging model plus transparency & Porter–Duff. PDF encodes rendered text by identifying glyphs and their positioning. (There's an optional mechanism for encoding a reverse mapping from the glyphs to Unicode for accessibility, search and clipboard export. There's also an optional mechanism for encoding mild semantics.)

HTML encodes text as Unicode code points and mild semantics. It doesn't encode vector shapes without the help of SVG. (Canvas doesn't encode any shapes as such, although a program can draw stuff onto it.)
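
A small sketch of that mismatch on the text side, with hypothetical names throughout (`PlacedGlyph`, `extract_text`, and the sample glyph IDs are made up for illustration). What the PDF side stores is a run of positioned glyphs; Unicode text, which is what HTML starts from, is only recoverable when the optional reverse mapping is present.

```python
from dataclasses import dataclass


@dataclass
class PlacedGlyph:
    glyph_id: int  # index into the font, not a Unicode code point
    x: float       # position of the glyph on the page
    y: float


def extract_text(glyphs, to_unicode) -> str:
    """Recover text from positioned glyphs via an optional reverse mapping.

    Glyphs with no mapping become U+FFFD, which is what search, clipboard
    export and accessibility tools would be left with.
    """
    return "".join(to_unicode.get(g.glyph_id, "\ufffd") for g in glyphs)


# One glyph may stand for several characters (a ligature, for example).
run = [PlacedGlyph(3, 72.0, 700.0), PlacedGlyph(17, 83.5, 700.0)]
mapping = {3: "fi", 17: "x"}
text = extract_text(run, mapping)
```

Going the other way, from Unicode text to exactly these glyphs at exactly these positions, is the part HTML does not let a converter control.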
Whiteboard: [parity-webkit] [parity-safari] [parity-chrome]
(I was looking for something else and happened across this bug.  roc, why didn't you point me at this when I was in NZ? ;) )

Probably worth mentioning is https://github.com/andreasgal/pdf.js, which is a pure-HTML5 PDF renderer that Mozilla folks and others are hacking on.
This is solved by Bug 714712
Status: NEW → RESOLVED
Closed: 18 years ago → 12 years ago
Resolution: --- → DUPLICATE