Provide a way to generate a pdf string for some HTML code in a worker
Categories
(Core :: DOM: Core & HTML, enhancement)
People
(Reporter: calixte, Unassigned)
Context:
In pdf.js we render forms using HTML elements and let the user fill them in.
When the PDF is printed or saved, we must generate a PDF representation of what the user filled in.
With Latin alphabets this is fairly easy (as long as there are no diacritics), but with Arabic, for example, glyph layout is far from trivial: the text runs right-to-left rather than left-to-right, and there are ligatures, diacritics, and so on.
The font associated with the text field in the PDF doesn't always contain all the glyphs the user typed into the <input>.
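To make the missing-glyph problem concrete, here is a minimal sketch of a coverage check. The helper name `missingCodePoints` and the `supported` set are hypothetical: the set stands in for whatever code points the field's embedded font actually maps, and how that set is obtained is outside this sketch.

```typescript
// Hypothetical helper: list the distinct characters in a field value
// that the field's font has no glyph for.
// `supported` is an assumed stand-in for the code points the font covers.
function missingCodePoints(text: string, supported: Set<number>): string[] {
  const missing = new Set<string>();
  // Array.from splits by code point, so astral characters stay intact.
  for (const ch of Array.from(text)) {
    const cp = ch.codePointAt(0) ?? 0;
    if (!supported.has(cp)) {
      missing.add(ch);
    }
  }
  return Array.from(missing);
}

// Assumed example: a font subset that only covers printable ASCII.
const asciiOnly = new Set<number>();
for (let cp = 0x20; cp <= 0x7e; cp++) {
  asciiOnly.add(cp);
}

// The five Arabic letters are reported as missing; the Latin part is covered.
const missingInMixedText = missingCodePoints("Hello مرحبا", asciiOnly);
```

Even when every glyph is present, this says nothing about shaping or bidi ordering, which is the harder half of the problem.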
We don't want to implement a text layout engine in pdf.js because it's hard to do and because we already have one in Firefox.
Hence my idea is to provide a function available in workers (because saving and printing in pdf.js happen in a Worker) which could take some HTML, for example something like:
<div style="width: 20px; height: 100px">
Hello World مرحبا بالعالم
</div>
convert it to a PDF, and then return that PDF as a byte array, a blob, or whatever is convenient.
We could then extract the appearance stream and the subsetted fonts from this PDF and inject them into the PDF being printed or saved.
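As a rough sketch of what the calling side might look like: the `HtmlToPdf` type and `renderFieldAppearance` are hypothetical names, no such function exists in workers today, and `fakeRenderer` is only a stand-in so the example runs. The one fact relied on is that a PDF file starts with the bytes `%PDF-`.

```typescript
// Hypothetical signature for the proposed worker-side function.
// Nothing with this shape exists today; it is what this bug asks for.
type HtmlToPdf = (html: string) => Promise<Uint8Array>;

// Returns true when the bytes start with the "%PDF-" file header.
function looksLikePdf(bytes: Uint8Array): boolean {
  const header = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"
  return bytes.length >= header.length && header.every((b, i) => bytes[i] === b);
}

// Worker-side caller: render the field's HTML and sanity-check the result
// before any later step tries to pull the appearance stream and fonts out.
async function renderFieldAppearance(
  html: string,
  htmlToPdf: HtmlToPdf
): Promise<Uint8Array> {
  const pdfBytes = await htmlToPdf(html);
  if (!looksLikePdf(pdfBytes)) {
    throw new Error("renderer did not return a PDF");
  }
  return pdfBytes;
}

// Stand-in renderer so the sketch runs; a real one would come from the browser.
const fakeRenderer: HtmlToPdf = async () =>
  new Uint8Array([0x25, 0x50, 0x44, 0x46, 0x2d, 0x31, 0x2e, 0x37]); // "%PDF-1.7"
```

Passing the renderer in as a parameter keeps the pdf.js side independent of whatever name and return type the platform function would end up with.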
Comment 1 (Reporter) • 2 years ago
:jfkthame, what do you think about this? Or can you think of another solution?
Comment 2 • 2 years ago
Yeah, in general adding text to a PDF is problematic because the embedded fonts might not support the right characters, or might be using some arbitrary encoding, so you can't rely on using the existing font resources and providing new/additional text.
In principle, I think you should be able to put the HTML fragment you want into a document with whatever styling it needs, and then use the print-to-PDF support to generate a PDF version of it ... e.g. compare the Thunderbird patch-in-progress in bug 1751066, which includes adding a tabs.getAsPDF() API to create a PDF representation of the contents of a tab (as I understand it).
I don't know if the PrintSettings stuff and print API can be used from a Worker, though...
Alternatively, instead of directly trying to create and inject a PDF representation of the user-filled content, what about just displaying it using HTML elements placed on top of the pdf.js canvas? (Like the invisible text layer used to support search, select, etc., but this would be visible.) Then simply printing the document should work automatically, as it'll paint the canvas and the visible HTML elements on top of it. So in effect the printing code will take care of merging the HTML overlays into the output, and you don't need to manipulate the PDF directly.
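For the overlay idea, the main bookkeeping is mapping a field's PDF rectangle to a CSS box over the canvas. A minimal sketch, assuming an unrotated page whose lower-left corner is at (0, 0) and a uniform `scale` in CSS pixels per PDF unit; `overlayBox` and its types are hypothetical names, and a real viewer would use its own viewport transform instead.

```typescript
// PDF rectangle in default user space: [x1, y1, x2, y2],
// with the origin at the bottom-left of the page.
type PdfRect = [number, number, number, number];

interface OverlayBox {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Hypothetical helper: compute the CSS box (in px) for an HTML element
// placed over the canvas. Assumes no page rotation and no crop offset.
function overlayBox(rect: PdfRect, pageHeight: number, scale: number): OverlayBox {
  const [x1, y1, x2, y2] = rect;
  const left = Math.min(x1, x2);
  const bottom = Math.min(y1, y2);
  const width = Math.abs(x2 - x1);
  const height = Math.abs(y2 - y1);
  return {
    left: left * scale,
    // CSS measures from the top edge, so flip the y axis.
    top: (pageHeight - (bottom + height)) * scale,
    width: width * scale,
    height: height * scale,
  };
}

// Assumed example: a 200x20 field near the top of a 792-unit-tall page at 1.5x.
const box = overlayBox([100, 700, 300, 720], 792, 1.5);
```

The resulting values would go into the overlay element's absolute-position style, so that printing paints it in the same place as the field it covers.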