1407238 - Cli argument for print page or generate pdf command in Firefox headless mode

Reporter

Description

•

8 years ago

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0
Build ID: 20170828095647

Steps to reproduce:

Please add a command-line argument for printing a page (or exporting it to PDF) in Firefox headless mode, like the 'screenshot' function. Printing and generating PDFs are more useful than a pixelated screenshot.

Murz

Reporter

Comment 1

•

8 years ago

Google Chrome already has a command for generating PDFs: https://developers.google.com/web/updates/2017/04/headless-chrome

chrome --headless --disable-gpu --print-to-pdf https://bugzilla.mozilla.org

Murz

Reporter

Comment 2

•

8 years ago

There is an add-on for printing in Firefox from the command line at https://addons.mozilla.org/en-US/firefox/addon/cmdlnprint/, but it doesn't work on recent Firefox versions.

Kohei Yoshino [:kohei]

Updated

•

8 years ago

Component: Untriaged → Headless

Brendan Dahl [:bdahl]

Updated

•

8 years ago

Priority: -- → P3

Richard Neill

Comment 3

•

8 years ago

Suggestion: this should also use "@media print" for the CSS rules. (That's what all the other tools do: chromium, wkhtmltopdf, phantomjs, slimerjs.)

nemo

Comment 4

•

7 years ago

Oooh, this would definitely be nice to have. The problem is that of IE, Firefox and Chrome, Chrome was definitely in last place when it came to PDF generation. (IE seemed to do a bit better than Firefox on layout, at least on the Windows machines we had to support, but Firefox was a close second.)

Chrome's main failure is multi-page tables with page-break-before rules inside the table to keep sections on the same page. Or just sane rendering of thead, period. It would overlap content and produce other absurdities. I ended up hacking in fragile multi-table fragmentation of long tables just to get something Chrome could manage to print.

Firefox's long-standing print failure has been iframes, but that's rarely an issue with the kind of documents I'd need to export to PDF, and is much more easily worked around with a JS iframe remover.

So, yeah, if I had a headless Firefox option it would be a lovely replacement for all the other ones mentioned here, which rely on WebKit.

Rachel Andrew

Comment 5

•

6 years ago

This would be really useful. I've built a little app to help me test CSS fragmentation properties, which I have up and running with Puppeteer and also wkhtmltopdf. I had hoped I'd be able to take a similar approach to that of Puppeteer with headless firefox.

sputnick1124

Comment 6

•

6 years ago

Is this going to be added at some point in the future? In the meantime I am having to do some pretty hacky things to get a PDF.

My current workaround is as follows:

firefox --screenshot page.png --window-size=1500 https://mysite.com/page.html
convert page.png -crop 1500x1766 page.tiff
convert page.tiff page.pdf

This, of course, is less-than-ideal since it is a flat image which has been split into pages and it has no real text in it.
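The fixed 1500x1766 crop above is specific to that page. If the slices should instead follow a paper size, the crop height and the resulting page count can be worked out with shell arithmetic. A minimal sketch, assuming A4 proportions (210 x 297 mm) in the usage line; `crop_height` and `page_count` are hypothetical helper names, not part of Firefox or ImageMagick.

```shell
# Height in pixels of one page when a screenshot WIDTH pixels wide is cut
# to the aspect ratio of a PAPER_W x PAPER_H sheet (any unit, integer maths).
# Usage: crop_height WIDTH PAPER_W PAPER_H
crop_height() {
  echo $(( $1 * $3 / $2 ))
}

# Number of slices produced when an image IMAGE_H pixels tall is cut into
# pages of PAGE_H pixels; the last slice may be shorter than the others.
# Usage: page_count IMAGE_H PAGE_H
page_count() {
  echo $(( ($1 + $2 - 1) / $2 ))
}
```

For example, `crop_height 1500 210 297` gives the height to pass to `convert -crop` for an A4-shaped slice of a 1500-pixel-wide screenshot, and `page_count 5000 1766` says how many pages a 5000-pixel-tall screenshot yields with the crop used above.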

Danny Colin [:sdk]

Comment 7

•

6 years ago

(In reply to sputnick1124 from comment #6)

Is this going to be added at some point in the future? In the meantime I am having to do some pretty hacky things to get a PDF.

It has been assigned a priority of P3 (Backlog). This means developers want to integrate the feature but there's no ETA. In other words, it's in the "would be nice to have when we have the time" bucket.

This, of course, is less-than-ideal since it is a flat image which has been split into pages and it has no real text in it.

Currently, the only way to have a text pdf is to print it via File > Print (Ctrl+P) > General tab > Print to file.

I hope that answers your question :).

jman

Comment 8

•

6 years ago

Meanwhile, we can automate Firefox using "xdotool" (available on Linux) to automate a monkey to do the job for us.

An example script to start off: https://askubuntu.com/a/612510

It's hacky, but it may solve an immediate problem.

Danny Colin [:sdk]

Comment 9

•

6 years ago

(In reply to jman from comment #8)

Meanwhile, we can automate Firefox using "xdotool" (available on Linux) to automate a monkey to do the job for us.

Oh I completely forgot about xdotool. Thanks for the suggestion :D.

nemo

Comment 10

•

5 years ago

Yeah, xdotool + xvfb has been the way to automate firefox since its existence, but proper printing support would allow plugging in firefox to existing PDF generating servers more easily, and in a cross-platform fashion.
And, yeah, would love to have that for the far-improved firefox table support. Chrome continues to mangle splitting table content across multiple print pages.

sputnick1124

Comment 11

•

5 years ago

Good tips on using xdotool! Unfortunately my use case is part of an automated CI pipeline with no X, but I'll definitely file this away for the future.

nemo

Comment 12

•

5 years ago

If your automated CI is running on linux you can do xvfb headless no problem.
For better performance you might have to set gfx.xrender.enabled;true - I know I've had to do that for things like ssh -YC or xrdp in the past. I haven't tested that recently however with xvfb.
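One way to apply a pref like this without touching a desktop profile is to write it into a user.js file in the profile directory that the automated instance uses. A minimal sketch: `set_user_pref` is a hypothetical helper name, and whether gfx.xrender.enabled still exists in current Firefox builds is an assumption that has not been checked here.

```shell
# Append one preference to PROFILE_DIR/user.js, which Firefox reads at startup.
# VALUE is written verbatim, so pass true/false/numbers bare and quote strings
# yourself. Usage: set_user_pref PROFILE_DIR NAME VALUE
set_user_pref() {
  local profile_dir name value
  profile_dir=$1
  name=$2
  value=$3
  mkdir -p -- "$profile_dir" || return 1
  printf 'user_pref("%s", %s);\n' "$name" "$value" >> "$profile_dir/user.js"
}
```

For example, `set_user_pref /tmp/fx-profile gfx.xrender.enabled true` followed by `xvfb-run -a firefox --profile /tmp/fx-profile ...` would start Firefox with that pref set.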

BMO Automation

Updated

•

3 years ago

Severity: normal → S3

BugBot [:suhaib / :marco/ :calixte]

Comment 13

•

3 years ago

The severity field for this bug is relatively low, S3. However, the bug has 11 votes.
:Amir, could you consider increasing the bug severity?

For more information, please visit auto_nag documentation.

Flags: needinfo?(ahabibi)

BugBot (nomail) [:suhaib / :marco/ :calixte]

Comment 14

•

3 years ago

The last needinfo from me was triggered in error by recent activity on the bug. I'm clearing the needinfo since this is a very old bug and I don't know if it's still relevant.

Flags: needinfo?(ahabibi)

allanforms

Comment 15

•

3 years ago

(In reply to Release mgmt bot (nomail) [:suhaib / :marco/ :calixte] from comment #14)

The last needinfo from me was triggered in error by recent activity on the bug. I'm clearing the needinfo since this is a very old bug and I don't know if it's still relevant.

I really believe it is very relevant!
I've been looking for this for a very long time. I'm relying on wkhtmltopdf for now, but I'd love to use Firefox's rendering skills (what I need is to 'print' or render on the user side, so maybe a JavaScript option could solve my problem too).

Richard Neill

Comment 16

•

3 years ago

Yes, it would definitely be good to get this fixed - because most of the other tools have now become unmaintained.

SlimerJS, PhantomJS, CasperJS are all unmaintained.
Wkhtmltopdf is very outdated, advises "Do not use wkhtmltopdf with any untrusted HTML", and chokes on many common JS functions that are part of ECMAScript 6.
Thanks :-)

Arne Brasseur

Comment 17

•

2 years ago

Using a headless browser to generate PDFs has been common practice for over a decade, since browsers solve the hard part of handling PDFs: declarative layout. Firefox has better paged media support than Chrome, where any improvements to CSS for paged media have been blocked on switching to a new layout engine, which only landed in recent weeks (for print at least). Firefox could be a more attractive option for this use case, but the lack of an easy CLI interface makes this needlessly hard.

mail

Comment 18

•

2 years ago

This would be a very useful addition. At the moment, to programmatically generate PDF from HTML/CSS/JS with support for paged media CSS rules, headless Chrome CLI is basically the only option, as others have noted. Doing this in Firefox with a simple CLI option would open up a lot of use cases, and provide a much-needed (and possibly already better) alternative for a common use case.

Richard Neill

Comment 19

•

2 years ago

With the Firefox CLI, it would be amazing if PDF printing could always run cleanly in its own process (i.e. don't share, modify, or lock against the user's Firefox profile, even if invoked again before the previous job has finished). Please also ensure that it works OK when packaged as a snap (unfortunately, snap packaging breaks everything in the common-for-automation situation where the user is a daemon and does not have a home directory under /home). Finally, please make sure we can define the screen size/page size. Thanks!

Please also consider one other use-case: we may want to trigger PDF generation, not to get the PDF, but because it makes Firefox download the URL and execute the JS within it. This is useful for testing some applications.

Btw, as a workaround at present, I have to run this, which works acceptably well (or at least did, until snap broke everything).

mkdir -p /tmp/firefox_tmp_home
export HOME=/tmp/firefox_tmp_home
timeout 60 xvfb-run -a -s "-screen 0 1280x1024x24" flock -w 50 /tmp/firefox_tmp_home firefox --headless --no-remote --window-size 1448,1024 --screenshot PDF_NAME THE_URL

The temporary home is so that it doesn't interoperate with my desktop browser.
flock is so that multiple instances of the screen-shotting script don't fight, but are forced to serialise.
The xvfb-run is needed when your script is not running under X (and doesn't somehow inherit an X-environment).
Neither the screen size nor the --window-size really work well enough to correctly define the output pdf's dimensions.
Timeout forces this to exit if it gets stuck.

HTH - Thanks.

jman

Comment 20

•

2 years ago

(In reply to Richard Neill from comment #19)

mkdir -p /tmp/firefox_tmp_home
export HOME=/tmp/firefox_tmp_home
timeout 60 xvfb-run -a -s "-screen 0 1280x1024x24" flock -w 50 /tmp/firefox_tmp_home firefox --headless --no-remote --window-size 1448,1024 --screenshot PDF_NAME THE_URL

Would a temporary Firefox profile directory make it slightly cleaner? Example:

PROFILEDIR=$(mktemp -p ~/tmp -d tmp-fx-profile.XXXXXX.d)
timeout 60 xvfb-run firefox --profile "$PROFILEDIR" ... (other params)
rm -rf "$PROFILEDIR"
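The same idea can be wrapped in a function so that the profile directory is removed even when the Firefox command fails or times out. This is only a sketch: `with_temp_profile` and `render_page` are hypothetical names, and the Firefox flags are the ones already used in this thread.

```shell
# Run CALLBACK with a fresh, throwaway profile directory as its first argument,
# then delete the directory whether or not CALLBACK succeeded.
# Usage: with_temp_profile CALLBACK [ARGS...]
with_temp_profile() {
  local callback profile_dir status
  callback=$1
  shift
  profile_dir=$(mktemp -d "${TMPDIR:-/tmp}/tmp-fx-profile.XXXXXX") || return 1
  status=0
  "$callback" "$profile_dir" "$@" || status=$?
  rm -rf -- "$profile_dir"
  return "$status"
}

# Hypothetical callback: $1 is the profile directory, $2 the URL to load.
render_page() {
  timeout 60 xvfb-run -a firefox --profile "$1" --headless --no-remote \
    --screenshot page.png "$2"
}
```

It would be called as `with_temp_profile render_page https://example.com/`. The exit status of the callback is passed through, so a CI job can still detect a failed or timed-out run.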

Arne Brasseur

Comment 21

•

2 years ago

Screenshotting is not a real solution, the resulting PDF won't have selectable text or clickable links.

nemo

Comment 22

•

2 years ago

Screenshotting is not a solution, however you can use xvfb to puppet the firefox pdf print dialog.
It just makes the process a little more complicated. A dedicated way without fake X sessions would undoubtedly be easier/more reliable/faster.

jmkrnet

Comment 23

•

2 years ago

Attached file FIREFOX-HTML-TO-PDF.sh

Here is a simplified version of (X)HTML->PDF conversion from my BASH script for converting formats that uses FIREFOX via XDOTOOL. The script uses a directory you provide as first CLI argument and recursively converts all (X)HTML files to PDF files. Note that PDF files are not saved next to (X)HTML files, but are saved to the last used save directory depending on your FIREFOX configuration. Using XDOTOOL to edit the save location is probably possible, but I preferred to keep the script simple instead. The script uses EXO-OPEN to start FIREFOX, but you can easily adapt the launch command if you do not use XFCE.

If CLI option for printing is implemented by FIREFOX I would appreciate it as it would greatly simplify the (X)HTML->PDF conversion.

mathis.gauthey

Comment 24

•

2 years ago

I used Puppeteer to make it work. I'd recommend the following code: with Puppeteer's firefox product, you need to use page.setContent rather than page.goto to make it work.

const puppeteer = require("puppeteer");
const path = require("path");
const fs = require("fs");
const filePath = path.resolve(__dirname, "index.html");

(async () => {
  try {
    const puppeteerVersion = require('puppeteer/package.json').version;
    console.log(`Using Puppeteer version ${puppeteerVersion}`);
    console.log(`This is the path: ${filePath}`)

    const browser = await puppeteer.launch({
      product: 'firefox',
      headless: true, // Use true to run headless, not 'new'
      dumpio: false // Set to true to pipe the browser's stdout/stderr into this process
    });
    const page = await browser.newPage();
    // Load the HTML from disk and inject it, instead of navigating to a file: URL
    const contentHtml = fs.readFileSync(filePath, "utf8");
    await page.setContent(contentHtml);
    // await page.goto(`file:${filePath}`);
    await page.pdf({
      path: "output.pdf",
      format: "A4",
      displayHeaderFooter: false,
      margin: { top: 0, right: 0, bottom: 0, left: 0 },
      // preferCSSPageSize: true,
      // printBackground: false, // Change to true if you want to include background
    });
    await browser.close();
    console.log("✅ PDF built");
  } catch (error) {
    console.error("❌ Error building PDF:", error);
    process.exit(1);
  }
})();

I'm using WSL2 Ubuntu latest version.

Distributor ID: Ubuntu
Description:    Ubuntu 22.04.2 LTS
Release:        22.04
Codename:       jammy

To reproduce my exact setup:

Install WSL with wsl --install to get the latest stable Ubuntu version
Install nvm with curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.5/install.sh | bash
Install Node.js and npm with nvm install --lts
Install resume-cli with npm install -g resume-cli
Clone my resume repo using git
Install Puppeteer with PUPPETEER_PRODUCT=firefox npm install puppeteer

nemo

Comment 25

•

5 months ago

•

Edited

Another reason this is useful is that Chrome has a data URI length limit that Firefox does not. (This is besides Firefox doing better generation at times: today we hit yet another case of Chrome's print-to-PDF producing tables that are unselectable and unusable in PDF editors, which reminded me of this bug.)
If Firefox had a print-to-PDF option on the command line like Chrome does, it would allow headless Firefox PDF generation using data URIs (no temp files), which can be convenient and cleaner if you have the RAM (and a Linux kernel from 2.6.23 onward, which derives the argument length limit from the stack size, which can be set with ulimit -s ☺ )

https://stackoverflow.com/questions/74218933/chromium-headless-pdf-generation-in-java-using-string-instead-of-temp-file-pag
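To make the data URI idea concrete, here is a minimal sketch of building one from an HTML string and checking that it would fit on a command line. `html_to_data_uri` and `fits_single_arg` are hypothetical helper names. The 131072-byte figure is an assumption based on Linux capping each individual argument at 32 pages (with 4 KiB pages), a cap that, as far as I know, applies separately from the total that scales with ulimit -s.

```shell
# Encode an HTML string as a base64 data URI. Newlines are stripped with tr
# because some base64 implementations wrap their output.
# Usage: html_to_data_uri HTML
html_to_data_uri() {
  printf 'data:text/html;base64,%s' "$(printf '%s' "$1" | base64 | tr -d '\n')"
}

# Succeed only if the string fits in a single argv entry, assuming a
# per-argument cap of 131072 bytes including the terminating NUL.
# The data URI is pure ASCII, so character count equals byte count.
# Usage: fits_single_arg STRING
fits_single_arg() {
  [ "${#1}" -lt 131072 ]
}
```

With Chrome's existing flag this could be used as `uri=$(html_to_data_uri "$html"); fits_single_arg "$uri" && chrome --headless --disable-gpu --print-to-pdf "$uri"`, falling back to a temp file when the check fails.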