Open Bug 1426775 Opened 6 years ago Updated 2 years ago

Saving reader mode pages as "Web page, html" omits all useful text

Categories

(Toolkit :: Reader Mode, defect, P3)

52 Branch

People

(Reporter: erwinm, Unassigned)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:52.0) Gecko/20100101 Firefox/52.0
Build ID: 20171206101620

Steps to reproduce:

I find it helpful to be able to read pages in Reader View, to read them offline or on other devices, and to read them as epub or mobi bundles containing several articles in one file.


Actual results:

Of the built-in export modes:

"Web page, complete" can export a readable page, but involves a lot of files.

"Web page, HTML only" can only export a blank page.

"Text files" can export the text, but loses paragraph breaks.

"All files" can only export a blank page.

Printing as PDF can export the whole page in one file, but the print will be too small to read on a device with a small screen, so it may require further conversion.

Of other export tools:

GrabMyBooks can export Reader View pages to epub, but it uses XUL, and doesn't work with 57+.

EPUBPress and others can't access or export Reader View pages, because WebExtensions aren't allowed access, and WebExtensions are the only extension type that works on 57+.


Expected results:

There's an ongoing discussion of whether Web Extensions should be allowed to access Reader View pages; if they can, then extensions can solve the export problem:

https://bugzilla.mozilla.org/show_bug.cgi?id=1390035

https://bugzilla.mozilla.org/show_bug.cgi?id=1371786

If they still can't, then maybe Firefox itself should include adequate export tools.
You might want to take a look at Pocket...
Flags: needinfo?(erwinm)
I could never get Pocket to work. What info did you need?
Flags: needinfo?(erwinm)
At the time, using Pocket required programming "recipes" for Calibre to import from Pocket. I could never get the recipes to work, because Pocket would only export pieces it decided were articles, and it decided that the ones I was trying to export weren't worthy. As of now, it looks like using Pocket still requires programming "recipes": I can't log files in Calibre unless I have "recipes" to import from Pocket, and I can't transfer files to my e-readers unless I transfer through Calibre or through other third-party tools.
Component: Untriaged → Reader Mode
Product: Firefox → Toolkit
(In reply to MarjaE from comment #0)
> "Web page, complete" can export a readable page, but involves a lot of files.

I'm not really sure what "a lot of files" means here. What do we export and why is that surprising / problematic?

> "Web page, html" only can only export a blank page.

This seems buggy, so I'll morph for this, I guess. Ideally we should fix this.

> "Text files" can export the text, but loses paragraph breaks.

This seems buggy but a separate bug unrelated to reader mode. Though it might depend on the specific article / page you're rendering in reader mode.

> "All files" can only export a blank page.

This will use one of the other modes, presumably (based on the other reports) "Web page, HTML".

> There's an ongoing discussion of whether Web Extensions should be allowed to
> access Reader View pages; if they can, then extensions can solve the export
> problem:
> 
> https://bugzilla.mozilla.org/show_bug.cgi?id=1390035
> 
> https://bugzilla.mozilla.org/show_bug.cgi?id=1371786
> 
> If they still can't, then maybe Firefox itself should include adequate
> export tools.

Bug 1371786 covers WebExtension access to reader mode. It's blocked on security improvements that we have no resources to execute on in the foreseeable future.

"Adequate export tools" specific to reader mode isn't something that we will work on. People are moving more towards solutions that involve databases or organized storage, either locally or in the cloud, rather than saving individual pages as HTML. It's not likely to be useful to create more ways of saving things specifically for reader mode.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P3
Summary: Buggy Export from Reader View → Saving reader mode pages as "Web page, html" omits all useful text
> People are moving more towards solutions that involve databases or organized storage, either locally or in the cloud, rather than saving individual pages as HTML. It's not likely to be useful to create more ways of saving things specifically for reader mode.

I don't understand. I know it's nice to be able to save pages as epubs, but 57 breaks a lot of the older tools, and doesn't allow the newer ones to access reader mode. So Firefox is moving away from allowing users to save pages in that particular organized way.

(In reply to MarjaE from comment #0)

Hello.
Did you test with versions older than 52?
I tested Firefox 48, 49 and 50, and I can confirm the issue in all of them!
Me too: I would like to be able to activate reader mode and save my files as HTML only.

(In reply to :Gijs (he/him) from comment #4)

"Web page, html" only can only export a blank page.

I understand it!
It means that if we save the complete web page, we get multiple files in a folder, including CSS, JavaScript and many more.
But "HTML only" saves just one file for us, so we can avoid multiple files!

See Also: → 1651728

Moot, since Reader View is no longer accessible.

Severity: normal → S3