Open Bug 1237990 Opened 4 years ago Updated 3 years ago

Feature: Webpage ZIP as alternative to PDF

Categories

(Core :: DOM: Serializers, enhancement)

UNCONFIRMED

People

(Reporter: craig, Unassigned)

Details

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:43.0) Gecko/20100101 Firefox/43.0
Build ID: 20151208100201

Steps to reproduce:

If you want to send a document to someone else (probably over email), then your main choices are PDF or a Microsoft Word document.

Both of these file formats are problematic (if you aren't familiar with these problems, I can follow up with a very long rant, but I'll leave that for now).


Actual results:

N/A


Expected results:

I would like to have a single-file document that I can send to someone, which they can open directly in their web browser, just as they can with a PDF.

For example, I have a system that creates reports which need to be sent to someone for offline viewing / storage. This is currently done as a PDF file, which does not work well on a small screen (e.g. mobile phone) or with assistive technology (e.g. screen readers).

The ".docx" file format is basically a ZIP file that contains some XML files, and anything else that's needed (e.g. images).

Likewise we could create a very simple file format, maybe with a ".hdoc" extension (as in HTML Doc).

This would be a ZIP file (or a gzip-compressed tar archive, since gzip on its own only holds a single file) that contains an "index.html" and all the resources referenced by the HTML document.

So a developer could create the document on their computer, keeping everything within a single folder during creation. The folder is then zipped up and the extension changed.
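As a rough sketch of that packaging step (nothing here is an agreed format; the `pack_hdoc` name and the rule that "index.html" must sit at the top level are just my assumptions for illustration), this is about all a developer would need:

```python
import zipfile
from pathlib import Path


def pack_hdoc(folder, output):
    """Zip every file under `folder` into `output` (hypothetical .hdoc file).

    Assumes index.html lives at the top level of the folder, and that
    `output` is written somewhere outside `folder`.
    """
    folder = Path(folder)
    if not (folder / "index.html").is_file():
        raise ValueError("folder must contain an index.html")

    with zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(folder.rglob("*")):
            if path.is_file():
                # Store paths relative to the folder, with forward slashes,
                # so the HTML's relative links still resolve inside the ZIP.
                archive.write(path, path.relative_to(folder).as_posix())
    return output
```

Calling `pack_hdoc("report", "report.hdoc")` on a folder holding "index.html" and "images/a.png" would give an archive with exactly those two entries.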

Alternatively a non-developer could open a web page and go to "File > Save Page As" where this would be the default format (the current "Webpage, Complete" option uses a separate folder that is often lost)... or maybe their word processor could save to this format for them.

Once it is in this ZIP file, the browser shouldn't allow the document to be easily changed (normally the point is to have a non-easily-editable document for reading).

From a security point of view, when the browser loads one of these files, it can apply some additional restrictions:

- Cannot access files outside of the zip (e.g. /etc/passwd)
- Has its own local storage / cookie jar / etc.
- Cannot connect to another host (maybe).
- Cannot run JavaScript (yeah, I know, dream on).
- The file could include a password to restrict access to its contents.
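
To illustrate the first and third restrictions, here is a minimal sketch of how a loader might decide whether a reference in the HTML stays inside the archive. This is not how any browser does it; `resolve_in_archive` is a made-up helper, and treating every URL with a scheme or host as blocked is my assumption about the strictest version of the rule.

```python
import posixpath
from urllib.parse import urlsplit


def resolve_in_archive(reference, base="index.html"):
    """Return the archive member a reference points at, or None to block it.

    `base` is the archive path of the document containing the reference.
    """
    if "\\" in reference:
        return None  # be conservative about Windows-style paths
    parts = urlsplit(reference)
    if parts.scheme or parts.netloc:
        return None  # http://, file://, //host/... all leave the archive
    if parts.path.startswith("/"):
        return None  # absolute paths such as /etc/passwd
    if not parts.path:
        return base  # fragment-only or query-only link to the same document

    joined = posixpath.normpath(
        posixpath.join(posixpath.dirname(base), parts.path)
    )
    if joined == ".." or joined.startswith("../"):
        return None  # climbs out of the archive root
    return joined
```

With this, "images/logo.png" resolves to an archive member, while "/etc/passwd", "../../etc/passwd" and "http://example.com/a.js" all come back as `None`. The loader would still need to check that the returned name actually exists in the ZIP.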

Without restrictions like these, we can't currently just send a ".html" file to someone with all the resources included inline (at the very least, anti-virus software will probably bin it).

--------------------------------------------------

It looks like I'm not the only one who has thought of this:

http://stackoverflow.com/questions/260058/whats-the-best-file-format-for-saving-complete-web-pages-images-etc-in-a

http://hdoc.crzt.fr/www/co/hdoc.html

https://chrome.google.com/webstore/detail/pagearchiver/ihkkeoeinpbomhnpkmmkpggkaefincbn?hl=en
FYI: bug 40873.
Severity: normal → enhancement
Product: Firefox → Core
@YF: Maybe. I must admit I did forget to mention MHTML, which is based on the email format (RFC 2557), but...

1) It's not easy to create from a development point of view (special encoding).

2) I don't believe it applies any restrictions to the HTML/JS (e.g. which resources it can load from disk).

3) It does not compress the content.

4) It cannot easily include a password without an extra step, e.g. putting it into a ZIP.
If anyone considers that the component is not the right one, please change it to a more appropriate one.
Component: Untriaged → Serializers
I'm currently keeping some extra notes at:

https://github.com/craigfrancis/wdoc

Pull requests / comments welcome if you have any thoughts/ideas.