665750 - Export a subset of pages for offline reading

Reporter

Description

•

14 years ago

Export a subset of pages (user-selected or structure-based) for offline reading (e.g., PDF or zipped HTML). Might be able to use code from Objavi2 (http://objavi.flossmanuals.net).

John Karahalis [:openjck]

Reporter

Updated

•

14 years ago

Version: unspecified → Kuma

Luke Crouch [:groovecoder]

Updated

•

14 years ago

Assignee: lcrouch → nobody

John Karahalis [:openjck]

Reporter

Updated

•

14 years ago

Target Milestone: 1.0 alpha → ---

Jay Patel [:jay]

Updated

•

14 years ago

Priority: -- → P2

John Karahalis [:openjck]

Reporter

Comment 1

•

14 years ago

This is a pretty popular request on UserVoice. Only two individual requests were made, but these requests have some of the highest vote totals out of everything submitted. Please see UserVoice for specifics on this request:
http://mdn.uservoice.com/forums/51389-mdn-website-feedback-http-developer-mozilla-org/suggestions/1390125-mdn-documentation-available-for-offline-reading?ref=title
http://mdn.uservoice.com/forums/51389-mdn-website-feedback-http-developer-mozilla-org/suggestions/725068-downloadable-snapshot-zip-archive-of-some-sectio?ref=title

John Karahalis [:openjck]

Reporter

Updated

•

14 years ago

Summary: Kuma: Export a subset of pages for offline reading → Export a subset of pages for offline reading

Whiteboard: [u: user] [c: wiki] → u=user c=wiki p=

John Karahalis [:openjck]

Reporter

Updated

•

13 years ago

Status: NEW → RESOLVED

Closed: 13 years ago

Resolution: --- → DUPLICATE

Jay Patel [:jay]

Comment 3

•

13 years ago

reopening. this bug is actually quite different than the other one. in this bug, we want to make it possible for a user to select a document or a set of documents (perhaps all the docs that show up in their search results or filtered view) and export them to .pdf or .html for viewing offline. the other bug was more about getting a raw dump of docs data so someone could run a mirror of the wiki pages somewhere else (online).

Status: RESOLVED → REOPENED

Resolution: DUPLICATE → ---

John Karahalis [:openjck]

Reporter

Comment 4

•

13 years ago

Good catch. There's some overlap:
* "I often want to work offline (notebook, in a cafe)"
* "I'd like a dump of MDC (of certain sections or all) to store on my computer, so that I have the documentation locally"
* etc.
If nothing else, it might be helpful to keep this person's thoughts in mind.

Ben Bucksch (:BenB)

Updated

•

13 years ago

Status: REOPENED → NEW

John Karahalis [:openjck]

Reporter

Comment 5

•

13 years ago

Sheppy talked about this in his user interview. Sheppy explained that users should be able to take an entire subsection of the site and dump it out as a PDF. He pointed to "The JavaScript Reference" as being one example of a subsection. He explained that users would be interested in this because they could have the content locally (for example on their laptops and iPads), which might be useful if they are flying.
Sheppy: Two additional questions on this.
1. Can you provide a few more examples of subsections?
2. Can you provide a few more use cases for this? Other than flying, when might MDN readers want this feature?

Whiteboard: u=user c=wiki p= → [user-interview] u=user c=wiki p=

Ben Bucksch (:BenB)

Comment 6

•

13 years ago

1) when they need it all the time and it's too slow to load it from the server all the time. Pageload of 1s adds up, if you do that a lot.
2) when the server is down
3) when their internet connection doesn't work
4) when they are on the beach or sitting in the park (and don't have 3G flatrate)
and approximately 245 other situations.

Janet Swisher

Comment 7

•

13 years ago

Additional comment on UserVoice:
https://mdn.uservoice.com/admin/forums/51389-mdn-website-feedback-http-developer-mozilla-org/suggestions/725068-downloadable-snapshot-zip-archive-of-some-sectio?tracking_code=703bd1aca4b2722a9d0032adbcf216f4#/comments
Claus Reinke, on Oct 10, 2011: "How about adding HTML offline cache manifests to the top-level pages (Javascript, DOM, HTML, ..)?"

Luke Crouch [:groovecoder]

Comment 8

•

13 years ago

Cache manifests are a good way to do it, and it looks like MindTouch has a service for it: http://www.mindtouch.com/blog/2010/08/30/html-export-import/

ali spivak

Updated

•

13 years ago

Priority: P2 → P3

Andreas Eibach

Comment 11

•

13 years ago

Whoops, Janet's URL requires login. This one lets you read the article without logging in: http://mdn.uservoice.com/forums/51389-mdn-website-feedback-http-developer-mozilla-org/suggestions/725068-downloadable-snapshot-zip-archive-of-some-sectio (Mind that "admin" in the URL! ;))

David Bruant

Updated

•

13 years ago

Blocks: 756266

Nobody; OK to take it and work on it

Assignee

Updated

•

12 years ago

Version: Kuma → unspecified

Nobody; OK to take it and work on it

Assignee

Updated

•

12 years ago

Component: Website → Landing pages

John Karahalis [:openjck]

Reporter

Comment 12

•

12 years ago

This feature is still a little ways down from top priority, but I wanted to capture this idea for if/when we do take it on. In bug 809514, Sheppy mentioned that it would be nice to generate an offline collection of all pages with a certain tag.

John Karahalis [:openjck]

Reporter

Updated

•

12 years ago

No longer blocks: 756266

Eric Shepherd [:sheppy]

Comment 13

•

12 years ago

Ideally we would offer a few ways to assemble this content:
(1) "This page and all of its subpages"
(2) "All pages that match this set of tags"
And ideally we would offer a few kinds of output:
(1) HTML for reading in-browser
(2) PDF
(3) ePub
I've listed those in priority order, I believe. I personally would love ePub since that'd work best in my e-reader of choice, but I know that it's the least broadly usable. If we can find a system that supports exporting to a variety of formats, that'd be fantastic.
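The two ways of assembling content listed in the comment above could be sketched roughly as follows. This is only an illustration: the `Page` record, its fields, and both function names are hypothetical and are not Kuma's data model, and "match this set of tags" is assumed here to mean that a page carries every requested tag.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Page:
    """Hypothetical page record; not Kuma's actual data model."""
    path: str                                   # e.g. "/docs/Web/JavaScript"
    tags: frozenset = field(default_factory=frozenset)


def page_and_subpages(pages, root):
    """Select the page at `root` plus every page nested below it."""
    root = root.rstrip("/")
    # Compare against root + "/" so "/docs/Web/JavaScriptFoo" is not
    # mistaken for a subpage of "/docs/Web/JavaScript".
    return [p for p in pages if p.path == root or p.path.startswith(root + "/")]


def pages_matching_tags(pages, wanted):
    """Select pages that carry every tag in `wanted` (assumed semantics)."""
    wanted = frozenset(wanted)
    return [p for p in pages if wanted <= p.tags]


# Made-up sample data for illustration only.
pages = [
    Page("/docs/Web/JavaScript", frozenset({"JavaScript"})),
    Page("/docs/Web/JavaScript/Reference", frozenset({"JavaScript", "Reference"})),
    Page("/docs/Web/JavaScriptFoo", frozenset({"Misc"})),
    Page("/docs/Web/CSS", frozenset({"CSS", "Reference"})),
]

subtree = page_and_subpages(pages, "/docs/Web/JavaScript")
tagged = pages_matching_tags(pages, {"Reference"})
```

Either selection yields a plain list of pages, so the same list could then be handed to whichever exporter (HTML, PDF, ePub) is available.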

Luke Crouch [:groovecoder]

Comment 14

•

12 years ago

The SUMO team has an intern working on this for the Firefox support docs. We should see if we can leverage their work.

John Karahalis [:openjck]

Reporter

Updated

•

12 years ago

Whiteboard: [user-interview] u=user c=wiki p= → [user-interview]

Luke Crouch [:groovecoder]

Updated

•

12 years ago

Depends on: 883967

Eric Shepherd [:sheppy]

Comment 15

•

12 years ago

How long has the little hard-to-see link about reading docs offline been there on MDN? I just noticed it for the first time. :)

David Walsh :davidwalsh

Comment 16

•

12 years ago

Was just pushed today -- it isn't meant to get overwhelming amounts of attention, but it is in a prominent position.

Luke Crouch [:groovecoder]

Comment 17

•

11 years ago

Per https://bugzilla.mozilla.org/show_bug.cgi?id=883967#c10 this is a low-priority bug.

John Karahalis [:openjck]

Reporter

Comment 18

•

11 years ago

After looking through the data (see [1]) we have decided not to build this feature into MDN. Other options (like DocHub and Dash) exist for those still interested in this -- please contact me if you need any assistance setting those up.
[1] https://bugzilla.mozilla.org/show_bug.cgi?id=883967#c10

Status: NEW → RESOLVED

Closed: 13 years ago → 11 years ago

Resolution: --- → WONTFIX

Ben Bucksch (:BenB)

Comment 19

•

11 years ago

I looked at DocHub and Dash, and both seem to provide certain API subsets only, is that correct? Furthermore, I need to install local software, correct? Does Dash even exist for Linux? It looks like it is Mac only. DocHub seems to provide only CSS and HTML APIs, but not any of the rest of MDC. I'm not interested in any third party services.
Sorry, but my needs are not filled. I need a full copy of (one language version of) the site, same information and no parts cut out, in a way that I can use offline, without extra software to be installed, only the browser to use it. Essentially, I need a tarball with HTML files that are browsable and self-contained.
From bug 883967 comment 10:
> * 0 new comments on bug 665750
> this feature is not in high demand. Safe to ignore bug 665750
I find that quite offensive. We are not supposed to make noise on bugzilla, but just wait. I've been patiently *waiting* for years. And now that's counted against us.
I've put quite some work into this bug 561470 comment 31 (which is essentially this bug), manually set up a mirror at http://mdn.beonex.com , which cost a lot of time to set up, but gets more and more outdated due to lack of a feed from you.
I need MDN for my work. I want response times in the ms range and get 2-3s per page load. Sometimes, it's down. Sometimes, I don't have Internet. MDC is crucial.
Both this bug here and bug 561470 are important for classical open source values: You cannot be an open source project, ask the community to contribute to the documentation, and then keep all these docs for yourself, usable only on your website. The information must be free and copyable for everybody, both in source code form (bug 561470) and in resulting form (this bug). This is highly important for very basic open source reasons and preservation of information.

Status: RESOLVED → REOPENED

Resolution: WONTFIX → ---

Francois Guerraz

Comment 20

•

11 years ago

(In reply to Ben Bucksch (:BenB) from comment #19) I couldn't agree more with the previous comment. I would just add that *there are* places on the planet with very bad or expensive internet connection, they're usually both bad and expensive btw. I voted for this bug when I was working in a research station in Antarctica. I set up local mirrors for all the documentation and software we were using there; all but one! PHP, python, django, etc. no problem whatsoever. MDN, despite being hugely valuable, was a nightmare to mirror and I gave up. And think about all the developing countries! And people travelling, etc. Ignoring this bug is really a disgrace.

Luke Crouch [:groovecoder]

Comment 21

•

11 years ago

We know and appreciate how much work it takes to export MDN content. As mentioned in https://bugzilla.mozilla.org/show_bug.cgi?id=561470#c38, Mozilla WebOps provides a weekly snapshot of MDN at https://developer.mozilla.org/media/developer.mozilla.org.tar.gz via the mechanism implemented in bug 757461. Thank you for helping us with it! WONTFIX'ing this bug is inappropriate, but the reality is that 99.9% of MDN users aren't interested enough in the feature to click a link, much less to participate in defining, discussing, or prioritizing the feature. Without that collaboration, we can't devote company resources to it yet.

John Karahalis [:openjck]

Reporter

Comment 22

•

11 years ago

> I've put quite some work into this bug 561470 comment 31 (which is
> essentially this bug), manually set up a mirror at http://mdn.beonex.com ,
> which cost a lot of time to set up, but gets more and more outdated due to
> lack of a feed from you.
We appreciate the work you are doing with mdn.beonex.com, and this decision is in no way meant to show anything else. We have the Kuma API for third-party projects that want to programmatically re-use our documentation. We can make improvements to the API if warranted.
https://developer.mozilla.org/docs/Project:MDN/Tools/Kuma_API
> Both this bug here and bug 561470 are important for classical open source
> values: You cannot be an open source project, ask the community to
> contribute to the documentation, and then keep all these docs for yourself,
> usable only on your website. The information must be free and copyable for
> everybody, both in source code form (bug 561470) and in resulting form (this
> bug). This is highly important for very basic open source reasons and
> preservation of information.
I understand your interest in offline copies of our content, but this is not a valid criticism. Our content is licensed CC-BY-SA and we have the Kuma API as a convenience for those interested in reusing it. I understand that the API may not be ideal for your needs, but I do not agree that this puts us at odds with free software values.
We all have the same goal in mind: offer a wonderful, helpful reference for web developers. Like any team, we have limited resources and need to carefully choose efforts that maximize this goal. While I understand that you would find this feature to be valuable, our research reveals that relatively few users feel the same way, and that our effort may be better spent elsewhere.
I will leave this open for now, but will ask Ali (our product manager) to make the final call.

Ben Bucksch (:BenB)

Comment 23

•

11 years ago

> Mozilla WebOps provides a weekly snapshot of MDN at https://developer.mozilla.org/media/developer.mozilla.org.tar.gz
I'm linking to this at https://developer.mozilla.org/en-US/docs/Project:MDN/About#Download_content , which is where the license in the footer points at.

Luke Crouch [:groovecoder]

Comment 24

•

11 years ago

Good idea. I've re-enabled the offline content notice but it sounds like Dash and dochub may not serve everyone's needs. We'll leave this bug open for when we have time and/or collaboration to create a better alternative.

Bogdan Popescu (Kapeli)

Comment 25

•

11 years ago

Dash (maybe) covers the needs for OS X users. For Windows and Linux users, I think Zeal (http://zealdocs.org/) is a suitable alternative. Zeal and Dash use the same doc format and Zeal users have access to all of the docs available in Dash, which includes MDN. Unfortunately, Zeal does not yet have a download manager, so users have to manually download the docs they want from http://kapeli.com/docset_links The DocHub project is a bit abandoned as far as I can tell, or at least the scrapers they use to fetch the docs from MDN don't work anymore (as per https://github.com/rgarcia/dochub/issues/37).

Luke Crouch [:groovecoder]

Comment 26

•

11 years ago

That's interesting. We should add Zeal to our offline content notice.

Bogdan Popescu (Kapeli)

Comment 27

•

11 years ago

In other news, I've received 76 unique visitors to http://kapeli.com/dash?ref=mdn and 102 pageviews in less than 24 hours, so I think it's safe to assume that the tracking results at https://bugzilla.mozilla.org/show_bug.cgi?id=883967#c10 are wrong.

Ben Bucksch (:BenB)

Comment 28

•

11 years ago

Bogdan, thanks for this nice list on Kapeli!

Ben Bucksch (:BenB)

Comment 29

•

11 years ago

phew... I just noticed: Within a good year, I got 2GB of access/error.log on http://mdn.beonex.com. Seems there's demand for mirrors.

vlenceleuth

Comment 30

•

11 years ago

Yo, can someone post up an offline version as pdf

John Karahalis [:openjck]

Reporter

Comment 31

•

11 years ago

(In reply to vlenceleuth from comment #30)
> Yo, can someone post up an offline version as pdf
Quoting Luke from comment 21.
> Mozilla WebOps provides a weekly snapshot of MDN at
> https://developer.mozilla.org/media/developer.mozilla.org.tar.gz via the
> mechanism implemented in bug 757461. Thank you for helping us with it!

Daniel Zorro

Comment 32

•

11 years ago

MDN Web technology references such as WebGL and WebAPI are NOT listed in the Dash and Zeal apps. We need those too; in fact, a docset covering all of MDN (reference and developer documentation) would be nice.

Thibaut Courouble

Comment 33

•

11 years ago

DevDocs has been open sourced and allows for offline reading of most of MDN's content: https://github.com/Thibaut/devdocs

Luke Crouch [:groovecoder]

Comment 34

•

11 years ago

thibaut: very nice. Does devdocs scrape WebGL and/or WebAPI docs as Daniel suggests? Can it?

Thibaut Courouble

Comment 35

•

11 years ago

Luke: No. At the moment DevDocs only scrapes API/reference pages in /Web/HTML, /Web/CSS, /Web/JavaScript, /Web/Reference/Events, and /Web/API. It can certainly scrape more stuff but for now I've decided to skip the guides (difficult to index) and Firefox OS pages (still experimental and I'd rather separate them into their own doc set). I'm going to write documentation in the coming days on how to extend/contribute to DevDocs.
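The path-based scoping described in the comment above can be illustrated with a small URL filter. This is a sketch only and is not DevDocs's actual code: the prefixes are copied from the comment, while the function names and the handling of an assumed `/<locale>/docs` URL prefix are hypothetical.

```python
from urllib.parse import urlparse

# Prefixes taken from the comment above.
SCRAPED_PREFIXES = (
    "/Web/HTML",
    "/Web/CSS",
    "/Web/JavaScript",
    "/Web/Reference/Events",
    "/Web/API",
)


def doc_path(url):
    """Strip an assumed "/<locale>/docs" prefix from the URL path.

    For example "/en-US/docs/Web/CSS/color" becomes "/Web/CSS/color".
    """
    parts = urlparse(url).path.split("/")
    # parts looks like ["", "en-US", "docs", "Web", "CSS", ...]
    if len(parts) > 2 and parts[2] == "docs":
        parts = [""] + parts[3:]
    return "/".join(parts)


def is_scraped(url):
    """Return True if the page falls under one of the listed prefixes."""
    path = doc_path(url)
    return any(path == p or path.startswith(p + "/") for p in SCRAPED_PREFIXES)


in_scope = is_scraped("https://developer.mozilla.org/en-US/docs/Web/CSS/color")
guide = is_scraped("https://developer.mozilla.org/en-US/docs/Web/Guide/HTML")
```

Matching on whole path segments keeps a guide page such as `/Web/Guide/HTML` out of scope even though it mentions HTML.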

ilmostro7

Comment 36

•

11 years ago

MDN is definitely becoming "overstretched" in scope, especially now with FirefoxOS gaining more attention as well as docpages; speaking of which, I think the next step should DEFINITELY be an App with some Documentation on the marketplace, even if it's not referencing MDN synchronously at all times--references could be pulled during off-peak hours, once a working/viable method presents itself.

Susan Hu

Comment 37

•

11 years ago

There are too many APIs, I don't know when I can finish reading all.

merlin510dm

Comment 38

•

11 years ago

I need to talk to someone directly on the phone. Our numbers are: (909) 629-2820, (909) 343-9591, or (909) 343-(9434. Our computer systems and cellphones were first attached in early August 2013, and it has been a constant battle until a week ago. When they first hit, I battled them hard, and it cost me two desktops systems and 3 laptops. I learned that I could not beat them head on, no one can beat an entity that has the latest cutting edge equipment, software, and teams of people to work at you 24 hours a day. So, our home became a place of extreme duress, for we were under constant monitoring and surveillance in our home. We were tracked via our phones when ever we went our. They toyed with us on line, and manipulated of our computers to the point that we are unable to complete a single letter or document. Up until about 5 days ago, our computers, phones, and any devices that can connect in any possible way to the internet have been under the control of a major player in the software industry. I know that sounds crazy, but we have absolute proof, and that was what finally made them leave. When they were still in control, I begin working on the Aurora Beta Browser as a desperate attempt to create a safe haven with enough tools to fight the changes that were made in all our communication devices. (Said "entities" seem to have great interest in this Browser, and have been trying to take it apart on a daily basis.) As crazy as it sounds, we have proof of all that has transpired. We have held our ground, however, they have left our systems a mess, and left behind "something" that is still functioning. I was fired from my job a month ago when they followed me to work and trashed my boss's computer systems. I have applied for work, only to find our emails misdirected, phones don't ring when called, we have been threatened, and intimidated for leaving the apartment without our phones. Our HP WiFi printer was turned into a communication hub. 
My wife and I have had enough! So if anyone can help, we need it.

Flags: needinfo?(nobody)

John Karahalis [:openjck]

Reporter

Updated

•

11 years ago

Whiteboard: [user-interview] → [user-interview][th]

John Karahalis [:openjck]

Reporter

Comment 39

•

11 years ago

Merlin, This is not the right place to discuss your problem. Please share your feedback below instead. If you continue to update this page your account will be banned. https://input.mozilla.org/en-US/feedback

Flags: needinfo?(nobody)

John Karahalis [:openjck]

Reporter

Updated

•

11 years ago

Whiteboard: [user-interview][th] → [user-interview][th][triaged][type:feature]

Māris Fogels [:mars] (please needinfo)

Updated

•

11 years ago

Severity: normal → enhancement

Tomislav Jovanovic :zombie

Comment 41

•

11 years ago

(In reply to Luke Crouch [:groovecoder] from bug 883967 comment #10)
> * 0 new comments on bug 665750
this is really bad reasoning. it encourages making "me too" comments in bugzilla, which i'm pretty sure we don't want.
anyway, we have this issue with the Addon SDK documentation ever since we moved them from our github repo to the MDN (see bug 1002307). the SDK docs are pretty self-contained, and don't depend too much on other parts of the MDN -- mainstream usage of our high-level modules doesn't require any mozilla-specific knowledge like XUL or XPConnect. so asking addon devs to download a whole 3 gigs mirror of the MDN site doesn't seem reasonable to me.

Luke Crouch [:groovecoder]

Comment 42

•

11 years ago

The reasoning is that we specifically asked visitors to comment on the bug, and no-one did. But, the experiment was done before Addon SDK docs were moved to MDN, so we could get a different result - i.e., more interest in the feature - now. Still, this is a large feature request. And there are some alternatives: http://kapeli.com/dash http://zealdocs.org/ It may be easier to ask these developers to add the Addon SDK zone to those tools.

Eric Shepherd [:sheppy]

Comment 44

•

11 years ago

We do still want this capability -- the ability to have MDN content for reading as an e-book, for example, has been a goal for years.

Andreas Eibach

Comment 45

•

11 years ago

Frankly, comment 38 reads like classic spam posts do.

Andreas Eibach

Comment 46

•

11 years ago

...Back on topic: Eric, it seems that the e-book option is one of the crucial goals that do not make bug 665750 a duplicate of bug 757461. Because to be honest, I was tempted to call this bug a duplicate too, since what we did in 757461 *was* effectively exporting a "subset of pages" (i. e. in a tarball). However, keeping the two bugs separate lets me assume the actual goals are different from each other in fact.

Ben Bucksch (:BenB)

Comment 47

•

11 years ago

Andreas, bug 757461 wasn't a subset, it was the whole thing. A "subset" is a subset that the user defines, not one for all.

Andreas Eibach

Comment 48

•

11 years ago

Ahh I see! Well, Ben, I think that's a matter of definition. :) To me it's already a "subset" once you subtract all the bulky video junk and maybe some php-generated pages. But now I see what you mean: 665750 may be aiming for single chapters of the docs, e. g. a standalone XUL set, or a standalone set about XPath (for the XUL pro who wants to learn about new ways of addressing/traversing DOM node trees) etc.

Daniele "Mte90" Scasciafratte

Comment 49

•

10 years ago

any news about this?

Luke Crouch [:groovecoder]

Comment 50

•

10 years ago

Nothing new. It's a massive project and we haven't prioritized it.

Clochix

Comment 51

•

10 years ago

Excuse me if I'm a little bit off topic, but I just found this bug and it reminded me of a project I quickly hacked together some months ago. CORS is available on MDN, so I use a client-side spider to browse the website, with some filtering on URLs, and locally save its content inside local storage. It's not perfect but I use it daily and it fits my needs. Here's the code of this project: https://github.com/clochix/MdnHub
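The general shape of such a spider (crawl, filter URLs, store each page) can be sketched as below. This is not MdnHub's code: it is written in Python rather than browser JavaScript, a plain dict stands in for the browser's local storage, the `fetch` and `keep` callables are hypothetical hooks supplied by the caller, and links are found with a crude regular expression instead of a real HTML parser.

```python
import re
from collections import deque
from urllib.parse import urldefrag, urljoin

# Crude link extraction; a real spider would use an HTML parser.
HREF_RE = re.compile(r'href="([^"]+)"')


def crawl(start_url, fetch, keep, limit=100):
    """Breadth-first crawl starting at start_url.

    fetch(url) -> HTML string (caller-supplied, so the sketch runs offline).
    keep(url)  -> True if the URL passes the filter.
    Returns a dict mapping each saved URL to its HTML; the dict stands in
    for the browser's local storage.
    """
    store = {}
    queue = deque([start_url])
    while queue and len(store) < limit:
        url = queue.popleft()
        if url in store or not keep(url):
            continue
        html = fetch(url)
        store[url] = html
        for href in HREF_RE.findall(html):
            # Resolve relative links and drop "#fragment" parts.
            link, _fragment = urldefrag(urljoin(url, href))
            if link not in store:
                queue.append(link)
    return store


# Tiny fake site so the example needs no network access.
SITE = {
    "https://example.org/docs/a": '<a href="/docs/b">b</a> <a href="/blog/x">x</a>',
    "https://example.org/docs/b": '<a href="a#top">a</a>',
    "https://example.org/blog/x": "not documentation",
}

saved = crawl(
    "https://example.org/docs/a",
    fetch=SITE.__getitem__,
    keep=lambda u: "/docs/" in u,
)
```

Passing `fetch` in as a parameter keeps the crawl logic independent of how pages are retrieved, so the same loop could be pointed at a real HTTP client.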

Szmozsánszky István [:flaki]

Comment 53

•

10 years ago

There is ongoing work stemming from this weekend's HackOnMDN to bring offline caching functionality to MDN via Service Workers. More on this can be found at https://wiki-sandbox.allizom.org/MozillaWiki:Hackonmdn_serviceworkers_offline_mdn - all thoughts and ideas welcome, please chime in!

Luke Crouch [:groovecoder]

Comment 54

•

10 years ago

Thanks for the update and the work, :flaki. If you like you can file more bugs under this one for the on-going work.

BMO Automation

Comment 57

•

5 years ago

MDN Web Docs' bug reporting has now moved to GitHub. From now on, please file content bugs at https://github.com/mdn/sprints/issues/ and platform bugs at https://github.com/mdn/kuma/issues/.

Status: REOPENED → RESOLVED

Closed: 11 years ago → 5 years ago

Resolution: --- → WONTFIX

BMO Automation

Updated

•

5 years ago

Product: developer.mozilla.org → developer.mozilla.org Graveyard