Open Bug 40253 Opened 25 years ago Updated 2 years ago

Execute the download of all links (like wget --recursive)

Categories

(Firefox :: File Handling, enhancement)

People

(Reporter: netdragon, Unassigned)

References

Details

(Keywords: helpwanted)

Attachments

(1 obsolete file)

I think the browser, on request, should be able to download all the pages in a certain site while the user goes and eats something. The question is: how will the browser know where one site ends and others begin? If the browser weren't given limits, it could download the WHOLE WEB! It might also download the same page more than once. Obviously, the user would have to limit (A) the number of pages downloaded, (B) how many levels of links to follow, (C) the domains that are allowed, or some combination of the three. You would also be able to stop the download at any time, and it would recover from broken pages.

Predownloaded pages would be stored in a special cache directory and could be copied to another part of the disk to keep. All images, etc. would be downloaded with the pages, so you could copy a whole site to the hard disk, with certain restrictions of course (e.g., you couldn't copy CGIs).

Another idea: someone could post a sitemap file on the site. The browser could open this file and download the pages in the order the sitemap lists them. The user could even view the sitemap and select which parts to download. The sitemap file would contain the data to construct a tree, and each node could have a name, description, size info, and URL.
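Purely to illustrate limits (A) through (C), here is a minimal JavaScript sketch; crawl, maxPages, maxDepth and allowedHosts are invented names for this example, not anything that exists in the tree, and a real implementation would write into the cache rather than keep pages in memory:

// Sketch only: breadth-first crawl from a start URL, honoring a page-count
// limit, a link-depth limit, and a list of allowed hosts.
async function crawl(startUrl, { maxPages = 50, maxDepth = 2, allowedHosts = [] } = {}) {
  const seen = new Set();                      // never fetch the same page twice
  const queue = [{ url: startUrl, depth: 0 }];
  const saved = [];

  while (queue.length > 0 && saved.length < maxPages) {        // limit (A)
    const { url, depth } = queue.shift();
    if (seen.has(url) || depth > maxDepth) continue;           // limit (B)
    if (allowedHosts.length > 0 &&
        !allowedHosts.includes(new URL(url).host)) continue;   // limit (C)
    seen.add(url);

    const html = await (await fetch(url)).text();
    saved.push({ url, html });

    // Queue every <a href> on the page one level deeper.
    const doc = new DOMParser().parseFromString(html, "text/html");
    for (const a of doc.querySelectorAll("a[href]")) {
      queue.push({ url: new URL(a.getAttribute("href"), url).href, depth: depth + 1 });
    }
  }
  return saved;
}

The seen set also takes care of the "download a site more than once" concern, and stopping is just a matter of abandoning the queue.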
*** Bug 40258 has been marked as a duplicate of this bug. ***
*** Bug 40254 has been marked as a duplicate of this bug. ***
*** Bug 40255 has been marked as a duplicate of this bug. ***
Sorry about the duplicates; each time I thought the report hadn't been sent.
Yes, I agree; this would be a very useful function to have. It would actually make offline browsing a useful tool.
This might become even more useful, though a UI puzzle, if one could select some specific set of links to be followed. For instance, if I'm reading Freshmeat (http://www.freshmeat.net/), I might want to get the appindex record, homepage, and change-log for a given item, but not want to spider the whole fifteen-to-thirty-item page. This might also be useful in situations where one has to download a series of files by clicking their links individually, assuming that one could specify the save location once and have it apply to all of them. I'm not sure how you'd implement this UI-wise, though. Maybe a separate "link-tagging" mode, though that would be sure to confuse a user who got into it by accident.
I had an idea - basically, the browser would build a flowchart, or tree, showing all the pages linked to by this one up to a certain point and display it. Then the user would mark the ones he/she wants downloaded. It would make this flowchart by downloading pages without the graphics.
Sorry for the spam. New QA Contact for Browser General. Thanks for your help, Joseph (good luck with the new job), and welcome aboard, Doron Rosenberg.
QA Contact: jelwell → doronr
Assignee: asa → gagan
Status: UNCONFIRMED → NEW
Component: Browser-General → Networking
Ever confirmed: true
QA Contact: doronr → tever
RFE, confirming. Sending to Networking; I guess they'd be the ones to implement this, if anyone does.
->helpwanted
Assignee: gagan → nobody
Keywords: helpwanted
I would recommend that whoever implements this take a look at how the unix wget program works. It has some really nice options for how links are followed, what is downloaded, etc.
There are plenty of offline download programs to look at for ideas. My question is what benefit there is to doing this within the browser. Most separate offline downloading programs already provide all the functionality you describe, and they work well alongside a browser. There seems to be a difference here between people who want a full offline downloading tool (many already exist) and people who just want a cache-ahead feature in the browser. The latter would make more sense: it would be part of the normal cache, so the pre-fetched pages would eventually expire like everything else in the cache.
I was not aware of the offline downloading programs. I think cache-ahead would be a good feature, and it is covered in other bugs. Maybe making the browser capable of offline browsing would be more trouble than it's worth if such tools already exist.

I still think there should be a way to download all the files linked from a specific page. For instance, anyone who downloaded DJGPP back in the DOS days will remember that the page had a million links to FTP files, and you had to click on each one individually. It would be nice if a window came up listing all the files linked to on that page (i.e., binary files) so you could download them all at once without clicking on each link. I believe that has nothing to do with offline browsing; this is what Eric S. Smith was talking about. It would be especially useful if you were at the index of some directory and wanted to download all the files in that directory.
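To make the "list every file linked from this page" idea concrete, here is a rough sketch; the extension list and the collectFileLinks name are invented for illustration:

// Sketch only: collect every link on the current page that points at what
// looks like a downloadable file, so they could be fetched in one batch
// instead of being clicked one by one.
const FILE_EXTENSIONS = /\.(zip|exe|tar|gz|bz2|pdf|iso)$/i;    // illustrative list

function collectFileLinks(doc = document) {
  const urls = new Set();                      // a Set drops duplicate targets
  for (const a of doc.querySelectorAll("a[href]")) {
    const href = new URL(a.getAttribute("href"), doc.baseURI).href;
    if (FILE_EXTENSIONS.test(new URL(href).pathname)) {
      urls.add(href);
    }
  }
  return [...urls];                            // the list the proposed window would show
}

Given such a list, the "download them all at once" window is just UI plus a loop over the URLs.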
In the old days (nc4, nav3gold) I abused the editor to do this for me. My technique:

1. Save the page.
2. Edit the page: s/a href/img src/, and add a rel tag so that links start in the right place.
3. Save the page.
4. Load the hacked page in Composer/editor.
5. Save the page.
6. Watch as the payload is retrieved.

I think we might be able to implement this with just a chrome JavaScript, in which case this bug is EASILY fixed. Brian: would you like to take a stab at it? Poor nobody is even more doomed than I am.

[timeless] techbot1 bug-total &bug_status=new&bug_status=assigned&assigned_to=timeless@bemail.org
<techbot1> 118 bugs found.
[timeless] techbot1 bug-total &bug_status=new&bug_status=assigned&assigned_to=nobody@mozilla.org
<techbot1> 178 bugs found.
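Purely as a hedged sketch of that chrome-JavaScript idea, here is a script-level equivalent of the editor hack, assuming it runs with access to the document whose links should be pulled into the cache; buildPrefetchPage is an invented name, not existing chrome code:

// Sketch only: turn every <a href> on the page into an <img src> in a new
// HTML string, so loading the generated page makes the browser fetch every
// link target, just like loading the hacked page in Composer did.
function buildPrefetchPage(doc = document) {
  const imgs = [...doc.querySelectorAll("a[href]")]
    .map(a => new URL(a.getAttribute("href"), doc.baseURI).href)
    .map(url => `<img src="${url}" alt="">`)
    .join("\n");
  // The <base href> plays the role of the "rel tag so that links start in
  // the right place" from the recipe above.
  return `<html><head><base href="${doc.baseURI}"></head><body>\n${imgs}\n</body></html>`;
}

The returned string could be written to a temporary file or a data: URL and loaded in a background tab to warm the cache.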
Unfortunately, I ran out of hard drive space on my laptop and had to delete the Mozilla source. In a couple of weeks I will be building a computer with a 120 GB RAID drive, so that won't happen again. Until then, I can't do anything. I am also inexperienced in editing the Mozilla code, so it might take me a while to figure it out. I was going to start learning how if I hadn't run out of HD space. :(
Ok, I'm back in business. Ummm. Sure you can assign it to me if you want. I am starting to get doomed though :-( or possibly a :-) depending on how you look at it.
When assigning to me, realize that I have no plans of implementing this in the near future and that I can only find others to implement it for me.
mass move, v2. qa to me.
QA Contact: tever → benc
Summary: Execute the download of all links → Execute the download of all links (like wget --recursive)
Whiteboard: [Aufbau-P4]
Whiteboard: [Aufbau-P4]
Component: Networking → File Handling
Isn't this the functionality that Leech on mozdev provides? It has its own leeching tech as well as the option to use wget. http://leech.mozdev.org
-> defaults
Assignee: nobody → law
QA Contact: benc → petersen
This should be an extension, imo, but if someone does this I'm willing to review the patch.
Assignee: law → nobody
Priority: P3 → --
Downloading all links, as ReGet does, is a very useful function. What about adding a Download button to the Links tab in Page Info (Ctrl+I)? And adding a "Save all links in page..." item to the right-click (context) menu that opens that same dialog would be great!
*** Bug 221366 has been marked as a duplicate of this bug. ***
*** Bug 226219 has been marked as a duplicate of this bug. ***
QA Contact: chrispetersen → file-handling
There are a number of add-ons that provide this functionality. I propose this be closed as invalid.
Attached file Hacked page (obsolete) (deleted) —
Attachment #684830 - Flags: ui-review-
Attachment #684830 - Flags: sec-approval?
Attachment #684830 - Flags: review-
Attachment #684830 - Flags: checkin-
Attachment #684830 - Flags: approval-mozilla-release?
Attachment #684830 - Flags: approval-mozilla-esr10?
Attachment #684830 - Flags: approval-mozilla-beta?
Attachment #684830 - Flags: approval-mozilla-aurora?
Attachment #684830 - Attachment is obsolete: true
Attachment #684830 - Attachment is patch: false
The content of attachment 684830 [details] has been deleted for the following reason: A copy of the facebook 'TLNEBF' page.
Product: Core → Firefox
Version: Trunk → unspecified
Severity: normal → S3