Closed
Bug 388004
Opened 18 years ago
Closed 18 years ago
Cleanup takes up to 10 minutes and 90% CPU with large number of downloads
Categories
(Toolkit :: Downloads API, defect)
RESOLVED
INCOMPLETE
People
(Reporter: mariusads, Unassigned)
Attachments
(1 file: 82.55 KB, application/binary)
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4
I have been hired by a company to create a backup of a website that contains about 320,000 documents of various sizes. They no longer have access to the server, and the documents are generated from a database. Long story short, I was forced to write an application that moves the mouse on the screen, clicks on the document to download it, moves the mouse again and clicks on Next page, and repeats this process for up to 20,000 documents at a time.
The pages are loaded very fast, a cycle can be done in about 3 seconds.
The problem usually appears once more than 2000 documents have been downloaded: Firefox starts needing a lot of time to open the download window and also needs more time to change the page after the Next button is clicked.
I suspect this is because the downloads.rdf reaches over 2 MB in size and the history.dat file is over 5.5 MB, so it probably needs more time to add entries to these files.
This causes the application moving the mouse to skip a cycle, to click twice on the download document link or twice on the Next arrow, either downloading the document twice in the first instance or skipping one completely in the other.
Because of this, after about 3000 files I stop the process, click on the Downloads link in the menu, and wait up to *1 minute* for the window to appear (is it necessary to load all entries in the Downloads page when it's shown?). After I click on the Cleanup button, it needs about 7-8 minutes to remove all entries while using between 90 and 99 percent of the CPU (1.8 GHz, 512 MB DDR).
This is actually the biggest problem. It makes me think it removes one entry at a time and rewrites downloads.rdf over and over; if it were a simple query like "delete from downloads.rdf" it would take seconds, maybe even less than a second.
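To illustrate the suspicion above, here is a minimal Python sketch of the two strategies. It is not Firefox code: the entry layout, the `done` flag, and the use of JSON as a stand-in for the RDF serialization are all assumptions made for the example. It only counts how much data each strategy would write.

```python
import json


def cleanup_one_at_a_time(entries):
    """Remove finished entries one by one, re-serializing the whole
    store after every single removal (the suspected slow behaviour)."""
    entries = list(entries)  # work on a copy
    writes = 0
    bytes_written = 0
    i = 0
    while i < len(entries):
        if entries[i]["done"]:
            del entries[i]
            # Stand-in for rewriting the whole file to disk.
            bytes_written += len(json.dumps(entries))
            writes += 1
        else:
            i += 1
    return entries, writes, bytes_written


def cleanup_batch(entries):
    """Filter out all finished entries in memory, then serialize once."""
    remaining = [e for e in entries if not e["done"]]
    bytes_written = len(json.dumps(remaining))
    return remaining, 1, bytes_written


if __name__ == "__main__":
    # Hypothetical store: 3000 entries, a few of them still downloading.
    store = [
        {"url": f"http://mysite.com/doc{i}", "done": i % 100 != 0}
        for i in range(3000)
    ]
    for strategy in (cleanup_one_at_a_time, cleanup_batch):
        left, writes, size = strategy(store)
        print(strategy.__name__, len(left), writes, size)
```

Both functions leave the same entries behind, but the first one serializes the store once per removed entry, so the total amount written grows roughly with the square of the number of entries.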
I think a limit on how many files are shown in the Downloads window or how many are kept in downloads.rdf would be useful, and the same for the history.dat file. I know most people don't visit 10,000 pages or download 10,000 small files daily, but it wouldn't hurt.
As a side note, after about 6,000 pages and documents downloaded, Firefox consumes about 200 MB of memory and about 240 MB of virtual memory, which also doesn't seem right.
Reproducible: Always
Steps to Reproduce:
1. have lots of files in the Downloads window
2. click cleanup
3. wait
Actual Results:
I believe the Cleanup button should work very fast and not use 90-100% CPU while removing items from the list. Right now it seems to work as if it deletes one file at a time from the list, which wouldn't make sense to me.
Expected Results:
The downloads Cleanup button should work very fast; after all, it removes all the entries without any conditions (except that it doesn't remove the files that are still downloading).
So some features could be:
* make the Cleanup button work faster and use less CPU
* implement options to restrict the number of files that are shown in the Downloads window or are kept in history or automatically remove the oldest downloaded files if the number of files exceeds a certain number
* implement options to restrict the number of pages kept in the history.dat file, so that Firefox would be more responsive throughout its use.
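The second suggestion in the list above (automatically dropping the oldest downloads once a cap is exceeded) could look roughly like the following Python sketch. The `DownloadEntry` type, its fields, and `prune_oldest` are hypothetical names invented for this illustration and do not correspond to anything in Firefox.

```python
from dataclasses import dataclass


@dataclass
class DownloadEntry:
    url: str
    finished_at: float          # seconds since the epoch (hypothetical field)
    in_progress: bool = False   # active downloads are never pruned


def prune_oldest(entries, max_entries):
    """Keep every active download plus the newest finished ones,
    so that at most max_entries are kept where possible."""
    active = [e for e in entries if e.in_progress]
    finished = sorted(
        (e for e in entries if not e.in_progress),
        key=lambda e: e.finished_at,
        reverse=True,  # newest first
    )
    keep = max(max_entries - len(active), 0)
    return active + finished[:keep]
```

Running this each time a download completes would keep the list bounded, so opening the Downloads window never has to load thousands of rows.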
Thank you for taking time to read this feature request / possible bug
Comment 1•18 years ago
OFF TOPIC: why don't you use an automatic spider such as http://www.httrack.com/ to download the files?
Could you try the latest nightly build (Trunk/Minefield)? The download manager backend has been revised, and you could tell us how it performs against the old one.
Updated•18 years ago
Severity: enhancement → normal
Summary: Cleanup takes up to 10 minutes @90% on large number of files. → Cleanup takes up to 10 minutes and 90% CPU with large number of downloads
Updated•18 years ago
Version: unspecified → 2.0 Branch
Comment 2•18 years ago (Reporter)
Thank you for answering. I have tried using MetaProducts Offline Explorer, which worked very well for other websites in the past, but this one is somewhat different. The documents I need to save can only be accessed by selecting a subcategory that is loaded using Ajax after I select a category. After this, I have to enter a few keywords related to that category in a form to get the actual list of documents, one document per page, up to 35,000-40,000 documents one after another. If I don't enter any keywords, only the first 12 results are shown and that doesn't help...
The link is also in an iframe on the page, along with a text preview of that document. It's really nasty.
So, it's not really possible to automate it as much as I'd like.
I'll try using the latest nightly build later on and I hope I won't forget to update you people with a message. Right now it's a bit hard to change the browser because I have to reach a daily quota of downloaded files if I want to download the whole site by the end of the month, and the server was timing out for a few hours today (it's somewhere in Zair, South Africa, on a relatively unstable connection). So it's a bit hard to stop the script right now.
Comment 3•18 years ago
I would love to get a copy of your downloads.rdf file for perf testing if you were alright with that. For Firefox 3 we actually switched from using rdf to store this information to using sqlite, so I suspect there will be a large performance win.
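As a rough illustration of why a SQLite-backed store should help here, the following Python sketch removes all finished downloads with a single DELETE inside one transaction. The `downloads` table and its columns are made up for the example and are not the actual Firefox 3 schema; the URLs reuse the "mysite.com" placeholder from this bug.

```python
import sqlite3


def make_store(n_finished, n_active):
    """Build an in-memory database with a hypothetical downloads table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE downloads ("
        " id INTEGER PRIMARY KEY,"
        " source TEXT NOT NULL,"
        " state TEXT NOT NULL)"
    )
    rows = [(f"http://mysite.com/doc{i}", "finished") for i in range(n_finished)]
    rows += [(f"http://mysite.com/active{i}", "downloading") for i in range(n_active)]
    conn.executemany("INSERT INTO downloads (source, state) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def cleanup(conn):
    """Delete every entry that is not still downloading, in one statement."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute("DELETE FROM downloads WHERE state != 'downloading'")
    return cur.rowcount  # number of rows removed
```

With a store like this, the cost of Cleanup is one statement and one commit regardless of how many entries there are, instead of one file rewrite per entry.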
Comment 4•18 years ago (Reporter)
The downloads.rdf file, compressed with rar. The domain was replaced with mysite.com, but the old domain had exactly the same length as "mysite.com", so this file has the exact number of bytes, the exact URL lengths and so on.
There are about 2012 records inside (this is the number the replace function returned).
Comment 5•18 years ago
Thanks - trunk is looking a lot better than branch, but there are still some perf issues to be looked into (namely, updating the UI)
Comment 6•18 years ago
Similar bugs were duped to bug 240525, but Shawn will probably keep this bug open for extra work on the UI.
Comment 7•18 years ago
Actually, I'm not sure what to do with this bug. We won't have to worry about the UI after the UI overhaul since the cleanup button is going away (still available in clear private data). The backend doesn't seem to have a problem handling this at all :)
Comment 8•18 years ago
Resolving old UNCONFIRMED Download Manager bugs as INCOMPLETE. If you still see this issue, please reopen.
To mark all these bug changes as read, filter on ONOMATOPOEIA.
Status: UNCONFIRMED → RESOLVED
Closed: 18 years ago
Resolution: --- → INCOMPLETE
Updated•17 years ago (Assignee)
Product: Firefox → Toolkit