Closed Bug 52144 Opened 24 years ago Closed 23 years ago

Importing 350K bookmark file takes 10-15 minutes

Categories

(SeaMonkey :: Bookmarks & History, defect, P3)

x86
All

Tracking

(Not tracked)

VERIFIED FIXED
mozilla0.9.9

People

(Reporter: jesup, Assigned: bugzilla)

References

Details

(Keywords: helpwanted, perf, Whiteboard: [nav+perf])

Attachments

(1 file)

FreeBSD 2000091008 (roughly) pull/build

I deleted my old profile today, and started from scratch.  When I opened the
"Manage bookmarks" window and imported my 4.x bookmarks file (~350K), it took
10-15 minutes (on a PIII-450 w/ 256MB) to load, during which time the browser
was locked and not refreshing.  I thought it had i-looped, so I broke it in GDB
and found that it was taking (apparently) 2-5 seconds per bookmark entry.  It
appears all sorts of expensive XUL things are happening as each item is inserted
via the observer(s) (sidebar???).

There are some serious performance/upgrade issues here... - many people have big
bookmark lists.
Nominating for nsbeta3, though I suspect it may be rejected.  Many people with
100K+ bookmark files will assume the browser has crashed and kill it or get
disgusted.
Keywords: nsbeta3, perf
Summary: Importing 350K bookmark file takes 10-15 minutes → Importing 350K bookmark file takes 10-15 minutes
Too see my bookmarks file, see bug 52149
More info - the cut-down 26K bookmarks file I'm going to upload takes ~30
seconds to import - still way too long, and 26K is NOT a long bookmarks list.
Assignee: slamm → rjc
nav triage team:
reassigning to rjc and cc'ing waterson
Sounds bad. rjc, waterson: is this a P1/P2 bug that should get approved 
nsbeta3+? Or would the fix involve major risky restructuring in RDF that we 
can't accept for PR3?
Whiteboard: [need info 9/14]
Out of curiousity, how long does it take to read in the bookmarks file (at 
startup) if its just copied into the Mozilla profile directory (i.e. 
overrighting the one that's there) ?

Core bookmarks code is probably about as fast as its going to get.

This seems like a meta-issue with RDF (how fast can assertions be 
added/modified/deleted), XUL templates (how long to match rules), and DOM (how 
quick can nodes be inserted/deleted).
rjc, i can run some test timings for you but I don't grok the scenario you mention.
could you type it real slow for me?
Sure: take the bookmark file attached to this bug and copy it into a Mozilla 
profile directory.  Then run a debug version of Mozilla, and look in the console 
window for a line which says something like "Finished reading bookmarks:  took 
____ microseconds".

I'm just curious if there is some big discrepancy between reading bookmarks at 
startup and reading them in when importing a bookmark file.
FreeBSD 4.1 20000914xx pull/build

Copied in my ns4.7 bookmarks.html (~300K+) file and started mozilla.

Finished reading in bookmarks.html  (27168214 microseconds)
i.e 27 seconds

Deleted 1 bookmark, quit, restarted:
Finished reading in bookmarks.html  (24108923 microseconds)

With the bookmarks file attached to this bug (26K):
Finished reading in bookmarks.html  (324726 microseconds)
i.e. 0.3 seconds.

It is interesting that a 300K file (10 times as large) takes 100 times as long
to load, also, but that's not what we're looking at here.

So it looks like it's "import" that's slow.
[need info] probably is no longer needed.  Let me know if you want any other
tests done.
With respect to loading 300K bookmarks in 27sec: bookmarks uses the in-memory
datasource to store stuff. It uses RDF "sequences" to maintain ordering within a
folder. The in-memory datasource sucks rocks with respect to storing large
sequences (so that would probably explain the geometric behavior you're seeing).
RJC had been working on some optimizations, but may have given up?
I think rjc said in another bug that the speed ups he was looking at actually 
turned out to be worse. Back to the drawing board, except we're out of time. 
nsbeta3-
Whiteboard: [need info 9/14] → [nsbeta3-]
The 27 seconds to read bookmarks is moderately annoying (and I'd want to see it
addressed for a final release) but what really worries me is the 10-15 minutes
for import, which seems like an RDF issue.  Am I correct?  I think we've proved
that the 10-15 minute delay isn't the bookmark read code.  Should an RDF person
be looking at this?
>  Should an RDF person be looking at this?

:^)

[Note: I'll be looking into this more.]
*** Bug 60510 has been marked as a duplicate of this bug. ***
changing OS to all ( see duped bug 60510)
OS: FreeBSD → All
question:  I wonder how much it would help performance if we had no observers on
the in memory bookmark datasource during import.

meaning, when we import bookmarks, would it help to do this:

bookmarksTree.database.RemoveDataSource(bookmarksDS);
// do the import.
bookmarksTree.database.AddDataSource(bookmarksDS);

you may have to set (or clear) the "ref" attribute of the tree before and / or
after.  I'm not sure off the top of my head.

This way, you wouldn't have notifications going to the tree on changes to the
bookmarks DS.  back in the day when subscribe used the in memory datasource,
this helped performance.

there is also the bookmarks sidebar panel.  it will also be an observer on the
bookmarks datasource.
Other observers would be the Bookmarks menu in the menubar, the bookmark's popup 
on the Personal Toolbar, etc...

The trick with this bug is probably to ensure that the batching APIs are being 
used properly. For example, check out bug # 35022... by proper usage of the 
batching APIs, bookmark drag&drop operations went from a multiple of many 
minutes to under a second.

The secondary optimization that can be made is to re-write RDF's in-memory 
datasource. (I played around with that for a while.) For example, instead of 
using hashtables and forward/backward linked lists, a faster implementation 
could use three-dimensional balanced trees.
using the rdf batching APIs correctly (like I'd know, I've never used them,
zoinks!) would have the same effect, since you wouldn't be notifying the
observers until you were all done.

time to go learn about the batching APIs...
Netscape Nav triage team: this is not a Netscape beta stopper, but we should 
make sure we have perf
Keywords: nsbeta1-
Removed old status entries.
Nominating for mozilla0.9

Comment from bug 60510:

>It actually doesn't crash the browser, it just appears to. It took me about 30 
>minutes before ns6 could import a 250k bookmark file. During that time ns6 
>stoped responding. After the import ns6 also took about 10 minutes to load each 
>time. 

Keywords: nsbeta1-, nsbeta3mozilla0.9
Whiteboard: [nsbeta3-]
-> ben
Assignee: rjc → ben
Keywords: mozilla0.9mozilla0.9.1
*** Bug 73385 has been marked as a duplicate of this bug. ***
nav triage team:

Not an ns beta stopper, marking nsbeta1-
Keywords: nsbeta1-
nav pretriage: we minused this bug. Forgot to move it to MFuture - doing that 
now. 
Target Milestone: --- → Future
The triage team has (again) deferred this bug (now future'd).  Argh.  This is
the sort of thing (apparent lockup of the application) that causes reviewers to
decide that it's fundamentally flawed.  30 _minutes_ of the application using
100% CPU?!!!!!!  With no updates of windows to tell you it will ever recover?!

Yes, I realize that this might be best solved by the outliner - however, as
mentioned here, simple proper use of batching might get this down to something
reasonable.

If we aren't going to fix this, then I'm going to open a bug stating that import
of bookmarks either be disabled, or refuse to work (or warn strenuously before
working) on any non-trivially-sized bookmarks file.
Ben created a mini branch for outliner-based bookmarks and hopes to start on it 
within the next week.  If he can't get to it, I will.  But I'm not sure how 
outliner will help here, so renominating anyways.  If we keep pushing these 
critical bugs off from release to release, they're never going to get fixed...
Keywords: nsbeta1-nsbeta1
Target Milestone: Future → ---
nav triage team:

So it looks like the best shot at fixing this would be to use the RDF batching 
API (assuming that's still available). We would like to see this fixed, but we 
don't have the resources to tackle this at the moment. Marking nsbeta1- and 
adding helpwanted.
Keywords: nsbeta1helpwanted, nsbeta1-
Depends on: 73508
Whiteboard: [nav+perf]
I think I remember Paul saying he'd look at this (the import code is the same 
as the startup/init code for default bookmarks)... currently the file loads 
synchronously & could be made asynchronous. There'd still be responsiveness 
issues without rdf batching but it could be better than it is now, when you can 
do exactly nothing while you wait... 
Assignee: ben → pchen
marking mozilla0.9.8
Target Milestone: --- → mozilla0.9.8
Has anyone tried this since the fix for bug 104328 went in?  I haven't looked to
see how the importing works, but it seems plausible that the bookmarks are just
being added one at a time -- in which case the removal of the bogus IndexOf()
implementation would speed this up immensely.
reassigning bugs dependent on 73508 to me
Assignee: pchen → blakeross
Someone try this again now that bookmarks is using outliner.
Will do, as soon as my pull/build finishes
results?
Target Milestone: mozilla0.9.8 → mozilla0.9.9
Massive improvement.

362K bookmarks file; import (with about 10 bookmarks already set): 11 seconds on
a dual PIII-450 RH Linux 7.1 system (not the same as my old tests).

Interestingly, deleting them (after using select ... shift-select to select all
the new ones) took 25 seconds.

I did a jprof of loading them; got 1000 hits with a jprof timer of .0015. ~500
of them were in InMemoryDataSource::Assert (everything seems to end up here)
which leads to nsXULTemplateBuilder::Propagate and
nsXULTemplateBuilder::FireNewlyMatchedRules.

About 1/2 the rest were UI (I started the timer while I was choosing the file).

Very much improved.  Probably can be marked as fixed; though delete is way
slower than it needs to be (it updates the screen for every item deleted it
appears; looks ugly too).
I'd hazard a guess that the delete is going through the O(n**2) renumbering
stuff in rdf -- looking at a jprof would confirm for sure.

If this is true, it might be possible to hack around the bottleneck in some
cases.    e.g. if the last n bookmarks are being deleted, try deleting them in
reverse order to minimize datasource thrashing from repeated renumberings.
We do delete them in reverse order. I'm marking this fixed, please feel free to 
file a bug on the deletion performance to bookmarks, though (and cc me).
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
mass-verifying claudius' Fixed bugs which haven't changed since 2001.12.31.

if you think this particular bug is not fixed, please make sure of the following
before reopening:

a. retest with a *recent* trunk build.
b. query bugzilla to see if there's an existing, open bug (new, reopened,
assigned) that covers your issue.
c. if this does need to be reopened, make sure there are specific steps to
reproduce (unless already provided and up-to-date).

thanks!

[set your search string in mail to "AmbassadorKoshNaranek" to filter out these
messages.]
Status: RESOLVED → VERIFIED
Product: Browser → Seamonkey
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: