Importing 350K bookmark file takes 10-15 minutes

VERIFIED FIXED in mozilla0.9.9

Status

P3
major
VERIFIED FIXED
18 years ago
14 years ago

People

(Reporter: jesup, Assigned: bugzilla)

Tracking

({helpwanted, perf})

Trunk
mozilla0.9.9
x86
All
helpwanted, perf

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [nav+perf])

Attachments

(1 attachment)

(Reporter)

Description

18 years ago
FreeBSD 2000091008 (roughly) pull/build

I deleted my old profile today, and started from scratch.  When I opened the
"Manage bookmarks" window and imported my 4.x bookmarks file (~350K), it took
10-15 minutes (on a PIII-450 w/ 256MB) to load, during which time the browser
was locked and not refreshing.  I thought it had i-looped, so I broke it in GDB
and found that it was taking (apparently) 2-5 seconds per bookmark entry.  It
appears all sorts of expensive XUL things are happening as each item is inserted
via the observer(s) (sidebar???).

There are some serious performance/upgrade issues here... - many people have big
bookmark lists.
(Reporter)

Comment 1

18 years ago
Nominating for nsbeta3, though I suspect it may be rejected.  Many people with
100K+ bookmark files will assume the browser has crashed and kill it or get
disgusted.
Keywords: nsbeta3, perf
Summary: Importing 350K bookmark file takes 10-15 minutes → Importing 350K bookmark file takes 10-15 minutes
(Reporter)

Comment 2

18 years ago
Too see my bookmarks file, see bug 52149
(Reporter)

Comment 3

18 years ago
More info - the cut-down 26K bookmarks file I'm going to upload takes ~30
seconds to import - still way too long, and 26K is NOT a long bookmarks list.
(Reporter)

Comment 4

18 years ago
Created attachment 14437 [details]
26K bookmarks file --- around 185 URLs

Updated

18 years ago
Assignee: slamm → rjc

Comment 5

18 years ago
nav triage team:
reassigning to rjc and cc'ing waterson
Sounds bad. rjc, waterson: is this a P1/P2 bug that should get approved 
nsbeta3+? Or would the fix involve major risky restructuring in RDF that we 
can't accept for PR3?
Whiteboard: [need info 9/14]

Comment 7

18 years ago
Out of curiousity, how long does it take to read in the bookmarks file (at 
startup) if its just copied into the Mozilla profile directory (i.e. 
overrighting the one that's there) ?

Core bookmarks code is probably about as fast as its going to get.

This seems like a meta-issue with RDF (how fast can assertions be 
added/modified/deleted), XUL templates (how long to match rules), and DOM (how 
quick can nodes be inserted/deleted).

Comment 8

18 years ago
rjc, i can run some test timings for you but I don't grok the scenario you mention.
could you type it real slow for me?

Comment 9

18 years ago
Sure: take the bookmark file attached to this bug and copy it into a Mozilla 
profile directory.  Then run a debug version of Mozilla, and look in the console 
window for a line which says something like "Finished reading bookmarks:  took 
____ microseconds".

I'm just curious if there is some big discrepancy between reading bookmarks at 
startup and reading them in when importing a bookmark file.
(Reporter)

Comment 10

18 years ago
FreeBSD 4.1 20000914xx pull/build

Copied in my ns4.7 bookmarks.html (~300K+) file and started mozilla.

Finished reading in bookmarks.html  (27168214 microseconds)
i.e 27 seconds

Deleted 1 bookmark, quit, restarted:
Finished reading in bookmarks.html  (24108923 microseconds)

With the bookmarks file attached to this bug (26K):
Finished reading in bookmarks.html  (324726 microseconds)
i.e. 0.3 seconds.

It is interesting that a 300K file (10 times as large) takes 100 times as long
to load, also, but that's not what we're looking at here.

So it looks like it's "import" that's slow.
[need info] probably is no longer needed.  Let me know if you want any other
tests done.

Comment 11

18 years ago
With respect to loading 300K bookmarks in 27sec: bookmarks uses the in-memory
datasource to store stuff. It uses RDF "sequences" to maintain ordering within a
folder. The in-memory datasource sucks rocks with respect to storing large
sequences (so that would probably explain the geometric behavior you're seeing).
RJC had been working on some optimizations, but may have given up?
I think rjc said in another bug that the speed ups he was looking at actually 
turned out to be worse. Back to the drawing board, except we're out of time. 
nsbeta3-
Whiteboard: [need info 9/14] → [nsbeta3-]
(Reporter)

Comment 13

18 years ago
The 27 seconds to read bookmarks is moderately annoying (and I'd want to see it
addressed for a final release) but what really worries me is the 10-15 minutes
for import, which seems like an RDF issue.  Am I correct?  I think we've proved
that the 10-15 minute delay isn't the bookmark read code.  Should an RDF person
be looking at this?
>  Should an RDF person be looking at this?

:^)

[Note: I'll be looking into this more.]
*** Bug 60510 has been marked as a duplicate of this bug. ***
changing OS to all ( see duped bug 60510)
OS: FreeBSD → All
question:  I wonder how much it would help performance if we had no observers on
the in memory bookmark datasource during import.

meaning, when we import bookmarks, would it help to do this:

bookmarksTree.database.RemoveDataSource(bookmarksDS);
// do the import.
bookmarksTree.database.AddDataSource(bookmarksDS);

you may have to set (or clear) the "ref" attribute of the tree before and / or
after.  I'm not sure off the top of my head.

This way, you wouldn't have notifications going to the tree on changes to the
bookmarks DS.  back in the day when subscribe used the in memory datasource,
this helped performance.

there is also the bookmarks sidebar panel.  it will also be an observer on the
bookmarks datasource.
Other observers would be the Bookmarks menu in the menubar, the bookmark's popup 
on the Personal Toolbar, etc...

The trick with this bug is probably to ensure that the batching APIs are being 
used properly. For example, check out bug # 35022... by proper usage of the 
batching APIs, bookmark drag&drop operations went from a multiple of many 
minutes to under a second.

The secondary optimization that can be made is to re-write RDF's in-memory 
datasource. (I played around with that for a while.) For example, instead of 
using hashtables and forward/backward linked lists, a faster implementation 
could use three-dimensional balanced trees.
using the rdf batching APIs correctly (like I'd know, I've never used them,
zoinks!) would have the same effect, since you wouldn't be notifying the
observers until you were all done.

time to go learn about the batching APIs...
Netscape Nav triage team: this is not a Netscape beta stopper, but we should 
make sure we have perf
Keywords: nsbeta1-
(Reporter)

Comment 21

18 years ago
Removed old status entries.
Nominating for mozilla0.9

Comment from bug 60510:

>It actually doesn't crash the browser, it just appears to. It took me about 30 
>minutes before ns6 could import a 250k bookmark file. During that time ns6 
>stoped responding. After the import ns6 also took about 10 minutes to load each 
>time. 

Keywords: nsbeta1-, nsbeta3 → mozilla0.9
Whiteboard: [nsbeta3-]
(Assignee)

Comment 22

18 years ago
-> ben
Assignee: rjc → ben
Keywords: mozilla0.9 → mozilla0.9.1

Comment 23

18 years ago
*** Bug 73385 has been marked as a duplicate of this bug. ***

Comment 24

18 years ago
nav triage team:

Not an ns beta stopper, marking nsbeta1-
Keywords: nsbeta1-
nav pretriage: we minused this bug. Forgot to move it to MFuture - doing that 
now. 
Target Milestone: --- → Future
(Reporter)

Comment 26

18 years ago
The triage team has (again) deferred this bug (now future'd).  Argh.  This is
the sort of thing (apparent lockup of the application) that causes reviewers to
decide that it's fundamentally flawed.  30 _minutes_ of the application using
100% CPU?!!!!!!  With no updates of windows to tell you it will ever recover?!

Yes, I realize that this might be best solved by the outliner - however, as
mentioned here, simple proper use of batching might get this down to something
reasonable.

If we aren't going to fix this, then I'm going to open a bug stating that import
of bookmarks either be disabled, or refuse to work (or warn strenuously before
working) on any non-trivially-sized bookmarks file.
(Assignee)

Comment 27

18 years ago
Ben created a mini branch for outliner-based bookmarks and hopes to start on it 
within the next week.  If he can't get to it, I will.  But I'm not sure how 
outliner will help here, so renominating anyways.  If we keep pushing these 
critical bugs off from release to release, they're never going to get fixed...
Keywords: nsbeta1- → nsbeta1
Target Milestone: Future → ---

Comment 28

18 years ago
nav triage team:

So it looks like the best shot at fixing this would be to use the RDF batching 
API (assuming that's still available). We would like to see this fixed, but we 
don't have the resources to tackle this at the moment. Marking nsbeta1- and 
adding helpwanted.
Keywords: nsbeta1 → helpwanted, nsbeta1-

Updated

18 years ago
Depends on: 73508
(Assignee)

Updated

18 years ago
Whiteboard: [nav+perf]
I think I remember Paul saying he'd look at this (the import code is the same 
as the startup/init code for default bookmarks)... currently the file loads 
synchronously & could be made asynchronous. There'd still be responsiveness 
issues without rdf batching but it could be better than it is now, when you can 
do exactly nothing while you wait... 
Assignee: ben → pchen

Comment 30

17 years ago
marking mozilla0.9.8
Target Milestone: --- → mozilla0.9.8

Comment 31

17 years ago
Has anyone tried this since the fix for bug 104328 went in?  I haven't looked to
see how the importing works, but it seems plausible that the bookmarks are just
being added one at a time -- in which case the removal of the bogus IndexOf()
implementation would speed this up immensely.
(Assignee)

Comment 32

17 years ago
reassigning bugs dependent on 73508 to me
Assignee: pchen → blakeross
(Assignee)

Comment 33

17 years ago
Someone try this again now that bookmarks is using outliner.
(Reporter)

Comment 34

17 years ago
Will do, as soon as my pull/build finishes
(Assignee)

Comment 35

17 years ago
results?
Target Milestone: mozilla0.9.8 → mozilla0.9.9
(Reporter)

Comment 36

17 years ago
Massive improvement.

362K bookmarks file; import (with about 10 bookmarks already set): 11 seconds on
a dual PIII-450 RH Linux 7.1 system (not the same as my old tests).

Interestingly, deleting them (after using select ... shift-select to select all
the new ones) took 25 seconds.

I did a jprof of loading them; got 1000 hits with a jprof timer of .0015. ~500
of them were in InMemoryDataSource::Assert (everything seems to end up here)
which leads to nsXULTemplateBuilder::Propagate and
nsXULTemplateBuilder::FireNewlyMatchedRules.

About 1/2 the rest were UI (I started the timer while I was choosing the file).

Very much improved.  Probably can be marked as fixed; though delete is way
slower than it needs to be (it updates the screen for every item deleted it
appears; looks ugly too).

Comment 37

17 years ago
I'd hazard a guess that the delete is going through the O(n**2) renumbering
stuff in rdf -- looking at a jprof would confirm for sure.

If this is true, it might be possible to hack around the bottleneck in some
cases.    e.g. if the last n bookmarks are being deleted, try deleting them in
reverse order to minimize datasource thrashing from repeated renumberings.
(Assignee)

Comment 38

17 years ago
We do delete them in reverse order. I'm marking this fixed, please feel free to 
file a bug on the deletion performance to bookmarks, though (and cc me).
Status: NEW → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → FIXED
mass-verifying claudius' Fixed bugs which haven't changed since 2001.12.31.

if you think this particular bug is not fixed, please make sure of the following
before reopening:

a. retest with a *recent* trunk build.
b. query bugzilla to see if there's an existing, open bug (new, reopened,
assigned) that covers your issue.
c. if this does need to be reopened, make sure there are specific steps to
reproduce (unless already provided and up-to-date).

thanks!

[set your search string in mail to "AmbassadorKoshNaranek" to filter out these
messages.]
Status: RESOLVED → VERIFIED
Product: Browser → Seamonkey
You need to log in before you can comment on or make changes to this bug.