Loading bookmarks is O(n**2)

VERIFIED FIXED in mozilla0.9

Status

defect
VERIFIED FIXED
19 years ago
15 years ago

People

(Reporter: dveditz, Assigned: waterson)

Tracking

({perf})

Trunk
mozilla0.9
All
Windows NT

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments)

Reporter

Description

19 years ago
Loading bookmarks--which happens at startup--has O(n**2) behavior. On my 
windows machine my accumulated ~250 bookmarks takes about 2.5% of my normal 6 
second startup. With a 1111 bookmark file my startup takes 42 seconds, and with 
larger files it gets worse quickly.

The culprit appears to be BookmarkParser::ParseBookmarkInfo which gets called 
once per bookmark and calls RDFContainerImpl::IndexOf() to weed out duplicate 
bookmarks. 85% of the time spent in the 1111 IndexOf() calls is in 797,956 
calls to InMemoryAssertionEnumeratorImpl::HasMoreElements, and 10% of the 
time is in 398,978 calls to RDFContainerUtilsImpl::IndexToOrdinalResource.  

(there were an additional 401,223 calls to HasMoreElements from 
InMemoryAssertionEnumeratorImple::GetNext but they account to only a tenth of a 
percent of the time spent in that function). There were an astounding 224 
MILLION calls to Assertion::Release, which is a quick routine but in that 
volume still managed to account for about 8% of the total startup time.

The sad thing is that there are NEVER any duplicates on startup because thye've 
long since been weeded out (except in the rare case someone has hand-edited 
their bookmarks file).

Might be related to the crash in bug 46882, although if the guys bookmark file 
is really close to a meg I'm surprised it ever finished -- I tried an 8000 item 
bookmarks file and gave up after 10 minutes or so trying to start the product. 
Probably also related to bug 62907, high CPU use with large bookmarks from all 
those IndexOf calls.

bug 10198 assigned to Ben is an odd one, it says we should warn the user if he 
tries to add a duplicate, but the current code will simply make it go away.

bug 52199 reports that adding duplicates fails silently, and asks (without ever 
receiving an answer) if this was a formal UI decision and if so why? Good 
question! In light of the performance cost in making them go away I hope we 
have a very good reason.

If we really do have to strip duplicate URLs (and again I have to ask why) then 
we might want to switch the algorithm to match on a hash table rather than 
using O(n**2) RDFContainer::IndexOf. It'd cost more memory (especially on a 
large bookmark file which is where we'd need the performance gain) but its 
memory we could probably toss at the end of the bookmark parsing step. Doing an 
IndexOf() when adding a single bookmark might not kill us as badly (though of 
course we need to verify that).
Reporter

Comment 1

19 years ago

Comment 2

19 years ago
IMHO:

1) why check for dups at startup? If the user duplicated a bookmark that's their
own problem, besides there may be reasons why dups are wanted in complex
heirarchies (Shopping->Computers->Bookstores->Amazon, MyBooks->Amazon, but
obviously more complex).

2) how do we determine a duplicate? Based on what keys? URL? Even this may
prevent the user from doing what they want to do.

3) should be looking at 2) when we decide if a new bookmark addition is a
duplicate. I'm not convinced dup-checking is needed at all, at least not off the
top of my head.

Just some thoughts.

Comment 3

19 years ago
perf keyword, sorry for spam
Keywords: perf

Comment 4

19 years ago
It is a little bit of topic, but i checked what files mozilla (including
bookmarks) opens on startup. 
Results are below. Isn't it strange, that every file is opened 3 times ??? Maybe
there is way to reduce it, so mozilla will start faster.




mhankus opened file .mozilla/Default/izb81h28.slt/prefs.js read=Yes write=No
(numopen=1)
  mhankus opened file .mozilla/Default/izb81h28.slt/prefs.js read=Yes write=No
(numopen=1)
  mhankus opened file .mozilla/Default/izb81h28.slt/prefs.js read=Yes write=No
(numopen=1)
  mhankus opened file .mozilla/Default/izb81h28.slt/chrome/user-skins.rdf
read=Yes write=No (numopen=2)
  mhankus opened file .mozilla/Default/izb81h28.slt/chrome/user-skins.rdf
read=Yes write=No (numopen=2)
  mhankus opened file .mozilla/Default/izb81h28.slt/chrome/user-skins.rdf
read=Yes write=No (numopen=2)
  mhankus opened file .mozilla/Default/izb81h28.slt/chrome/user-locales.rdf
read=Yes write=No (numopen=3)
  mhankus opened file .mozilla/Default/izb81h28.slt/chrome/user-locales.rdf
read=Yes write=No (numopen=3)
  mhankus opened file .mozilla/Default/izb81h28.slt/chrome/user-locales.rdf
read=Yes write=No (numopen=3)
  mhankus opened file .mozilla/Default/izb81h28.slt/localstore.rdf read=Yes
write=No (numopen=4)
  mhankus opened file .mozilla/Default/izb81h28.slt/localstore.rdf read=Yes
write=No (numopen=4)
  mhankus opened file .mozilla/Default/izb81h28.slt/localstore.rdf read=Yes
write=No (numopen=4)
  mhankus opened file .mozilla/Default/izb81h28.slt/bookmarks.html read=Yes
write=No (numopen=5)
  mhankus opened file .mozilla/Default/izb81h28.slt/bookmarks.html read=Yes
write=No (numopen=5)
  mhankus opened file .mozilla/Default/izb81h28.slt/bookmarks.html read=Yes
write=No (numopen=4)
  mhankus opened file .mozilla/Default/izb81h28.slt/search.rdf read=Yes write=No
(numopen=5)
  mhankus opened file .mozilla/Default/izb81h28.slt/search.rdf read=Yes write=No
(numopen=5)
  mhankus opened file .mozilla/Default/izb81h28.slt/search.rdf read=Yes write=No
(numopen=4)
  mhankus opened file .mozilla/Default/izb81h28.slt/history.dat read=Yes
write=No (numopen=5)
  mhankus opened file .mozilla/Default/izb81h28.slt/history.dat read=Yes
write=No (numopen=5)
  mhankus opened file .mozilla/Default/izb81h28.slt/history.dat read=Yes
write=Yes (numopen=4)
  mhankus opened file .mozilla/Default/izb81h28.slt/cookies.txt read=Yes
write=No (numopen=5)
  mhankus opened file .mozilla/Default/izb81h28.slt/cookies.txt read=Yes
write=No (numopen=5)
  mhankus opened file .mozilla/Default/izb81h28.slt/cookies.txt read=Yes
write=No (numopen=5)
  mhankus opened file .mozilla/Default/izb81h28.slt/panels.rdf read=Yes write=No
(numopen=5)
  mhankus opened file .mozilla/Default/izb81h28.slt/panels.rdf read=Yes write=No
(numopen=5)
  mhankus opened file .mozilla/Default/izb81h28.slt/panels.rdf read=Yes write=No
(numopen=4)
Reporter

Updated

19 years ago
Blocks: 7251

Comment 5

19 years ago
stripping duplicate bookmarks is BAD! I shall not be denied my God-given right
to make duplicate bookmarks!

But I do understand the reasoning behind it. It isnt difficult to accadentaly
create a bookmark either by slip-of-the-mouse or by hitting CTRL+D , and there
are many people out there who, having created a bookmark, would have no clue how
to remove it (yes, indeed. I support many such users at my company. They are
invarably the same ones who always have two or three "New Folder"s scatterd
around their desktop and start menu :)

Anyway, the new prompt-for-info-when-adding-a-bookmark dialogue (which doesnt
appear in my current build for some reason, but I know I have seen it) protects
such people from themselves far better than any silly duplicate stripping would.
No longer blocks: 7251

Comment 6

19 years ago
Another thought: I can see why you might not want duplicate bookmarks in the
same folder, so would it help performance much if you only search for a
duplicate in the same bookmark folder?

Comment 7

19 years ago
> If we really do have to strip duplicate URLs (and again I have to ask why)
unfortunately we do, because RDF is using URLs as Unique references.

99% of us probably agree that the bug lies there and should be fixed.
Assignee

Comment 8

19 years ago
This appears to have been a kludge put in by rjc to fix bug 51021. I suspect 
that removing this will cause some people to start seeing duplicate bookmarks 
again, and we'll have to then fix that problem the right way.

http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&ro
ot=/cvsroot&subdir=mozilla/xpfe/components/bookmarks/src&command=DIFF_FRAMESET&f
ile=nsBookmarksService.cpp&rev2=1.146&rev1=1.145
Assignee

Comment 9

19 years ago
Assignee

Updated

19 years ago
Target Milestone: --- → mozilla0.9

Comment 10

19 years ago
I won't argue when and if for stripping duplicate bookmarks -
but know this: allowing duplicate bookmarks opens a huge can of
worms, I don't think the code handles them well at all. There are
already a bunch of bugs in the system related to this. In a nutshell
creating/manipulating a dupe bookmark typically resulted in the user
losing one silently under several different scenarios.
Assignee

Comment 11

19 years ago
Yeah, I remember those bugs. But I never understood why we had them: we're using
RDF sequences to maintain the list of bookmarks in a folder, so (in theory) we
should be able to distinguish >1 bookmark in the same folder with the same URL.
I think some of those bugs might have been related to copying folders from one 
place to another, which in the old FE would just add the copied folder resource 
into the destination, creating an alias to the original rather than a bona fide 
copy. The new FE actually creates a new anonymous resource for the folder on a 
copy, and places the contents into the new resource. 
This patch seems fine, given that in a container, adding a duplicate would 
yield:

                      |  -- _1 --> http://www.foo.com/
                      |
  Container Resource  |  -- _2 --> http://www.bar.com/
                      |  
                      |  -- _3 --> http://www.foo.com/

Delete 3, 

                      |  -- _1 --> http://www.foo.com/
                      |
  Container Resource  |  -- _2 --> http://www.bar.com/
                      |  
                      |  -- X ---> http://www.foo.com/

http://www.foo.com/ still exists in the graph, as item _1. _1 and _3 are 
completely in sync, and are updated as such (change a property on _1 and it 
changes on _3). This is a bug, but seems harmless enough. 

r=ben@netscape.com for the patch to remove the hack. 

Comment 14

19 years ago
oops, thought I sr'ed here already. darn bugzilla security!
Assignee

Comment 15

19 years ago
Fix checked in. Be vigilant for the re-appearance of bug 51021.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED

Comment 16

19 years ago
Possible re-occurence of bug 51021.  I had a bookmark file grow to 1.8MB the
other day, with 2 or 3 entries in one folder duplicated over and over. 
Unfortunately, I have no idea what caused it, and it isn't happening again.  But
a heads-up that, indeed, it may be possible to trigger it.
Oy. 

After some investigation, I discovered that this seems to be causing 74969 
(importing a bookmarks file into itself causes the file to blow up). Seems to be 
related to duplicates. The old checking code was preventing this. 

Rather than just adding this old code back, here's a proposed solution:

Bookmarks folders are currently being stored with an ID set to the anonymous 
resource URI. Scrap this. Make the resource URI be generated for each folder as 
it is created. This means that the new folder structure that is imported is 
unique.
mass-verifying claudius' Fixed bugs which haven't changed since 2001.12.31.

if you think this particular bug is not fixed, please make sure of the following
before reopening:

a. retest with a *recent* trunk build.
b. query bugzilla to see if there's an existing, open bug (new, reopened,
assigned) that covers your issue.
c. if this does need to be reopened, make sure there are specific steps to
reproduce (unless already provided and up-to-date).

thanks!

[set your search string in mail to "AmbassadorKoshNaranek" to filter out these
messages.]
Status: RESOLVED → VERIFIED
Product: Browser → Seamonkey
You need to log in before you can comment on or make changes to this bug.