Closed Bug 105966 Opened 23 years ago Closed 4 months ago

speed up rdf sequence building (bookmark parse)

Tracking

(Not tracked)

Status:

RESOLVED INCOMPLETE

People

(Reporter: tingley, Assigned: tingley)

References

Details

(Keywords: perf)

Attachments

(3 files)

initial tinkering, except for the new .idls 23 years ago Chase Tingley 11.78 KB, patch		Details \| Diff \| Splinter Review
nsIRDFHintableContainer.idl 23 years ago Chase Tingley 1.96 KB, patch		Details \| Diff \| Splinter Review
nsIRDFHintableDataSource.idl 23 years ago Chase Tingley 2.32 KB, patch		Details \| Diff \| Splinter Review

Chase Tingley

Assignee

Description

•

23 years ago

I mentioned this to Waterson last week and he wanted to see the code, so
hence this bug.  

I've been spending some time trying to cut down the time taken in the
construction of the initial bookmarks datasource when the file is parsed.
Despite the fix for bug 69009, there is still an O(n**2) algorithm inherent
in the RDF code in this case.  InMemoryDataSourceImpl::Assert() contains an
implicit duplicate check, which is currently an O(n) operation.  Even if
word done on bug 35817 decreased this to O(log n), we'd still be doing work
which is entirely pointless -- if a sequence is being build through the RDF
container AppendElement() API, there's no chance of a collision, since the
property value is incremented each time.  (Their actually will be a
collision if we roll over the 10 digits that GetNextValue() allows for
_nnn; this is a risk I'm willing to take.)

I added an "UncheckedAssert()" method to a new interface called
nsIRDFHintableDataSource (the reasons for keeping it separate have little
to do with design and a lot to do with trying to limit tree cruft).

Also, I noticed that we spend a lot of time in GetNextValue() repeatedly
loading the nextVal property, incrementing it, and writing it back.  I
theorized that if the container was made aware that a number of adds we
being made in quick succession, and bypass this datasource-thrashing.  I
added an nsIRDFHintableContainer with a pair of batching APIs, and hooked
them up to nsBookmarkService.cpp.

Numbers.  The 1000+ item bookmark list from bug 69009 and I are old friends
by now; note that the flatter the bookmarks file, the greater the speedup
(a hierarchical bookmarks file contains a series of smaller nested
sequences, on which performance isn't as bad).

Since there's issues with file caching, etc; I ran each five times in hope
of some sort of convergence.  All numbers are in thousands of ticks of the
PR_Now() clock.

Original:               523     537    540     510     519
+container batching:    405     397    390     388     410
+unchecked assertions:  351     359    370     349     369

I'll post the code shortly.  It's a work in progress; many nits such as:
- common Assert/UncheckedAssert code can be factored out
- UncheckedAssert might be better off adding to the tail, rather than head,
  of the list
- container batching could probably benefit some synchronization
- etc

Chase Tingley

Assignee

Comment 1

•

23 years ago

Attached patch initial tinkering, except for the new .idls — Details — Splinter Review

Chase Tingley

Assignee

Comment 2

•

23 years ago

Attached patch nsIRDFHintableContainer.idl — Details — Splinter Review

Chase Tingley

Assignee

Comment 3

•

23 years ago

Attached patch nsIRDFHintableDataSource.idl — Details — Splinter Review

Chase Tingley

Assignee

Comment 4

•

23 years ago

There's a less ambitious/scary alternative, which I tested at one point and I
believe generates decent numbers -- GetNextValue() should use
datasource->Change() rather than Unassert()/Assert().  This, along with
implementing LockedChange() in InMemoryDataSource, cuts the amount of time spent
updating nextVal in half.  

I've got code for this somewhere; I'll try to dig it up.

Chase Tingley

Assignee

Updated

•

23 years ago

Blocks: 7251

Keywords: perf

Chase Tingley

Assignee

Comment 5

•

23 years ago

More modest results produced by switching GetNextValue over to Change() and
adding a LockedChange() method to the InMem datasource.

Original:               523     537    540     510     519
+LockedChange           464     454    461     448     465

Andrew Hagen

Updated

•

22 years ago

Keywords: mozilla1.1

benc

Comment 6

•

21 years ago

tever is not RDF QA anymore

QA Contact: tever → nobody

Chase Tingley

Assignee

Comment 7

•

21 years ago

Clearing the mozilla1.1 keyword.

I'm not thrilled with the approach in that patch any more.  I'd rather fix this
as part of a general improvement of the container implementation that pulled
nextVal out of the graph and did a better job of optimizing lookups for ordinal
arcs.

Keywords: mozilla1.1

Worcester12345

Comment 8

•

20 years ago

(In reply to comment #7)
> Clearing the mozilla1.1 keyword.
> 
> I'm not thrilled with the approach in that patch any more.  I'd rather fix this
> as part of a general improvement of the container implementation that pulled
> nextVal out of the graph and did a better job of optimizing lookups for ordinal
> arcs.

Looking for any possible speedups in the upcoming releases.

Phil Ringnalda (:philor)

Updated

•

15 years ago

QA Contact: nobody → rdf

BMO Automation

Updated

•

6 years ago

Product: Core → Core Graveyard

u597032

Updated

•

4 months ago

Status: NEW → RESOLVED

Closed: 4 months ago

Resolution: --- → INCOMPLETE

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Quick Search

speed up rdf sequence building (bookmark parse)

Categories

(Core Graveyard :: RDF, defect)

Tracking

(Not tracked)

People

(Reporter: tingley, Assigned: tingley)

References

Details

(Keywords: perf)

Crash Data

Security

(public)

User Story

Attachments

(3 files)

Description

Comment 1

Comment 2

Comment 3

Comment 4

Updated

Comment 5

Updated

Comment 6

Comment 7

Comment 8

Updated

Updated

Updated

Attachment

General

Description

File Name

Content Type