Closed Bug 232725 Opened 20 years ago Closed 8 days ago

Remove redundant attribute parsing from rdf content sink

Categories

(Core Graveyard :: RDF, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED INCOMPLETE
mozilla1.7alpha

People

(Reporter: tingley, Assigned: tingley)

Details

(Keywords: perf)

Attachments

(3 obsolete files)

The parser passes the RDF content sink a PRUnichar** of attributes, which it
drags aroud like ball and chain, redundantly parsing the attribute names over
and over with expensive string routines like ParseAttributeString().  We can get
vast speedup in RDF parsing by looking at the attributes once and then saving
some basic information about what was seen.
Attached patch status of tree, diff -w (obsolete) — Splinter Review
Not ready for review: it needs more testing, and it's still doing some
redundant attribute-mashing that I'd like to get rid of.
FWIW, jprof of parsing loading a huge (10k entries) downloads.rdf file (note the
huge hit count for ParseAttributeString and its child CutNameSpacePrefix)

http://www.sundell.net/~tingley/projects/mozilla/rdf/jprof-before.html
Status: NEW → ASSIGNED
Priority: -- → P1
Target Milestone: --- → mozilla1.7alpha
Attached patch better version (obsolete) — Splinter Review
This is ready for review.  The basic idea is pretty simple: at the top of
HandleStartElement, go through the attributes and grab pointers to the values
for any interesting properties we know about (rdf:about, rdf:resource, etc). 
For stuff that isn't immediately useful, save the results of
ParseAttributeString so we don't need to break it up again later.

In an opt build, this reduces the time spent in the RDF content sink by 25%
when parsing chrome.rdf, and by 20% when loading large downloads.rdf files.
Attachment #140291 - Attachment is obsolete: true
Attachment #140385 - Flags: review?(rjc)
Comment on attachment 140385 [details] [diff] [review]
better version

Actually, I think this is leaking NameSpaceEntries.
Attachment #140385 - Flags: review?(rjc)
Attached patch no longer leaking (obsolete) — Splinter Review
Same as before, but with one extra line that had gotten lost in the shuffle:

+  mNameSpaceScopes.AppendElement(mNameSpaceStack);

This was causing the NameSpaceEntries to leak, as well as making our namespace
parsing overly generous.
Attachment #140385 - Attachment is obsolete: true
Attachment #140459 - Flags: review?(rjc)
Comment on attachment 140459 [details] [diff] [review]
no longer leaking

Feh.  I should probably keep around the lenient recognition of non-namespace
qualified ID/about/resource attributes, since we only started serializing it
that way recently.
Attachment #140459 - Attachment is obsolete: true
Attachment #140459 - Flags: review?(rjc)
QA Contact: rdf
Product: Core → Core Graveyard
Status: ASSIGNED → RESOLVED
Closed: 8 days ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: