Closed Bug 24762 Opened 26 years ago Closed 25 years ago

optimize RDFGenericBuilder::Is[Container|Empty]()

Tracking

(Not tracked)

Status:

VERIFIED FIXED

Milestone:

M16

People

(Reporter: waterson, Assigned: waterson)

References

Details

(Keywords: perf, Whiteboard: [PDT-])

Attachments

(2 files)

patch with optimized IsTemplateRuleMatch 25 years ago Alec Flett 1.29 KB, patch		Details \| Diff \| Splinter Review
new patch with optimized IsEmpty and IsContainer() 25 years ago Alec Flett 10.58 KB, patch		Details \| Diff \| Splinter Review

Chris Waterson

Assignee

Description

•

26 years ago

This accounts for approximately 13% of the time that it takes to open a large mail folder.

Chris Waterson

Assignee

Updated

•

26 years ago

Status: NEW → ASSIGNED

Keywords: perf

Priority: P3 → P2

Target Milestone: M14

Chris Waterson

Assignee

Updated

•

26 years ago

Blocks: 20192

Chris Waterson

Assignee

Updated

•

26 years ago

Keywords: beta1

Jim Roskind

Comment 1

•

26 years ago

PDT agrees that "loading a large folder faster" is a goal. The bug should try to say how slow we are relative to 4.x, and what our goal is. It is always scary to see a bug that just says "optimize a specific function" when it is not clear that the function *can* be optimized, and not clear what the overall time goal is (and hence, even if we wanted to optimize, when should we say enough is enough?). We do want faster performance... and bugs that say what they are after would get a PDT+ if we're outside spec, but we couldn't grok this from the bug statement. We wouldn't hold beta if this function was not optimized... but we would hold beta if our download times are too poor. That's why we ended up going PDT- on *this* bug (don't read it as "performance is not important"!) Hope that makes sense, Jim

Whiteboard: [PDT-]

Chris Waterson

Assignee

Comment 2

•

26 years ago

jar: this bug was filed as part of a fan-out of work that scottip, rjc, and i discovered while profiling "open" for a large (~2000 message) folder. if you *read the bug*, you'll see that 13% of the time is spent in these functions. go look at the mail performance stats. what else do you need to know?

Chris Waterson

Assignee

Comment 3

•

26 years ago

add jar to cc list. jar: if you're gonna comment on bugs, add yourself to the cc list.

scottputterman

Comment 4

•

26 years ago

I just wanted to verify Chris's stats. I just ran Quantify on a 5000 msg folder and IsContainer was 12.96% of the time to load a folder. On a Windows target machine 1000 msgs takes 5.28 seconds. Assuming it's proportional, which it may not be, it would take about 26 seconds for 5000 messages. So we'd save close to 3 seconds if we could eliminate most of the time.

Jim Roskind

Comment 5

•

26 years ago

Thanks for the feedback. I think the beta spec called for less than a factor of two difference in most email operations from 4.7. I tried to time the loading of my 5500 message inbox on my P300 box. (I wasn't sure if that was even the time you'd be after... as perchance I'm supposed to blow away prior mail load... but this seemed like a reasonable guess.) On 5.0, it took about 14 seconds to get into the folder. On 4.7 it took about 5 seconds to get into the folder. I was running the 1/31 comercial build. Hence this performance is well outside the range for the beta stopper. I'm pretty sure this will make PDT+ with that info... even though I wouldn't have though this performance was that bad from my gut (it measured out as being pretty different... but I never noticed this as a user :-/ ). For other performance bugs... to make the issue clearer... it would help if you would quote the current ratio to 4.7. Thanks. I removed the PDT- to get the bug reviewed again.

Whiteboard: [PDT-]

Chris Waterson

Assignee

Updated

•

26 years ago

Keywords: beta1

leger

Updated

•

25 years ago

Whiteboard: [NEED INFO]

scottputterman

Comment 6

•

25 years ago

nominating for beta1 4.7 Loading a 1000 msg pop folder = 1.93 5.0 = 7.03

Keywords: beta1

leger

Comment 7

•

25 years ago

Putting on PDT+ radar for beta1.

Whiteboard: [NEED INFO] → [PDT+]

Chris Waterson

Assignee

Updated

•

25 years ago

Whiteboard: [PDT+] → [PDT+] 2/26

Chris Waterson

Assignee

Updated

•

25 years ago

Whiteboard: [PDT+] 2/26 → [PDT+] 3/3

James Everingham

Comment 8

•

25 years ago

Waterson indicated that this will not be completed by the original 3/3 cutoff time we assigned, I am removing the PDT+ designation for reconsideration.

Whiteboard: [PDT+] 3/3

Phil Peterson

Comment 9

•

25 years ago

PDT- if no time to look at this

Whiteboard: [PDT-]

Alec Flett

Comment 10

•

25 years ago

adding myself to CC. I actually rewrote this function a long time ago and put my stuff on a branch, I'll see if I can recover the branch/remember what my optimization was.

Alec Flett

Comment 11

•

25 years ago

ok, on a positive note, someone optimized IsContainmentProperty by forcing the containment attribute to be at the template root, so that's good.

Alec Flett

Comment 12

•

25 years ago

It looks like IsContainer itself is doing what it should - expensive yes but right now I'm not sure IsContainer itself can be sped up. However, there are two places we call IsContainer. One is IsTemplateRuleMatch where we call it inside a loop with parameters which are invariant during the loop. I'm attaching a patch to optimize IsTemplateRuleMatch which was calling IsContainer in a loop on the same element/rule over and over. the patch caches the IsContainer result outside the loop.

Alec Flett

Comment 13

•

25 years ago

Attached patch patch with optimized IsTemplateRuleMatch — Details — Splinter Review

Alec Flett

Comment 14

•

25 years ago

hold the phone I looked at my changes on the branch and now I remember what my optimizations were. I'm rewriting them on the tip - I was able to speed up IsContainmentProperty pretty signifigantly, which speeds up both IsContainer and IsEmpty() and allows us to re-enable the parent-searching code that was commented out from IsContainmentProperty(), without any loss in big-O time. Almost done with my patch, I'll put it here in a bit.

Alec Flett

Comment 15

•

25 years ago

Attached patch new patch with optimized IsEmpty and IsContainer() — Details — Splinter Review

Alec Flett

Comment 16

•

25 years ago

ok, so what you see here is that I've pulled alot of operations out of loops - essentially pre-calculating things like the containment attribute for a node, outside of loops where they're needed...in the process I changed the IsEmpty() and IsContainer() functions to take this pre-calculated value, so that they didn't have to do the work themselves... the old code that was commented out is now in GetContainmentAttribute, but now that code is called only where appropriate, not inside loop iterations.

Alec Flett

Comment 17

•

25 years ago

oh, this essentially makes IsContainmentProperty a constant time operation INCLUDING support for containment= attributes on any node in the parent chain. I think we can optimize this even further... looking into that now.

Alec Flett

Comment 18

•

25 years ago

one way that would definately speed this up is if we added APIs to the nsIRDFDataSource (!) called HasArcLabelOut(nsIRDFResource *aSource nsIRDFResource *aProperty, PRBool *aResult) - essentially asking if the given arc label would have been returned in the array of arcs...

scottputterman

Comment 19

•

25 years ago

Yes, the HasArcOut function would be useful. When Chris and I were originally talking about this it looked like that would speed us up a lot since almost all of the time in this function is spent iterating over the arcsout enumerator when all we really want to know is if it has the containment property. Obviously one optimization we could make would be to return the containment properties first in mailnews but I'm sure that might slow something else down when in fact the HasArcsOut function would be even better. Chris had sent around a bunch of suggestions for how this might work with pros and cons of each one but I don't remember what they were.

Chris Waterson

Assignee

Comment 20

•

25 years ago

Yeah, we could add HasArcLabel[In|Out]. The only thing that turns my stomach is that the real semantics are "could possibly have this arc label, but maybe doesn't right now". Ugh. Another idea would be to add some kind of container-manipulation interface that you could get to from an nsIRDFDataSource.

Chris Waterson

Assignee

Updated

•

25 years ago

Target Milestone: M14 → M15

Chris Waterson

Assignee

Updated

•

25 years ago

Target Milestone: M15 → M16

Chris Waterson

Assignee

Comment 21

•

25 years ago

Add arena-based fixed-size allocator for nsInMemoryDataSource and nsCompositeDataSource to reduce time spent in ArcLabelsOut, etc. Quantify sez this shaves off about 1/3 of the time spent in this routine. Quantify also sez I can get another big boost by recycling nsAdapterEnumerator objects, so I'm poking with that now.

Chris Waterson

Assignee

Comment 22

•

25 years ago

Allright. I'm going to mark this as fixed, even though there is more work that can be done. I'm going to tackle -that- work under that auspices of bug 35022, creating a better interface for dealing with RDF containers.

Status: ASSIGNED → RESOLVED

Closed: 25 years ago

Resolution: --- → FIXED

Tom Everingham

Comment 23

•

25 years ago

marking verified per engineer's comments - profiler he is running shows improvement

Status: RESOLVED → VERIFIED

BMO Automation

Updated

•

7 years ago

Product: Core → Core Graveyard

You need to log in before you can comment on or make changes to this bug.