Closed Bug 464838 Opened 11 years ago Closed 11 years ago

Tp3 regression on 2008-11-11

Categories

(Core :: General, defect, P1)

x86
All
defect

Tracking


RESOLVED FIXED
mozilla1.9.1

People

(Reporter: bzbarsky, Assigned: mcmanus)

References

Details

(Keywords: fixed1.9.1)

Attachments

(3 files, 7 obsolete files)

We seem to have a Tp3 regression on at least some of the Mac and Linux machines (e.g. qm-plinux-fast03).  The checkin range is:

http://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2008-11-11+19%3A00%3A00&enddate=2008-11-11+20%3A00%3A00

Out of those, I'd expect the DNS prefetch to be the most likely culprit.

Not sure we need to worry about this for 1.9.1b2, but we should figure out what's up here.  One test I'd suggest is commenting out the prefetch call in nsHTMLAnchorElement::BindToTree to see what impact, if any, it has.  It's my best guess for the regression source.
Flags: blocking1.9.1?
Though I have tried to understand just what tp3 is composed of, I didn't quite conquer that mountain.

I put together some of my own measurements, using -tp and -tpcycles, and depending on the page can see a 1-2% average slowdown with the patch removed, using a well-connected 100% hit proxy. Though the stddev on my tests tends to be well over 5%, so I'm not sure I put a heck of a lot of faith in the average. As I mentioned earlier, that's basically a test of overhead for this patch - there is no upside to prefetching DNS when using a proxy.

Anyhow, I will attach a patch that seems to speed things up considerably. On my tests it gets us within 1 or so ms on a page load that has an average of around 800ms. The measurements are a bit volatile, but it is a clear improvement on the trunk.

It does this by:

  * removing the allocations and hash lookups from the BindToTree() path. These are turned into an insertion into a static FIFO, and only when the document finishes loading are all the entries in the FIFO pushed through the DNS cache in a batch. Modeled after the existing link prefetch code.

  * the async-open early lookup code short-circuits itself when a proxy is being used. This makes sense to do, but didn't notably help the benchmark in isolation.

If talos doesn't agree, it would be helpful to identify which particular pages show the slowdown, so they can be analyzed more closely.
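The deferral scheme described above can be sketched in isolation. This is a simplified model with invented names (`DeferredDNSQueue` and its members are illustrative), not the actual patch, which stores nsIURI pointers and resolves via nsIDNSService: additions are cheap ring-buffer stores, and the whole batch is pushed through the resolver on one flush.

```cpp
#include <array>
#include <cassert>
#include <functional>
#include <string>

// Simplified model of the deferred-prefetch FIFO (illustrative only).
// Because one slot is sacrificed to distinguish full from empty, the
// queue holds kMax - 1 entries.
class DeferredDNSQueue {
 public:
  // Adding is cheap: no allocation, no hash lookup, just a ring-buffer store.
  bool Add(const std::string& host) {
    size_t next = (mHead + 1) % kMax;
    if (next == mTail) return false;  // queue full, drop the prefetch
    mEntries[mHead] = host;
    mHead = next;
    return true;
  }

  // Called once the document finishes loading: push the whole batch
  // through the resolver callback in one pass.
  void Flush(const std::function<void(const std::string&)>& resolve) {
    while (mTail != mHead) {
      resolve(mEntries[mTail]);
      mTail = (mTail + 1) % kMax;
    }
  }

 private:
  static constexpr size_t kMax = 512;  // matches the 512-entry cap discussed
  std::array<std::string, kMax> mEntries;
  size_t mHead = 0, mTail = 0;
};
```

The point of the structure is that BindToTree pays only for the store; all resolver work is deferred off the hot path until load completion.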
Attached patch patch r 1 (obsolete) — Splinter Review
Assignee: nobody → mcmanus
Attachment #350108 - Flags: review?(bzbarsky)
> Though I have tried to understand just what tp3 is composed of, I didn't quite
> conquer that mountain.

What were the obstacles?  Getting your hands on the pageset?

> it would be helpful to identify what particular pages might be showing the
> slowdown

This data was available from Tinderbox when this bug was filed (and might still be available).  It's in the Tinderbox log.

One question here.  Is the main effect from doing the lookups in a batch, or from limiting the number of lookups to 512?
> One question here.  Is the main effect from doing the lookups in a batch, or
> from limiting the number of lookups to 512?

batch plus deferral.. 256 gave almost the same results
Priority: -- → P2
Target Milestone: --- → mozilla1.9.1
Attachment #350108 - Flags: review?(bzbarsky) → review-
Comment on attachment 350108 [details] [diff] [review]
patch r 1

>+++ b/content/html/content/src/nsHTMLAnchorElement.cpp
>@@ -224,17 +208,24 @@ nsHTMLAnchorElement::BindToTree(nsIDocum
>+  // Prefetch links
>+  if (nsHTMLDNSPrefetch::IsAllowed(GetOwnerDoc())) {

It's probably worth checking aDocument here too, so we don't do prefetch stuff on various mutations of disconnected subtrees.

It would also be nice to not prefetch more than once for a given node, so we don't keep doing prefetch during DOM mutations of existing stuff in general.  Maybe file a followup bug on this?

>+++ b/content/html/content/src/nsHTMLDNSPrefetch.cpp
>+nsHTMLDNSPrefetch::Prefetch(nsAString &hostname, PRUint16 flags)

So unlike the URI version this doesn't do the queuing business, right?  That might be worth documenting in the header.

>+  nsCString  hostname8;
>...
>+  return sDNSService->AsyncResolve(hostname8, flags,

Just pass NS_ConvertUTF16toUTF8(hostname) directly to asyncResolve and skip the temporary?

>+nsDNSPrefetchDeferrals::nsDNSPrefetchDeferrals()
>+  : mActiveLoaderCount(0)
>+{
>+  mLock = PR_NewLock();
>+  ClearQueue();

This seems to be the only caller of ClearQueue.  Since ClearQueue does nothing of the sort (e.g. leaves refs to all the URIs) and instead just sets some members, maybe it should go away and we should just set the members here in the initializer list?

Why do we need a lock here at all?  Our OnLookupComplete can happen on random threads, but all other calls into this method should be on the main thread.  We can add asserts to that effect if we want (and that might be a good idea).

>+nsDNSPrefetchDeferrals::~nsDNSPrefetchDeferrals()
>+  RemoveProgressListener();

You actually don't want to do that, since it'll try to call virtual methods on this object, etc.  Given that the web progress holds a reference to us via an nsIWeakReference, there's no need to remove on destruction.

>+nsDNSPrefetchDeferrals::Add(PRUint16 flags, nsIURI *aURI)
>+  if (((mFireAndForgetHead + 1) & sMaxDeferredMask) == mFireAndForgetTail)
>+    return NS_ERROR_DNS_LOOKUP_QUEUE_FULL;

I think |(mFireAndForgetHead + 1) % sMaxDeferred| is much clearer.

Even clearer would be using an existing sane queue class, which we sadly don't have.  nsDeque would work if you could stuff your flags in the low bits of a URI pointer, but you need 3 flags.

Please file a bug to update this stuff to a templated queue class once we have one?  I'm pretty sure we already have a bug to create such a class, but if not let me know and I'll file it.
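For what it's worth, the low-bit tagging trick mentioned above can be sketched like this (an illustrative sketch, not actual nsDeque usage; whether the allocator guarantees enough alignment for three flag bits is exactly the open question):

```cpp
#include <cassert>
#include <cstdint>

// Sketch of packing flags into the low bits of an aligned pointer
// (illustrative only). An object aligned to at least 8 bytes leaves
// the low 3 bits of its address free, enough for three one-bit flags.
struct alignas(8) Entry { int payload; };

uintptr_t Tag(Entry* p, unsigned flags) {
  assert((reinterpret_cast<uintptr_t>(p) & 0x7u) == 0);  // alignment holds
  assert(flags < 8u);                                    // only 3 bits fit
  return reinterpret_cast<uintptr_t>(p) | flags;
}

Entry* Ptr(uintptr_t tagged) {
  return reinterpret_cast<Entry*>(tagged & ~uintptr_t(0x7));  // strip flags
}

unsigned Flags(uintptr_t tagged) {
  return static_cast<unsigned>(tagged & 0x7u);
}
```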

>+  mFireAndForgetHead = (mFireAndForgetHead + 1) & sMaxDeferredMask;

Again, I think explicitly using % would be clearer.

>+nsDNSPrefetchDeferrals::SubmitQueue()
>+    mFireAndForgetTail = (mFireAndForgetTail + 1) & sMaxDeferredMask;

And here.

>+nsDNSPrefetchDeferrals::RemoveProgressListener()

This method should just go away.

>+nsDNSPrefetchDeferrals::OnStateChange(nsIWebProgress* aWebProgress, 
>+      if (mActiveLoaderCount)
>+        mActiveLoaderCount--;

I think you should be able to assert mActiveLoaderCount > 0 here (before the decrement) instead of checking it.


>+nsDNSPrefetchDeferrals::OnLookupComplete(nsICancelable *request,

Maybe put this one before all the progress listener methods, instead of in between some of them?

>+++ b/content/html/content/src/nsHTMLDNSPrefetch.h
>+class nsDNSPrefetchDeferrals : public nsIDNSListener

This could be an inner class of nsHTMLDNSPrefetch, no?

>+  ~nsDNSPrefetchDeferrals();

virtual, please

>+++ b/netwerk/protocol/http/src/nsHttpChannel.cpp
>+    if (!(mConnectionInfo &&  mConnectionInfo->UsingHttpProxy())) {

Take out one of the two spaces after the &&?

Also, probably worth commenting that there is no point doing DNS prefetch when using a proxy, since we'd just be resolving the proxy's hostname?

r- because of the progress listener removal issue, which needs to be fixed no matter what.
(In reply to comment #5)
> Even clearer would be using an existing sane queue class, which we sadly don't
> have.  nsDeque would work if you could stuff your flags in the low bits of a
> URI pointer, but you need 3 flags.
> 
> Please file a bug to update this stuff to a templated queue class once we have
> one?  I'm pretty sure we already have a bug to create such a class, but if not
> let me know and I'll file it.

We could at some point try using the standard library.  I'm not sure exactly what conclusion we reached the last time we discussed it, but I vaguely remember it being "sure, if somebody wants to try being the first...".  We should of course keep an eye on codesize and performance numbers, etc., and make sure the mobile guys are ok with it.

> +  ~nsDNSPrefetchDeferrals();
> 
> virtual, please

Destructors of XPCOM object implementations should generally be *either* virtual or private.  Is there any reason to pick virtual here over private?  It seems like it would be better of as private and non-virtual.
Yeah, private destructor would be fine too.
Boris, David,

Thanks for the reviews and making this stronger. Most of the feedback is a slam dunk, but I want to ask a couple questions before finalizing the next spin of the patch.

> Why do we need a lock here at all?  Our OnLookupComplete can happen on random
> threads, but all other calls into this method should be on the main thread.  

good to know. The lock is probably the most expensive thing in the add path, so it would be good to get rid of.

I was under the impression wholly different DOMs might be processed on different threads - and there is only one global DNS deferral queue, thus the need for the lock. I guess I am mistaken? And the statechange notification also comes in on that thread? How would I have worked that out for myself? (a doc I missed or forgot, perhaps?)

> >+  if (((mFireAndForgetHead + 1) & sMaxDeferredMask) == mFireAndForgetTail)

> I think |(mFireAndForgetHead + 1) % sMaxDeferred| is much clearer.

How strongly do you feel about this? My version reads very naturally to me, probably because I have used that idiom a million times. I like it because I know every compiler will generate the & while I have seen the latter turn into divides even in the presence of maskable constants.

If you feel strongly it is a fish out of water wrt the other code, I'll change it. Just let me know.
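The two spellings are interchangeable whenever the size is a power of two, since sMaxDeferredMask == sMaxDeferred - 1. A quick check (the constants here are illustrative stand-ins, not lifted from the patch):

```cpp
#include <cassert>

// For a power-of-two queue size, masking and modulo compute the same
// wrap-around index. 512 mirrors the cap discussed in this bug.
constexpr unsigned sMaxDeferred = 512;
constexpr unsigned sMaxDeferredMask = sMaxDeferred - 1;  // 0x1FF

unsigned NextIndexMask(unsigned i)   { return (i + 1) & sMaxDeferredMask; }
unsigned NextIndexModulo(unsigned i) { return (i + 1) % sMaxDeferred; }
```

With the divisor a compile-time power-of-two constant, an optimizing compiler should emit the same AND instruction for both forms; the `&` spelling just makes that explicit.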

 >+nsDNSPrefetchDeferrals::OnStateChange(nsIWebProgress* aWebProgress, 
> >+      if (mActiveLoaderCount)
> >+        mActiveLoaderCount--;
> 
> I think you should be able to assert mActiveLoaderCount > 0 here (before the
> decrement) instead of checking it.

It was modeled on nsPrefetchService which has this comment
   //
    // at initialization time we might miss the first DOCUMENT START
    // notification, so we have to be careful to avoid letting our
    // stop count go negative.

That made a lot of sense to me. Do you think it isn't a valid concern?
All DOMs are always on the main thread at the moment, as is OnStateChange.

I'm not sure where, if anywhere, this is clearly documented.  :(

> How strongly do you feel about this?

Not that strongly.  I'm fine with keeping it, esp. since I hope we'll end up with a standard queue class being used here at some point.

Hmm.  Interesting point about init time...  Given that, probably better to leave that code as-is, yes.
Attached patch patch r2 (obsolete) — Splinter Review
New patch incorporates review comments. Follow-on bug regarding templated queue class to follow.
Attachment #350108 - Attachment is obsolete: true
Attachment #351004 - Flags: review?(bzbarsky)
> Please file a bug to update this stuff to a templated queue class once we have
> one?  I'm pretty sure we already have a bug to create such a class, but if not
> let me know and I'll file it.
> 

I wasn't able to find such a bug. I searched for queue, fifo, and even "linked list" and they didn't turn up a candidate.

Though https://bugzilla.mozilla.org/show_bug.cgi?id=442584 seems to agree with you :)
Comment on attachment 351004 [details] [diff] [review]
patch r2

r+sr=bzbarsky.
Attachment #351004 - Flags: superreview+
Attachment #351004 - Flags: review?(bzbarsky)
Attachment #351004 - Flags: review+
Patch r2 pushed:
http://hg.mozilla.org/mozilla-central/rev/a0c0ed9f461f
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Can someone verify that this actually made Tp3 go down on the relevant machines?  I don't seem to see anything other than a big jump on some machines yesterday there....
(In reply to comment #14)
> Can someone verify that this actually made Tp3 go down on the relevant
> machines?  I don't seem to see anything other than a big jump on some machines
> yesterday there....
I don't see any changes to the graphs.
Links to the specific pages and logs you are speaking of would be very helpful
Here are the tp3 graphs for all the machines running that test:

http://graphs.mozilla.org/#show=394872,394887,394900,395008,395020,395048,395125,395135,395166,787095,787118,787129,794379,794396,794407,911655,911694,912148,1431032,1431170,1431214,1431848,1431867&sel=1225860045,1228654546

Here's the tp3 graph for the machine I mentioned in comment 0, showing the regression and the fact that it has not gone away:

http://graphs.mozilla.org/#show=911694&sel=1226041126,1228665420

To get logs, you'd have to load the Firefox tinderbox, go back in time (using the link at the bottom) to the right day, then click the "L" link on the correct builds.  Hovering the graph should tell you the relevant build start times.  It's unfortunate that this part is so annoying.  :(

In any case, reopening, since qm-plinux-fast03 still shows a regression.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Interested in seeing if this fixes the problem I'm seeing with https://bugzilla.mozilla.org/show_bug.cgi?id=423377#c34 through #c37.
(In reply to comment #18)
> interested in seeing if this fixes the problem I'm seeing with
> https://bugzilla.mozilla.org/show_bug.cgi?id=423377#c34 through #c37

Take a look at 467562 - I put a patch in there yesterday hoping someone who can repro it would try it out.
Depends on: 467648
OS: Mac OS X → All
The innerHTML part of the Dromaeo modify test (http://dromaeo.com/?dom-modify$) has regressed, and the creation of URIs for DNS prefetching in BindToTree is showing up in profiles as the culprit. This patch moves that code from BindToTree to run from an event. Will probably help with the Tp3 regression too.
Attached patch patch r3 (obsolete) — Splinter Review
Peter - thank you for identifying the URI creation as the primary culprit here. That's a huge help.

On my desktop that innerhtml test measures about 100ms with the 1.9.1 branch, and about 79ms with the combination of r2 + your patch (c#20) applied. (r2 is currently applied on moz-central, but not 1.9.1.. your patch requires r2).

With the code in bindtotree() just commented out (i.e. disable prefetch totally) innerhtml runs in about 79ms.

I'm going to suggest patch r3 (attached, applies on top of r2) instead of the c#20 patch. Your patch builds a deferred queue, eventually run by the main event loop. R2 already introduced a deferred queue, eventually run on pageload completion.. so we end up with a deferred queue on top of a deferred queue :(

The difference is that the R2 queue deferred the DNS lookups, but it didn't defer the nsIURI creation. The R3 patch makes that change (essentially deferring nsGenericHTMLElements instead of nsIURIs).

R3 (on top of R2) performs as well as the event deferral patch on dromaeo, so hopefully we've cracked this nut.
Attachment #355437 - Flags: review?(bzbarsky)
So with that last patch, if the node is being added after the page has loaded (as in Dromaeo), we end up holding a reference to the node in the prefetch queue until some other page loads, no?  I was OK with that for URIs, because the queue size is limited, so the memory consumption is bounded.  But if we're going to keep nodes alive, that entrains all sorts of other objects, possibly for arbitrarily long periods of time.  I'd really rather we didn't do that...
hm. Well, I kinda feel like the event loop could have the opposite problem (not wait long enough), right?

Is there anything you can think of to trigger this off of?
(In reply to comment #23)
> hm. Well, I kinda feel like the event loop could have the opposite problem (not
> wait long enough), right?
> 
> Is there anything you can think of to trigger this off of?

if there isn't a good answer to that I guess we could just go a different direction and walk the tree at pageload complete, building the uris then and being done with it. It wouldn't capture changes made through js, but that's probably ok for this..
pageload complete, breaking out when the queue is full, might work, yes.  You could use an expiration tracker (set for 1s or so, say) to decide when to trigger otherwise, maybe.
Attached patch patch r3-v2 (obsolete) — Splinter Review
Updated r3 patch

Based on the first revision: normally the queue is submitted (and URIs built) on any pageload completion. This version adds a 2-second one-shot timer if an entry is added to the queue while no pageload is in progress (and no timer is currently outstanding) - the queue is processed if the timer expires before another page begins loading.
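That trigger logic can be modeled as a small state machine. This is a simplified sketch with invented names (`FlushPolicy` etc.), not the nsITimer/web-progress code in the patch: flush when the last pageload completes, or on a one-shot timer that is armed only when an entry arrives while nothing is loading.

```cpp
#include <string>
#include <vector>

// Simplified model of the r3-v2 flush policy (illustrative only; the real
// code uses nsITimer callbacks and nsIWebProgress state notifications).
class FlushPolicy {
 public:
  void OnLoadStart() { ++mActiveLoaders; }

  void OnLoadStop() {
    if (mActiveLoaders) --mActiveLoaders;
    if (mActiveLoaders == 0) Flush();  // batch runs at pageload completion
  }

  void Add(const std::string& host) {
    mQueue.push_back(host);
    // Arm a one-shot timer only if nothing is loading and no timer is
    // already outstanding, so idle-time additions still get flushed.
    if (mActiveLoaders == 0 && !mTimerArmed) mTimerArmed = true;
  }

  void OnTimerFired() {
    mTimerArmed = false;
    if (mActiveLoaders == 0) Flush();  // else the pageload stop will flush
  }

  bool TimerArmed() const { return mTimerArmed; }

  std::vector<std::string> mFlushed;  // exposed for inspection in this sketch

 private:
  void Flush() {
    for (const auto& h : mQueue) mFlushed.push_back(h);
    mQueue.clear();
  }

  std::vector<std::string> mQueue;
  unsigned mActiveLoaders = 0;
  bool mTimerArmed = false;
};
```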
Attachment #355437 - Attachment is obsolete: true
Attachment #355510 - Flags: review?(bzbarsky)
Attachment #355437 - Flags: review?(bzbarsky)
Comment on attachment 355510 [details] [diff] [review]
patch r3-v2

I should note that I suspect we'll want to back this out and figure out a different approach for this later, one that would extend to the async history stuff we want to do...

>+++ b/content/html/content/src/nsHTMLDNSPrefetch.cpp
>+nsHTMLDNSPrefetch::nsDeferrals::Tick (nsITimer *aTimer, void *aClosure)

Take out the space before '(', please.

>+  nsHTMLDNSPrefetch::nsDeferrals *self = (nsHTMLDNSPrefetch::nsDeferrals *) aClosure;

What guarantees that |self| hasn't been destroyed yet?  Should we be canceling the timer on destruction?

>+  NS_ASSERTION (self->mTimerArmed, "Timer is not armed");

Nix the space before '('.

>+  // If the queue is not submitted here beacuse there are outstanding pages being loaded,

"because"

>+++ b/content/html/content/src/nsHTMLDNSPrefetch.h
>+    static void Tick (nsITimer *aTimer, void *aClosure);

Nix space before '('.

>+      nsCOMPtr<nsGenericHTMLElement>   mElement;

nsRefPtr, please.

I'd like to see the lifetime issue addressed before marking review.
Attachment #355510 - Flags: review?(bzbarsky) → review-
Attached patch r3.v3 (obsolete) — Splinter Review
Updated with review comments from comment 27. The cancel() is indeed important; it was at one time included in the patch and I somehow lost it - thanks for the catch.

I dropped in nsRefPtr and assume it works just as nsCOMPtr does for non-COM objects? I could not find a strong definition for it. MDC calls the two "similar".

I updated the formatting bits as well. Is there any counterpart to Linux's checkpatch.pl for the Mozilla style? http://www.mjmwired.net/kernel/Documentation/SubmittingPatches
Attachment #355510 - Attachment is obsolete: true
Attachment #355765 - Flags: review?(bzbarsky)
blocking for b3 due to risk
Flags: blocking1.9.1? → blocking1.9.1+
Priority: P2 → P1
Comment on attachment 355765 [details] [diff] [review]
r3.v3

r+sr=bzbarsky.  nsRefPtr works just like nsCOMPtr for reference counting, but without the special magic for calling QI, and without the need for the object it's used on to be implementing nsISupports: it just needs to have AddRef and Release methods.  And yeah, it should be used for any refcounted non-interface pointer.

http://beaufour.dk/jst-review/ is something that someone wrote up at some point, though it doesn't seem to catch that whitespace thing.
Attachment #355765 - Flags: superreview+
Attachment #355765 - Flags: review?(bzbarsky)
Attachment #355765 - Flags: review+
So... that last patch doesn't apply on m-c.  Patrick, want to post a merged patch for checkin?  And a 1.9.1 patch too?
(In reply to comment #31)
> So... that last patch doesn't apply on m-c.  Patrick, want to post a merged
> patch for checkin?  And a 1.9.1 patch too?

Yes, I will prepare both later today
Attached patch roll up patch for 1.9.1 (obsolete) — Splinter Review
This is a roll up patch that should apply cleanly to 1.9.1. It contains:

* r2 patch - which adds a deferral queue of uri (already on moz central)

* patch from bug 467648 which fixed a leak introduced by r2 (already on moz central).. this must be included here because r3 touches the same dtor

* r3 patch - which changes the deferrals from uris to elements and adds a one-shot timer to make sure they are flushed in a reasonable amount of time
Attached patch roll up patch for moz central (obsolete) — Splinter Review
This is a roll up patch that should apply cleanly to moz-central. It contains:

* r3 patch - which changes the deferrals from uris to elements and adds a
one-shot timer to make sure they are flushed in a reasonable amount of time (this is a slightly different version than r3.3 in order to merge cleanly with moz central)

m-c already contains the r2 and 467648 patches
Pushed http://hg.mozilla.org/mozilla-central/rev/b73e063a3f99

Let's see what the numbers look like.
And backed out because of leaks.  On the bright side, Tp did come down!
when I just run firefox and exit it, I don't see any leak.

when I run runtests.py --chrome --autorun --close-when-done I do see a leak (just as reported by tinderbox)

the key difference in execution seems to be that under the chrome test nsLayoutStatics::Shutdown() (and therefore nsHTMLDNSPrefetch::Shutdown() which is called from it) is not run. nsLayoutStatics::Initialize() is run for both invocations. That results in the leak.

Any thoughts on why that is?
Hmm.  nsLayoutStatics::Shutdown won't happen until all documents are destroyed, so if you're holding element pointers it won't get called.

You probably need to register an xpcom-shutdown observer of your own to clean up the elements....
(In reply to comment #38)
> Hmm.  nsLayoutStatics::Shutdown won't happen until all documents are destroyed,
> so if you're holding element pointers it won't get called.
> 
> You probably need to register an xpcom-shutdown observer of your own to clean
> up the elements....

That was a huge time saver. Thank you.

I now get the same leak results on moz-central for --chrome --browser-chrome and --a11y. --browser-chrome does report some leaks, but they are the same with and without the patch, the others run clean.

A bit later I will post a new rollup patch for moz-central.. if that gets a green light I'll do one for 191. I'm optimistic as the tp number worked out yesterday.

fyi this is the diff against the previous patch - it just flushes the references on xpcom-shutdown as well as in the dtor. Leaving them in the dtor is redundant but feels clean and harmless. This will be included in the rollup, but it is easier to see the differences in isolation:

http://www.ducksong.com/mozilla/misc/dns-shutdown-changes.010909.txt
You don't want the RemoveObserver call in your destructor.  It'll get removed anyway, since it's a weak ref, and if you do it this way you'll potentially crash because RemoveObserver will make virtual function calls on you.
v2 of roll up patch for moz central that drops element refs on xpcom-shutdown. includes RemoveObserver adjustment from comment 40.
Attachment #355843 - Attachment is obsolete: true
(In reply to comment #40)
> You don't want the RemoveObserver call in your destructor.  It'll get removed
> anyway, since it's a weak ref, and if you do it this way you'll potentially
> crash because RemoveObserver will make virtual function calls on you.

so a weak ref is just a ptr that magically goes null when the object it points to destructs - but it doesn't add a count to that object?
That's correct.  The observer service will still hold a pointer to the nsIWeakReference even after your object dies, but do_QueryReferent will return null.  Once that happens, the observer service will drop its pointer to the nsIWeakReference.
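The semantics described here behave much like std::weak_ptr in standard C++ (an analogy only; XPCOM uses nsIWeakReference and do_QueryReferent, not anything from the standard library):

```cpp
#include <memory>

// Analogy for nsIWeakReference semantics: a weak reference does not keep
// the object alive, and promoting it (lock(), playing the role of
// do_QueryReferent) yields null once the object is gone.

bool PromotableWhileAlive() {
  auto obj = std::make_shared<int>(42);
  std::weak_ptr<int> weak = obj;   // takes no strong reference
  return weak.lock() != nullptr;   // object alive: promotion succeeds
}

bool PromotableAfterReset() {
  auto obj = std::make_shared<int>(42);
  std::weak_ptr<int> weak = obj;
  obj.reset();                     // last strong ref dropped, object dies
  return weak.lock() != nullptr;   // promotion now fails: "goes null"
}
```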
Depends on: 473089
Attachment #356256 - Flags: superreview+
Attachment #356256 - Flags: review+
Pushed the roll-up patch v2 as http://hg.mozilla.org/mozilla-central/rev/5a819cbcef1e
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → FIXED
This is the 1.9.1 version of the patch that was applied to moz central in comment 44. If the tests for that push go ok, this can be applied to 1.9.1.

the patch in 473089 is still needed in 1.9.1 on top of this.
Attachment #351004 - Attachment is obsolete: true
Attachment #355765 - Attachment is obsolete: true
Attachment #355842 - Attachment is obsolete: true
Looks like this helped Tp3 somewhat.  Not all the way back down, but close; and there have been other changes since then in other parts of the code, of course.
Pushed http://hg.mozilla.org/releases/mozilla-1.9.1/rev/049bd5ae4aad on 1.9.1 (the 1.9.1 roll-up patch).
Keywords: fixed1.9.1