Closed Bug 569506 Opened 11 years ago Closed 2 years ago

Get rid of XPCWrappedNativeTearOffChunks

Categories

(Core :: XPConnect, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED INACTIVE

People

(Reporter: mrbkap, Unassigned)

Details

(Whiteboard: [leave open])

Attachments

(8 files)

For the longest time, I thought that XPCWrappedNativeTearoffChunks were some super-important optimization that had been carefully measured. So today, I looked up XPC_WRAPPED_NATIVE_TEAROFFS_PER_CHUNK and found that it's defined to 1, making the wrapped native list a linked list with a bunch of extra indirection and confusion. We should just get rid of XPCWrappedNativeTearoffChunks and use a linked list of XPCWrappedNativeTearoffs.
I just noticed this, and went looking to see if there was a bug on it already. :-)

This would save us 4 bytes per tearoff (XPConnect tearoff, that is), which could be significant for memshrink depending on how many tearoffs are typically alive. Anyone have any idea?
Whiteboard: [Memshrink]
Wait, nevermind, I was confused. It's memory neutral.
Whiteboard: [Memshrink]
This should be fairly straightforward for someone without XPConnect knowledge - it just requires knowledge of linked lists. Tagging good-first-bug, and CCing john in case he's interested.
Whiteboard: [good first bug]
Oh what the heck. I'll just do it. Patch coming right up.
Assignee: nobody → bobbyholley+bmo
Whiteboard: [good first bug]
So I ended up getting a bit carried away with this, and did a bunch of cleanup and reorganization of tearoff lifetime management. Things seem to work mostly, but I seem to have a very subtle bug that occurs during GC:

https://tbpl.mozilla.org/?tree=Try&rev=5c5ad550b577

I'm in the process of trying to reproduce it more reliably. In the mean time, my patches are available on github:
https://github.com/bholley/mozilla-central/commits/tearoffchunk
Did another try push here with the same code to see how reproducible the failures are:
https://tbpl.mozilla.org/?tree=Try&rev=8686a13848c5
So, many days and try pushes later, I'm totally baffled by this crash. :\

The crash occurs reliably on mochitest-1 and mochitest-o, in pushes like this one:

https://tbpl.mozilla.org/?tree=Try&rev=c4e8f1d2932a

We crash in the XPConnect callback when trying to do deferred release. In particular, we appear to be calling NS_RELEASE() on a free'd pointer.

I've narrowed it down to the line in the second-to-last ("don't recycle tearoffs") patch where we remove and delete tearoffs during sweep rather than recycling them, but I can't figure out what's wrong with that.

Blake and I talked about it on IRC, and it sounds like it makes the most sense for him to review the patches and see if he notices anything wrong with them. If nothing jumps out, I can just land all of the patches except for the naughty one, and we'll still be in a much better place.
This caught one bug, though not the one I was looking for. Still good to have though.
Attachment #585493 - Flags: review?(mrbkap)
InitTearOff() is only called once, and the callsite checks IsAvailable(), which is false for non-null mInterface members. We only call SetInterface() with a non-null value at the end of this method, so we can kill a bunch of the uses that occur beforehand.
Attachment #585497 - Flags: review?(mrbkap)
This is the patch that riggers the patch. In particular, when I comment out the two added lines in SweepTearOffs that remove and delete the tearoff, the crashes go away.
Attachment #585499 - Flags: review?(mrbkap)
I'm going to skip flagging for review here until we figure out what to do about the previous patch.
Comment on attachment 585493 [details] [diff] [review]
Part 0 - Add gc-intensive test coverage for tearoffs. v1

There's been talk about removing wrapped_native.nsIInterfaceTearoff. When we do that, we'll have to change this test.
Attachment #585493 - Flags: review?(mrbkap) → review+
Comment on attachment 585494 [details] [diff] [review]
Part 1 - Fold XPCWrappedNativeTearOffChunk into XPCWrappedNativeTearOff. v1

Review of attachment 585494 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/xpconnect/src/XPCWrappedNative.cpp
@@ +2031,5 @@
> +void
> +XPCWrappedNative::ClearAllTearOffs()
> +{
> +    XPCWrappedNativeTearOff* next;
> +    while (next = mFirstTearOff.mNext) {

I think this will warn unless you wrap the assignment here in an extra set of parentheses.
Attachment #585494 - Flags: review?(mrbkap) → review+
Comment on attachment 585495 [details] [diff] [review]
Part 2 - Remove unused 'reserved' flagging for XPCWrappedNativeTearOff. v1

The SetReserved stuff isn't actually unused: it interacts very subtly with XPCWrappedNative::FindTearOff to prevent trying to use the same tearoff twice. In particular, because of double wrapped objects, the QI call directly below this can re-enter into JS, causing us to create another tearoff for this same wrapped native.
Attachment #585495 - Flags: review?(mrbkap) → review-
Comment on attachment 585494 [details] [diff] [review]
Part 1 - Fold XPCWrappedNativeTearOffChunk into XPCWrappedNativeTearOff. v1

Review of attachment 585494 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/xpconnect/src/XPCWrappedNative.cpp
@@ +915,5 @@
>  
>      mMaybeScope = nsnull;
> +
> +    // Delete any tearoff objects in our chain.
> +    ClearAllTearOffs();

I forgot to ask: why is this call necessary now when it wasn't before? In order for us to get here, isn't it necessary for GC to have run and done this for us?
Comment on attachment 585496 [details] [diff] [review]
Part 3 - Don't embed the first entry of the tearoff list in XPCWrappedNative. v2

So, I understand the motivation to make everything uniform here, but this seems like it has the potential to increase memory thrashing. All wrappers start out with at least one tearoff and that tearoff gets recycled after each GC onto to be recreated on the first use of the wrapper after the GC. With this patch we're now going to have to malloc a bunch of extra times after each GC.
(In reply to Blake Kaplan (:mrbkap) from comment #16)
> There's been talk about removing wrapped_native.nsIInterfaceTearoff. When we
> do that, we'll have to change this test.

The motivation for those has been explained to me as allowing chrome code to disambiguate identically-named methods from different interfaces. Do we just not care enough about that case to go through all the trouble of supporting it?

Also, what's the timeframe on doing that? I might be interested in making that happen, since it would simplify all this stuff significantly.

(In reply to Blake Kaplan (:mrbkap) from comment #19)
> I forgot to ask: why is this call necessary now when it wasn't before? In
> order for us to get here, isn't it necessary for GC to have run and done
> this for us?

XPCWrappedNativeTearOffChunk had a chaining destructor, so when the XPCWrappedNative was deleted, its internal XPCWrappedNativeTearOffChunk would be deleted, which would recursively delete all the other chunks. This doesn't happen now, so we need to do it explicitly.

(In reply to Blake Kaplan (:mrbkap) from comment #18)
> The SetReserved stuff isn't actually unused: it interacts very subtly with
> XPCWrappedNative::FindTearOff to prevent trying to use the same tearoff
> twice.

Ahah - Good catch! I just noticed that nobody was querying IsAvailable() and didn't notice the intricacies of the marking mechanism.

When you pointed this out, I got excited because I thought it might explain the crash. But during the crashy patch we also stop looking for available tearoffs (since we stop recycling them), in which case it doesn't matter. Darn! But yes, needs fixing.


(In reply to Blake Kaplan (:mrbkap) from comment #20)
> So, I understand the motivation to make everything uniform here, but this
> seems like it has the potential to increase memory thrashing. All wrappers
> start out with at least one tearoff and that tearoff gets recycled after
> each GC onto to be recreated on the first use of the wrapper after the GC.
> With this patch we're now going to have to malloc a bunch of extra times
> after each GC.

The general idea of these patches is to stop recycling tearoffs, which involves removing any notion of "available" tearoffs - there's either a match, or we allocate a new one.

I went down this road because I thought it might be a memory win. But I understand if that's not the road we want to go down. Also, unless I figure out that crash, the path to killing recycling is pretty blocked.

It seems to me like there's 3 options:
* Keep recycling all tearoffs like we do now.
* Stop recycling in general, but embed the first tearoff and do recycling there.
* Stop recycling entirely.

The middle one might be a good compromise, but unfortunately is the most complicated and confusing. I'm happy to scrap the anti-recycling patches and keep embedding the first tearoff if you think it's a good idea.
(In reply to Bobby Holley (:bholley) from comment #21)
> Also, what's the timeframe on doing that? I might be interested in making
> that happen, since it would simplify all this stuff significantly.

I wouldn't count on it happening soon. I think when I said "there's been talk" it was probably one sentence spoken partially in jest.

> I went down this road because I thought it might be a memory win. But I
> understand if that's not the road we want to go down. Also, unless I figure
> out that crash, the path to killing recycling is pretty blocked.

So, I don't think that anybody's measured this since 1998 or so when it was all implemented. My intuition is that the tearoff list probably doesn't grow very big and is probably relatively constant. To me, that suggests embedding the first tearoff and recycling. Does the logging code that currently exists compile/work? How hard would it be to get numbers for things like "average number of tearoffs" and "how often are tearoffs recycled/created"?
It looks like billm is doing a bunch of GC refactoring over in bug 715761, which is going to change these patches. So I'm going to put this on hold and come back to it once his stuff lands.
Pushed the tests here, since those might as well be in the tree.

https://hg.mozilla.org/integration/mozilla-inbound/rev/267b18a4f969
Whiteboard: [leave open]
Summary: Get rid of XPCWrappedNativeTearoffChunks → Get rid of XPCWrappedNativeTearOffChunks
Assignee: bobbyholley → nobody
Per policy at https://wiki.mozilla.org/Bug_Triage/Projects/Bug_Handling/Bug_Husbandry#Inactive_Bugs. If this bug is not an enhancement request or a bug not present in a supported release of Firefox, then it may be reopened.
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → INACTIVE
You need to log in before you can comment on or make changes to this bug.