Closed Bug 1131317 Opened 9 years ago Closed 9 years ago

[e10s] Add-on shims are breaking weak referenced observers

Categories

(Toolkit :: General, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla38
Tracking Status
e10s m6+ ---
firefox38 --- fixed

People

(Reporter: mossop, Assigned: billm)

References

Details

Attachments

(4 files, 1 obsolete file)

Attached patch testcaseSplinter Review
If an add-on registers an observer and tells the service to hold a weak reference GC will drop the observer even if the observer is still referenced elsewhere.

STR (in add-on code):

1. Define an observer that implements nsISupportsWeakReference and keep a string reference to it
2. Register with the observer service and tell it to keep a weak reference
3. Wait for GC
4. Try calling notifying the observer
5. Try unregistering from the observer service.

When add-on shims are turned on 4 doesn't notify the observer service and 5 throws an exception. The GC has caused the observer service to drop the observer. The attached testcase demonstrates the bug.

My wild guess is that the shimmed observer service is somehow getting a wrapped version of the actual observer and nothing is keeping the wrapper alive.

Any idea what is going on here Bill?
Flags: needinfo?(wmccloskey)
Blocks: 1094392
Still not sure why this is only happening with interposition, but one problem is that the observer in the test doesn't QI to nsISupports. When I fix that, the test passes even with interposition. I'll keep looking into this though.
Flags: needinfo?(wmccloskey)
Attached patch fix-wrapped-jsSplinter Review
Thanks for the test Dave.

I guess the reason we don't get into trouble without interposition is that there are no cross-compartment wrappers involved. However, if I turn off interposition and pass the observer into a different compartment (a sandbox) and do the addObserver in that compartment, then I see the same broken behavior. So the problem isn't really related to interposition or shims.

The problem is that, for some reason, XPCWrappedJS doesn't unwrap the object if it doesn't QI to nsISupports. I don't know this code well, but it seems like maybe we could just fix that.

I do think we should fix this. It probably wasn't very common that this would happen without interposition. Now it always happens.

I pushed the patch to try, so hopefully we'll see if we're relying on the old behavior somehow.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=73d24acc336e
Assignee: nobody → wmccloskey
Status: NEW → ASSIGNED
Attachment #8562271 - Flags: review?(mrbkap)
Blocks: 1131741
Attached patch low-device-storage-test-fix (obsolete) — Splinter Review
The main patch in this bug exposes a bug in this test. Up until now, removing this observer was silently failing, which was good. With the patch we actually remove the observer, which causes us to miss the STATE_FINISHED notification.
Attachment #8562428 - Flags: review?(honzab.moz)
Attached patch add-qiSplinter Review
I have to admit I haven't totally tracked down why this code was working before. This fix is definitely correct though.

In the test failure I'm seeing, we're trying register a factory that's really a double-wrapped JS object. When we call registerFactory, we try to QI this double-wrapped thing to an nsIFactory. That goes to this code:
http://mxr.mozilla.org/mozilla-central/source/xpcom/reflect/xptcall/xptcall.cpp#19

With my patch, the comparison fails and we actually call QI on the JS object. Since the JS object doesn't define a QI method, that fails.

Without my patch, the comparison succeeds and so we just allow this object to be passed into registerFactory. The only reason I can think of for the difference is that my patch causes us to use one single XPCWrappedJS around this factory, whereas before we had multiple XPCWrappedJS wrappers, depending on which CCW you ended up with. My understanding is that mEntry->IID() is just the IID that the WrappedJS was first created with. So maybe we were getting lucky before and now we're not.
Attachment #8562432 - Flags: review?(mrbkap)
Here's a new try push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=490d24e4a37b

Not sure if we should actually do this. It clearly has the potential to break some add-ons. But at least it's breakage we can explain, rather than weird GC-triggered nonsense.
Unfortunately this is leaking. Andrew, what do you think I should try next? Is there an easy way to get a CC log off try server, or should I try to reproduce locally?
Flags: needinfo?(continuation)
(In reply to Bill McCloskey (:billm) from comment #6)
> Unfortunately this is leaking. Andrew, what do you think I should try next?
> Is there an easy way to get a CC log off try server, or should I try to
> reproduce locally?

Unfortunately there is not, so I'd just try to reproduce it locally.  I'd look over the list of "Leaked URLs" in the log and see if anything looks related to particular tests, then just run those directories and see if it leaks.  I see dom/tests/mochitest/ajax/offline/test_lowDeviceStorage.html in the list, and you modified that test, so that's a good place to start.
Flags: needinfo?(continuation)
I guess we just need to remove the observer at the right time. Thanks Andrew.
Attachment #8562428 - Attachment is obsolete: true
Attachment #8562428 - Flags: review?(honzab.moz)
Attachment #8563006 - Flags: review?(honzab.moz)
Comment on attachment 8562271 [details] [diff] [review]
fix-wrapped-js

Review of attachment 8562271 [details] [diff] [review]:
-----------------------------------------------------------------

I told billm on IRC that the original code here made no sense. This is clearly what was intended.
Attachment #8562271 - Flags: review?(mrbkap) → review+
Attachment #8562432 - Flags: review?(mrbkap) → review+
Comment on attachment 8563006 [details] [diff] [review]
low-device-storage-test-fix v2

Can you review this Blake?
Attachment #8563006 - Flags: review?(honzab.moz) → review?(mrbkap)
Attachment #8563006 - Flags: review?(mrbkap) → review+
https://hg.mozilla.org/mozilla-central/rev/e34594984392
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
You need to log in before you can comment on or make changes to this bug.