Open Bug 533290 Opened 10 years ago Updated 2 years ago

extApplication.js, _prefs object unexpectedly garbaged collected

Categories

(Core :: XPConnect, defect, major)

x86
Linux
defect
Not set
major

Tracking

()

Tracking Status
blocking2.0 --- -
blocking1.9.2 --- -
status1.9.2 --- wanted

People

(Reporter: neil, Assigned: neil)

References

()

Details

After landing attachment 412810 [details] [diff] [review] to bug 152526, one of our tests unexpectedly failed. Unfortunately I don't have a machine with both tests and debugging, so so far I have been using printf debugging, and I have found the following:
* The extApplication object successfully creates its _prefs object, which registers itself as a weak observer with the root preference branch
* At some point later when a test changes a pref, the prefapi callback notices that the weak reference has somehow gone (NS_ERROR_NULL_POINTER) although the _prefs object is only ever cleared on xpcom shutdown
* Changing the registration to a strong observer makes the problem go away
OK, so in another build I was able to breakpoint on the call that adds the weak pref observer, and by the time I had stepped through to find and set a watch on the weak observer's referent, it was time for cycle collection to fire and my referent got destroyed almost immediately.

Of course the Application._prefs variable is alive and well and quite at a loss to understand why the weak reference thinks it went away.
This seems to be happening on Linux and Mac every time and is intermittent on Windows, at least for SeaMonkey2.0 (1.9.1-based).
In any case, this is annoying, any idea what could be going on there?
Hum, let's say browser_ApplicationPrefs.js failure is bug 534647,
and this bug is to sort out the extApplication.js issue.
Severity: normal → major
Summary: TEST-UNEXPECTED-FAIL | chrome://mochikit/content/browser/suite/smile/test/browser_ApplicationPrefs.js | Timed out → extApplication.js, _prefs object unexpectedly garbaged collected
Depends on: 533355
Seems to be a dupe (or maybe vice versa would be better) of bug 488587
(In reply to comment #4)
> Seems to be a dupe (or maybe vice versa would be better) of bug 488587

It does look like so: yet let's first see what happens when bug 533355 is fixed...
Neil, did bug 533355 check-in fix this bug, on trunk?
blocking2.0: --- → ?
status1.9.1: --- → ?
Flags: blocking1.9.2?
Clearing nomination flag as no blocking rationale was given. Please renominate with a reason as to why, and whether you think this is a hard release blocker (bring your "A game"!) or should be included in a stability and security update on the 1.9.2 branch.
blocking1.9.2: --- → ?
Flags: blocking1.9.2?
(In reply to comment #7)

Ftb, my concern is bug 534647 on *-1.9.1 (at least).
Fixing this bug on m-1.9.2 (on its way to m-1.9.1) could happen in an update, now :-|
blocking1.9.2: ? → -
Flags: in-testsuite?
Neil, I'm marking this a blocker but giving it to you to do more debugging here. If you're unable to do so, please provide exact steps to reproduce this problem and we'll see if we can find someone else to investigate.

My only thought so far is that what the pref service ends up holding a weak reference to is a wrapper of some sort, which ends up being GCd later on, even if the underlying object is alive and well.
Assignee: nobody → neil
blocking2.0: ? → beta1
(In reply to comment #9)
> My only thought so far is that what the pref service ends up holding a weak
> reference to is a wrapper of some sort, which ends up being GCd later on, even
> if the underlying object is alive and well.
Spot on. It's an nsXPTCStubBase if that helps. Its outer is an nsXPCWrappedJS, whose class is an nsXPCWrappedJSClass.

As for steps to reproduce problem:
1. Start the application under a debugger. (As far as I can see, this problem should occur with Thunderbird, Firefox or SeaMonkey so take your pick.)
2. In the debugger, set a breakpoint on nsPrefService::AddObserver; if you can, make it break only when aDomain is the null string. Otherwise you may want to add the breakpoint as late as possible (just before you evaluate).
3. Open the Error Console, and evaluate Application.prefs
4. Back in the debugger, set a watchpoint on the nsXPCWrappedJS's refcount. (Oh, and probably also disable the AddObserver breakpoint.)
5. Continue execution and watch as the nsXPTCStubBase dies at the next GC (probably after closing the slow script window, if my experience of debugging JS in the C++ debugger is anything to go by.)
6. Evaluate Application.prefs again, just to prove it still exists.
I am also seeing this problem, very often, with a 3.6 build. However it's not the same object. Instead it is a nsIWebProgressListener added to the individual <browser> elements of the tabbrowser. Steps followed were the same as specified by neil.

-- Break in nsDocLoader::AddProgressListener and add a watch point (Data break point) on the mProxy.

My extension adds a JS listener. I am logging every call to my onStateChange. Until the watch point is hit I see appropriate calls to onStateChange and after that there are none.

As stated earlier I encounter this bug very often and any help to get this fixed is greatly appreciated.
Blocks: 458688
blocking2.0: beta1+ → beta2+
I'm seeing browser_ApplicationPrefs.js intermittently failing when the Test Pilot extension is installed, some basic logging suggests that it is due to this (or bug 488587)
Blocks: 573079
(In reply to comment #12)
> I'm seeing browser_ApplicationPrefs.js intermittently failing when the Test
> Pilot extension is installed, some basic logging suggests that it is due to
> this (or bug 488587)

I've disabled that part of browser_ApplicationPrefs.js until this is resolved.
No longer blocks: 573079
(In reply to comment #9)
> Neil, I'm marking this a blocker but giving it to you to do more debugging
> here. If you're unable to do so, please provide exact steps to reproduce this
> problem and we'll see if we can find someone else to investigate.
> 
> My only thought so far is that what the pref service ends up holding a weak
> reference to is a wrapper of some sort, which ends up being GCd later on, even
> if the underlying object is alive and well.

(In reply to comment #11)
> I am also seeing this problem, very often, with a 3.6 build. However it's not
> the same object. Instead it is a nsIWebProgressListener added to the individual
> <browser> elements of the tabbrowser. Steps followed were the same as specified
> by neil.
> 
> -- Break in nsDocLoader::AddProgressListener and add a watch point (Data break
> point) on the mProxy.
> 
> My extension adds a JS listener. I am logging every call to my onStateChange.
> Until the watch point is hit I see appropriate calls to onStateChange and after
> that there are none.
> 
> As stated earlier I encounter this bug very often and any help to get this
> fixed is greatly appreciated.

so, is fix from bug 533355 not highly desirable in 1.9.2.x?
No, that's not related to this bug. This bug is that the weak references become stale when they shouldn't.
This *might* be related to bug 578392, which got fixed last week. Can people who have been able to reproduce this before check if it still happens after that change?

Blake, any thoughts on whether it really is related, or how to help debug what's going wrong here?
blocking2.0: beta2+ → betaN+
(In reply to comment #16)
> This *might* be related to bug 578392, which got fixed last week. Can people
> who have been able to reproduce this before check if it still happens after
> that change?
Still happens in a build made 35 hours ago.

> Blake, any thoughts on whether it really is related, or how to help debug
> what's going wrong here?
Seems unrelated; that bug is about being unable to remove observers.
In fact, given this bug, what I would expect to happen for bug 578392 is that the weak reference dies and gets removed automatically the next time someone changes the pref in question (which is practically never...)
This bug seems to have been fixed some time between

http://hg.mozilla.org/mozilla-central/rev/582be9aca672 (has the bug)

http://hg.mozilla.org/mozilla-central/rev/ff1ecd74f827 (does not have the bug)
This does not seem to be fixed in a similar case that I have.
Also, I am now seeing this problem even with pref service having strong references to my observer!
It turns out I just didn't have the correct expectation.  If I add an observer to a pref branch, the pref branch *can get garbage collected*!  Then my observer is never notified.
However, I think the PrefCallback still should hold a strong reference to the prefbranch.  Or at least, it should hold a WeakRef and notice when the PrefBranch goes away.

Follow along at home in bug 577950.
Not blocking.

Can anyone clarify if this is still a problem? Comments make it sound like the various issues here are fixed.
blocking2.0: betaN+ → -
Unless something exceptionally weird is going on here, this should be fixed by my fix for bug 750454.
(In reply to Justin Lebar (not reading bugmail) from comment #25)
> Unless something exceptionally weird is going on here, this should be fixed
> by my fix for bug 750454.

Neil, do you agree?
Flags: needinfo?(neil)
I don't see how bug 750454 affects this; the observer is still a weak observer.

I suppose I should test to see whether this has gone away anyway in the mean time.
Flags: needinfo?(neil)
I'm going to go out on a limb and suggest that this was fixed by bug 944492.
That doesn't seem too likely, but XPCJS is magical so who knows.
Actually if comment #19 is accurate this was fixed in Gecko 4; I tried and failed to reproduce it in Gecko 5.
You need to log in before you can comment on or make changes to this bug.