Closed Bug 985332 Opened 11 years ago Closed 11 years ago

"TypeError: can't access dead object" when disabling and re-enabling an add-on

Categories

(Add-on SDK Graveyard :: General, defect, P2)

x86
macOS
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: zer0, Assigned: zer0)

References

Details

Attachments

(1 file)

In add-on that uses module with iframe, that are listening for "DOMContentLoaded", we disabling them and renabling them raises the following exception: System JS : ERROR resource://gre/modules/addons/XPIProvider.jsm -> jar:file:///var/folders/df/320ts25d4y5_nd8b0l97pyvh0000gn/T/tmp8wHlbL.mozrunner/extensions/jid1-TUVKU29L74r7CQ@jetpack.xpi!/bootstrap.js -> resource://extensions.modules.jid1-tuvku29l74r7cq-at-jetpack.commonjs.path/toolkit/loader.js -> resource://extensions.modules.jid1-tuvku29l74r7cq-at-jetpack.commonjs.path/sdk/system/events.js:50 - TypeError: can't access dead object The error happens during the "DOMContentLoaded", and the xpcom.js's `Subject` wrapper is not able to find the interface of the subject emitted – probably because the dead object involved? How to reproduce: The easiest way to test is create an add-on with this main.js: require('sdk/panel').Panel(); Then install the add-on (or execute `cfx run`), going to `about:addons` and disabling / renabling the add-on. Sometimes it doesn't happens the first time, but it will. Notice that the Panel is just created. It's not used or displayed. If the panel is also displayed it seems that the exception happens more often. A similar error happens in PageWorker too, but it seems on content's side instead of chrome.
Attached file test-dead.xpi
The simple add-on described in the first comment. It seems that with this code, the first time is disabled / enabled, the error never happens, but it happens from the 2nd time, all the times.
Can it be that when we disable add-on we don't kill panels? This type of error can happen it two cases: 1., a sandbox (content-script, module) set up a listener on a window, we nuke the sandbox (disable add-on) and when the event is fired, the listener will find a dead object wrapper instead of the sandbox. 2., a window is closed, and killed, but some code keeps a reference on it, and trying to use it after the window is already dead. I think this is case 1, and we should just make sure that we clean all the listeners on and from panels/workers.
Priority: P2 → --
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #2) > Can it be that when we disable add-on we don't kill panels? This type of > error can happen it two cases: > 1., a sandbox (content-script, module) set up a listener on a window, we > nuke the sandbox (disable add-on) and when the event is fired, the listener > will find a dead object wrapper instead of the sandbox. > 2., a window is closed, and killed, but some code keeps a reference on it, > and trying to use it after the window is already dead. > > I think this is case 1, and we should just make sure that we clean all the > listeners on and from panels/workers. It was my precise thought, however it seems that any listener I checked is properly removed and don't rely on a weak reference. Also, why in that case the first time we enable/disable the add-on, the exception is not raised, but only from the second time? Do you have any thought about that, gabor? I will check deeper, maybe the sandbox is nuked before the listener is removed somehow.
Debugging with the Browser Toolbox – devtools guy, I love you – it seems that actually the listener added to the panel is never removed. I'm not sure why this is happening only the second time, but I will check. So it seems that the issue is specifically to the panel and not the iframe in general, if it's that the case we probably have the same issue around other modules because we got the same exception. I'll take the bug, at least for further investigation.
Assignee: nobody → zer0
Okay, so: if I brutally remove all the panels listeners, I do not get anymore: "System JS : ERROR resource://gre/modules/addons/XPIProvider.jsm -> jar:file:///var/folders/tx/91xwh51562l5_4xcqjst9bvw0000gn/T/tmp5gNf0x.mozrunner/extensions/jid1-V1kRlbNFgAbB0g@jetpack.xpi!/bootstrap.js -> resource://extensions.modules.jid1-v1krlbnfgabb0g-at-jetpack.commonjs.path/toolkit/loader.js -> resource://extensions.modules.jid1-v1krlbnfgabb0g-at-jetpack.commonjs.path/sdk/system/events.js:50 - TypeError: can't access dead object". However, I still have `JavaScript error: , line 0: can't access dead object` multiple times, and the debugger doesn't show me any line directly related to SDK, only to sessionStore.js, before the dead object error: "chrome://browser/content/content-sessionStore.js:230 - TypeError: docShell.QueryInterface(...).sessionHistory is null" However, if I do not include panel in my test add-on, such error doesn't happens. Gabor, maybe you have some hint about that?
Flags: needinfo?(gkrizsanits)
(In reply to Matteo Ferretti [:matteo] [:zer0] from comment #3) > properly removed and don't rely on a weak reference. Also, why in that case > the first time we enable/disable the add-on, the exception is not raised, > but only from the second time? Do you have any thought about that, gabor? I have no clue, but given the messiness of the panel code I'm not even surprised, on anything any longer... > However, I still have `JavaScript error: , line 0: can't access dead object` > multiple times, and the debugger doesn't show me any line directly related > to SDK, only to sessionStore.js, before the dead object error: I don't know. It can be that we add some listener function to some native objects. And then native code is trying to call the SDK defined function but at that point, it is already replaced by a dead object wrapper, so it cannot tell what that function was... > > "chrome://browser/content/content-sessionStore.js:230 - TypeError: > docShell.QueryInterface(...).sessionHistory is null" > > However, if I do not include panel in my test add-on, such error doesn't > happens. Gabor, maybe you have some hint about that? I have no idea how can sessionHistory be null here and why :(
Flags: needinfo?(gkrizsanits)
After a bit of investigation, here the culprit: https://github.com/mozilla/addon-sdk/blob/master/lib/sdk/panel/events.js#L22-L24 As supposed, it relies on weak reference to remove the listeners, so sometimes the listeners are not gc'ed fast enough to be released, and becomes dead objects. Bug 807382 should takes care of the issue, so I'm not going to add any specific code here. So, it seems that the PageWorker issue is unrelated to the panel ones, even if the exception is the same. I will test it again both of them once bug 807382 is fixed.
Depends on: 807382
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #6) > > "chrome://browser/content/content-sessionStore.js:230 - TypeError: > > docShell.QueryInterface(...).sessionHistory is null" > > > > However, if I do not include panel in my test add-on, such error doesn't > > happens. Gabor, maybe you have some hint about that? > > I have no idea how can sessionHistory be null here and why :( Once the real culprit was fixed, that error seems disappeared too, so it was just misleading. Thanks anyway!
Matteo, is this resolved, then? Can you close the bug if that's the case?
Flags: needinfo?(zer0)
It should be fixed once the bug 807382 will be fixed – it seems the patch is not landed yet. Once bug 807382 is fixed and laded, I will retest this bug again to double check that the exception is not there anymore.
Flags: needinfo?(zer0)
No longer blocks: 987159
Since bug 807382 is fixed, I can't see this exception anymore with the code provided.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
I still get this failure after uninstalling an add-on in a browser-chrome test (the failures appear in the following tests from the run, but are caused by the add-on uninstall).
(In reply to Mihaela Velimiroviciu [QA] (:mihaelav) from comment #12) > I still get this failure after uninstalling an add-on in a browser-chrome > test (the failures appear in the following tests from the run, but are > caused by the add-on uninstall). After our chat in IRC I did some test and it seems it's an unrelated bug – the exception is the same, but the reason and the "modus operandi" is a bit different. I filed bug
No longer blocks: 973641
Sorry, part of the comment was stripped out. I filed bug 1001833, and made it blocking bug 973641. I was able to reproduce systematically the issue; I'm working on fix it now.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: