Closed Bug 1789227 Opened 2 years ago Closed 2 years ago

Intermittent browser/base/content/test/pageinfo/browser_pageinfo_separate_private.js | single tracking bug

Categories

(Firefox :: Page Info Window, defect, P5)

defect

Tracking

()

RESOLVED FIXED
109 Branch
Tracking Status
firefox109 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: florian)

References

Details

(Keywords: intermittent-failure, intermittent-testcase, Whiteboard: [stockwell unknown])

Attachments

(1 file)

Filed by: ccozmuta [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=389483199&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Td8V-2S1QiCj-XT51QwPhw/runs/0/artifacts/public/logs/live_backing.log


[task 2022-09-05T08:42:58.666Z] 08:42:58     INFO - GECKO(2254) | [RDD 2265, Main Thread] WARNING: NS_ENSURE_TRUE(Preferences::InitStaticMembers()) failed: file /builds/worker/checkouts/gecko/modules/libpref/Preferences.cpp:4604
[task 2022-09-05T08:42:58.670Z] 08:42:58     INFO - GECKO(2254) | [Parent 2254, Compositor] WARNING: IPC Connection Error: [Parent][PImageBridgeParent] RunMessage(msgname=PImageBridge::Msg_WillClose) Channel closing: too late to send/recv, messages will be lost: file /builds/worker/checkouts/gecko/ipc/glue/MessageChannel.cpp:1882
[task 2022-09-05T08:42:59.389Z] 08:42:59     INFO - GECKO(2254) | [Parent 2254: Main Thread]: I/DocShellAndDOMWindowLeak --DOMWINDOW == 11 (1433add40) [pid = 2254] [serial = 54] [outer = 0] [url = chrome://browser/content/pageinfo/pageInfo.xhtml]
[task 2022-09-05T08:42:59.393Z] 08:42:59     INFO - GECKO(2254) | [Parent 2254: Main Thread]: I/DocShellAndDOMWindowLeak --DOMWINDOW == 10 (16fc5c000) [pid = 2254] [serial = 55] [outer = 0] [url = about:blank]
[task 2022-09-05T08:42:59.393Z] 08:42:59     INFO - GECKO(2254) | [Parent 2254: Main Thread]: I/DocShellAndDOMWindowLeak --DOMWINDOW == 9 (1358cb000) [pid = 2254] [serial = 4] [outer = 0] [url = about:blank]
[task 2022-09-05T08:42:59.394Z] 08:42:59     INFO - GECKO(2254) | [Parent 2254: Main Thread]: I/DocShellAndDOMWindowLeak --DOMWINDOW == 8 (10b821870) [pid = 2254] [serial = 3] [outer = 0] [url = chrome://browser/content/browser.xhtml]
[task 2022-09-05T08:42:59.394Z] 08:42:59     INFO - GECKO(2254) | [Parent 2254: Main Thread]: I/DocShellAndDOMWindowLeak --DOMWINDOW == 7 (10b822c60) [pid = 2254] [serial = 10] [outer = 0] [url = chrome://mochikit/content/browser-harness.xhtml]
[task 2022-09-05T08:42:59.395Z] 08:42:59     INFO - GECKO(2254) | [Parent 2254: Main Thread]: I/DocShellAndDOMWindowLeak --DOMWINDOW == 6 (1426cdc00) [pid = 2254] [serial = 11] [outer = 0] [url = about:blank]
[task 2022-09-05T08:42:59.395Z] 08:42:59     INFO - GECKO(2254) | [Parent 2254: Main Thread]: I/DocShellAndDOMWindowLeak --DOMWINDOW == 5 (133faa400) [pid = 2254] [serial = 65] [outer = 0] [url = about:blank]
[task 2022-09-05T08:42:59.396Z] 08:42:59     INFO - GECKO(2254) | [Parent 2254: Main Thread]: I/DocShellAndDOMWindowLeak --DOMWINDOW == 4 (10b81f090) [pid = 2254] [serial = 5] [outer = 0] [url = chrome://extensions/content/dummy.xhtml]
[task 2022-09-05T08:42:59.396Z] 08:42:59     INFO - GECKO(2254) | [Parent 2254: Main Thread]: I/DocShellAndDOMWindowLeak --DOMWINDOW == 3 (13940c800) [pid = 2254] [serial = 7] [outer = 0] [url = chrome://extensions/content/dummy.xhtml]
[task 2022-09-05T08:42:59.397Z] 08:42:59     INFO - GECKO(2254) | [Parent 2254: Main Thread]: I/DocShellAndDOMWindowLeak --DOMWINDOW == 2 (1358c6c00) [pid = 2254] [serial = 2] [outer = 0] [url = about:blank]
[task 2022-09-05T08:42:59.397Z] 08:42:59     INFO - GECKO(2254) | [Parent 2254: Main Thread]: I/DocShellAndDOMWindowLeak --DOMWINDOW == 1 (10b821c10) [pid = 2254] [serial = 1] [outer = 0] [url = chrome://browser/content/hiddenWindowMac.xhtml]
[task 2022-09-05T08:42:59.397Z] 08:42:59     INFO - GECKO(2254) | [Parent 2254: Main Thread]: I/DocShellAndDOMWindowLeak --DOMWINDOW == 0 (10b821de0) [pid = 2254] [serial = 8] [outer = 0] [url = about:blank]
[task 2022-09-05T08:42:59.415Z] 08:42:59     INFO - GECKO(2254) | [Parent 2254, Main Thread] WARNING: Extra shutdown CC: 'i < NORMAL_SHUTDOWN_COLLECTIONS', file /builds/worker/checkouts/gecko/xpcom/base/nsCycleCollector.cpp:3359
[task 2022-09-05T08:42:59.430Z] 08:42:59     INFO - GECKO(2254) | [Parent 2254, Main Thread] WARNING: NS_ENSURE_TRUE(InitStaticMembers()) failed: file /builds/worker/workspace/obj-build/dist/include/mozilla/Preferences.h:129
[task 2022-09-05T08:42:59.450Z] 08:42:59     INFO - GECKO(2254) | [Parent 2254, Main Thread] WARNING: NS_ENSURE_TRUE(Preferences::InitStaticMembers()) failed: file /builds/worker/checkouts/gecko/modules/libpref/Preferences.cpp:4604
[task 2022-09-05T08:42:59.471Z] 08:42:59     INFO - TEST-INFO | Main app process: exit 0
[task 2022-09-05T08:42:59.471Z] 08:42:59     INFO - TEST-INFO | Confirming we saw 79 DOCSHELL created and 79 destroyed log strings.
[task 2022-09-05T08:42:59.472Z] 08:42:59     INFO - TEST-INFO | Confirming we saw 197 DOMWINDOW created and 197 destroyed log strings.
[task 2022-09-05T08:42:59.472Z] 08:42:59    ERROR - TEST-UNEXPECTED-FAIL | browser/base/content/test/pageinfo/browser_pageinfo_separate_private.js | leaked 1 window(s) until shutdown [url = chrome://browser/content/pageinfo/pageInfo.xhtml]
[task 2022-09-05T08:42:59.472Z] 08:42:59    ERROR - TEST-UNEXPECTED-FAIL | browser/base/content/test/pageinfo/browser_pageinfo_separate_private.js | leaked 1 window(s) until shutdown [url = about:blank]
[task 2022-09-05T08:42:59.473Z] 08:42:59     INFO - TEST-INFO | browser/base/content/test/pageinfo/browser_pageinfo_separate_private.js | windows(s) leaked: [pid = 2254] [serial = 54], [pid = 2254] [serial = 55]
[task 2022-09-05T08:42:59.473Z] 08:42:59     INFO - TEST-INFO | browser/base/content/test/pageinfo/browser_pageinfo_separate_private.js | This test created 0 hidden window(s)
[task 2022-09-05T08:42:59.474Z] 08:42:59     INFO - TEST-INFO | browser/base/content/test/pageinfo/browser_pageinfo_separate_private.js | This test created 1 hidden docshell(s)
[task 2022-09-05T08:42:59.474Z] 08:42:59     INFO - TEST-INFO | browser/base/content/test/pageinfo/browser_pageinfo_firstPartyIsolation.js | This test created 0 hidden window(s)
[task 2022-09-05T08:42:59.475Z] 08:42:59     INFO - TEST-INFO | browser/base/content/test/pageinfo/browser_pageinfo_firstPartyIsolation.js | This test created 1 hidden docshell(s)
[task 2022-09-05T08:42:59.475Z] 08:42:59     INFO - TEST-INFO | browser/base/content/test/pageinfo/browser_pageinfo_iframe_media.js | This test created 0 hidden window(s)
[task 2022-09-05T08:42:59.476Z] 08:42:59     INFO - TEST-INFO | browser/base/content/test/pageinfo/browser_pageinfo_iframe_media.js | This test created 1 hidden docshell(s)
[task 2022-09-05T08:42:59.476Z] 08:42:59     INFO - TEST-INFO | browser/base/content/test/pageinfo/browser_pageinfo_image_info.js | This test created 0 hidden window(s)
[task 2022-09-05T08:42:59.476Z] 08:42:59     INFO - TEST-INFO | browser/base/content/test/pageinfo/browser_pageinfo_image_info.js | This test created 1 hidden docshell(s)
[task 2022-09-05T08:42:59.477Z] 08:42:59     INFO - TEST-INFO | browser/base/content/test/pageinfo/browser_pageinfo_images.js | This test created 0 hidden window(s)
[task 2022-09-05T08:42:59.477Z] 08:42:59     INFO - TEST-INFO | browser/base/content/test/pageinfo/browser_pageinfo_images.js | This test created 1 hidden docshell(s)
[task 2022-09-05T08:42:59.477Z] 08:42:59     INFO - TEST-INFO | browser/base/content/test/pageinfo/browser_pageinfo_permissions.js | This test created 0 hidden window(s)
[task 2022-09-05T08:42:59.478Z] 08:42:59     INFO - TEST-INFO | browser/base/content/test/pageinfo/browser_pageinfo_permissions.js | This test created 1 hidden docshell(s)
[task 2022-09-05T08:42:59.478Z] 08:42:59     INFO - TEST-INFO | browser/base/content/test/pageinfo/browser_pageinfo_rtl.js | This test created 0 hidden window(s)
[task 2022-09-05T08:42:59.479Z] 08:42:59     INFO - TEST-INFO | browser/base/content/test/pageinfo/browser_pageinfo_rtl.js | This test created 1 hidden docshell(s)
[task 2022-09-05T08:42:59.479Z] 08:42:59     INFO - TEST-INFO | browser/base/content/test/pageinfo/browser_pageinfo_security.js | This test created 0 hidden window(s)
[task 2022-09-05T08:42:59.479Z] 08:42:59     INFO - TEST-INFO | browser/base/content/test/pageinfo/browser_pageinfo_security.js | This test created 1 hidden docshell(s)
[task 2022-09-05T08:42:59.480Z] 08:42:59     INFO - TEST-INFO | browser/base/content/test/pageinfo/browser_pageinfo_svg_image.js | This test created 0 hidden window(s)
[task 2022-09-05T08:42:59.480Z] 08:42:59     INFO - TEST-INFO | browser/base/content/test/pageinfo/browser_pageinfo_svg_image.js | This test created 1 hidden docshell(s)
[task 2022-09-05T08:42:59.481Z] 08:42:59     INFO - runtests.py | Application ran for: 0:00:41.657818
[task 2022-09-05T08:42:59.481Z] 08:42:59     INFO - zombiecheck | Reading PID log: /var/folders/4j/yrl637111cxgs124qj5z2q08000014/T/tmprhvx2ivfpidlog
[task 2022-09-05T08:42:59.481Z] 08:42:59     INFO - ==> process 2254 launched child process 2255
[task 2022-09-05T08:42:59.482Z] 08:42:59     INFO - ==> process 2254 launched child process 2256
[task 2022-09-05T08:42:59.482Z] 08:42:59     INFO - ==> process 2254 launched child process 2261
[task 2022-09-05T08:42:59.482Z] 08:42:59     INFO - ==> process 2254 launched child process 2262
[task 2022-09-05T08:42:59.483Z] 08:42:59     INFO - ==> process 2254 launched child process 2263
[task 2022-09-05T08:42:59.483Z] 08:42:59     INFO - ==> process 2254 launched child process 2264
[task 2022-09-05T08:42:59.484Z] 08:42:59     INFO - ==> process 2254 launched child process 2265
[task 2022-09-05T08:42:59.484Z] 08:42:59     INFO - ==> process 2254 launched child process 2266
[task 2022-09-05T08:42:59.484Z] 08:42:59     INFO - ==> process 2254 launched child process 2267
[task 2022-09-05T08:42:59.485Z] 08:42:59     INFO - ==> process 2254 launched child process 2268
[task 2022-09-05T08:42:59.485Z] 08:42:59     INFO - ==> process 2254 launched child process 2269
[task 2022-09-05T08:42:59.485Z] 08:42:59     INFO - ==> process 2254 launched child process 2270

Update:

There have been 38 failures within the last 7 days:

  • 6 failures on Windows 10 x64 2004 WebRender debug
  • 4 failures on Windows 10 x64 2004 asan WebRender opt
  • 6 failures on OS X 10.15 WebRender debug
  • 9 failures on Linux 18.04 x64 WebRender tsan opt
  • 5 failures on Linux 18.04 x64 WebRender Shippable opt
  • 7 failures on Linux 18.04 x64 WebRender debug/opt
  • 1 failure on Linux 18.04 x64 WebRender asan op

Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=398553825&repo=autoland&lineNumber=6985

[task 2022-12-03T13:24:28.285Z] 13:24:28     INFO - TEST-PASS | browser/base/content/test/pageinfo/browser_pageinfo_separate_private.js | non-private window opened private page info window - false === false - 
[task 2022-12-03T13:24:28.285Z] 13:24:28     INFO - Buffered messages logged at 13:24:26
[task 2022-12-03T13:24:28.286Z] 13:24:28     INFO - Console message: [JavaScript Warning: "This page is in Quirks Mode. Page layout may be impacted. For Standards Mode use “<!DOCTYPE html>”." {file: "https://example.com/" line: 0}]
[task 2022-12-03T13:24:28.286Z] 13:24:28     INFO - Buffered messages finished
[task 2022-12-03T13:24:28.287Z] 13:24:28     INFO - TEST-UNEXPECTED-FAIL | browser/base/content/test/pageinfo/browser_pageinfo_separate_private.js | private window opened non-private page info window - "undefined" === true - JS frame :: chrome://mochitests/content/browser/browser/base/content/test/pageinfo/browser_pageinfo_separate_private.js :: <TOP_LEVEL> :: line 32
[task 2022-12-03T13:24:28.287Z] 13:24:28     INFO - Stack trace:
[task 2022-12-03T13:24:28.287Z] 13:24:28     INFO - chrome://mochitests/content/browser/browser/base/content/test/pageinfo/browser_pageinfo_separate_private.js:null:32
[task 2022-12-03T13:24:28.287Z] 13:24:28     INFO - TEST-PASS | browser/base/content/test/pageinfo/browser_pageinfo_separate_private.js | private and non-private windows shouldn't have shared the same page info window - [object Window] != [object Window] - 

Florian, as the owner of this component, can you help us assign it to someone?
Thank you.

Flags: needinfo?(florian)
Whiteboard: [stockwell needswork:owner]

Using MOZ_CHAOSMODE=0xfb ./mach test browser/base/content/test/pageinfo/browser_pageinfo_separate_private.js --run-until-failure --headless --profiler, I can reproduce about half the time on my optimized mac local build.

I have profiles for both the "TEST-UNEXPECTED-FAIL - private window opened non-private page info window - "undefined" === true" case, and the "TEST-UNEXPECTED-FAIL - non-private window opened private page info window - "undefined" === false" case.

In both cases, the "BrowserTestUtils - waitForEvent: page-info-init" marker took significantly longer than usual right before the failure, and contained a significant amount of CPU time spent on GCMajor, reason = COMPARTMENT_REVIVED.

In most cases, printing window.docShell gives [xpconnect wrapped (nsIDocShell, nsILoadContext)]. Right before the failure, it gives [xpconnect wrapped nsIDocShell]. Adding a .QueryInterface(Ci.nsILoadContext) call before accessing docShell.usePrivateBrowsing fixes the test failure. (Profile verifying this: https://share.firefox.dev/3Bfp2nj)

I don't exactly know why the GCMajor removes the nsILoadContext interface from the docshell object (not clear to me if it's a bug or the expected behavior), but I verified that the nsILoadContext interface existed on the object seen from https://searchfox.org/mozilla-central/rev/70bc66e002ad1cdb1fe5d77428803432f261038a/browser/base/content/pageinfo/pageInfo.js#274 even before the QI call on that line.

I think adding the QI call to the test is harmless and we should go ahead with it. It will fix most failures of this test, but not the debug-only Mac-only leak of the window, that happened a total of 14 times in the entire history of the bug: https://treeherder.mozilla.org/intermittent-failures/bugdetails?startday=2022-08-09&endday=2022-12-07&tree=trunk&failurehash=5941df80b710ef56d32823cbfaf4624d3c13e9295ba34546dbc8b0435b26b602&bug=1789227

Flags: needinfo?(florian)
Blocks: 1766636
Assignee: nobody → florian
Status: NEW → ASSIGNED
Attachment #9307057 - Attachment description: Bug 1789227 - browser_pageinfo_separate_private.js - Query the nsILoadContext interfare before accessing it on the docShell object, r=Gijs. → Bug 1789227 - browser_pageinfo_separate_private.js - Query the nsILoadContext interface before accessing it on the docShell object, r=Gijs.

(In reply to Florian Quèze [:florian] from comment #13)

Using MOZ_CHAOSMODE=0xfb ./mach test browser/base/content/test/pageinfo/browser_pageinfo_separate_private.js --run-until-failure --headless --profiler, I can reproduce about half the time on my optimized mac local build.

I have profiles for both the "TEST-UNEXPECTED-FAIL - private window opened non-private page info window - "undefined" === true" case, and the "TEST-UNEXPECTED-FAIL - non-private window opened private page info window - "undefined" === false" case.

In both cases, the "BrowserTestUtils - waitForEvent: page-info-init" marker took significantly longer than usual right before the failure, and contained a significant amount of CPU time spent on GCMajor, reason = COMPARTMENT_REVIVED.

In most cases, printing window.docShell gives [xpconnect wrapped (nsIDocShell, nsILoadContext)]. Right before the failure, it gives [xpconnect wrapped nsIDocShell]. Adding a .QueryInterface(Ci.nsILoadContext) call before accessing docShell.usePrivateBrowsing fixes the test failure. (Profile verifying this: https://share.firefox.dev/3Bfp2nj)

I don't exactly know why the GCMajor removes the nsILoadContext interface from the docshell object (not clear to me if it's a bug or the expected behavior), but I verified that the nsILoadContext interface existed on the object seen from https://searchfox.org/mozilla-central/rev/70bc66e002ad1cdb1fe5d77428803432f261038a/browser/base/content/pageinfo/pageInfo.js#274 even before the QI call on that line.

I think adding the QI call to the test is harmless and we should go ahead with it. It will fix most failures of this test, but not the debug-only Mac-only leak of the window, that happened a total of 14 times in the entire history of the bug: https://treeherder.mozilla.org/intermittent-failures/bugdetails?startday=2022-08-09&endday=2022-12-07&tree=trunk&failurehash=5941df80b710ef56d32823cbfaf4624d3c13e9295ba34546dbc8b0435b26b602&bug=1789227

Two questions for you, Andrew, based on this diagnosis from Florian.

First, am I right in thinking that it's unexpected that previously-QI'd interfaces stop being exposed through xpconnect/wrappers/whatever-the-technical-term-is when a GC happens? If so can you perhaps file a follow-up bug in the right component (unsure - JS: GC or xpconnect or...?) to further investigate and/or address this?

Second, am I right in thinking that having docshell implement nsIClassInfo would "fix" this by removing the need for all the QueryInterface calls in the first place? If I'm right I can file a follow-up for that as I'm pretty sure it'd simplify a bunch of JS in the frontend anyway.

Flags: needinfo?(continuation)
Pushed by fqueze@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/ad2d1e7f9fff browser_pageinfo_separate_private.js - Query the nsILoadContext interface before accessing it on the docShell object, r=Gijs.
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 109 Branch

(In reply to :Gijs (he/him) from comment #15)

First, am I right in thinking that it's unexpected that previously-QI'd interfaces stop being exposed through xpconnect/wrappers/whatever-the-technical-term-is when a GC happens? If so can you perhaps file a follow-up bug in the right component (unsure - JS: GC or xpconnect or...?) to further investigate and/or address this?

In this case, it is a C++ object being exposed to JS, so it is an XPCWrappedNative (XPCWN). I think the specific data structure that is used to represent individual interfaces for an XPCWN is an XPCWrappedNativeTearOff. There's some discussion of tearoff lifetime in comments but generally I think we try to get rid of anything that isn't currently in use. Maybe that's not very important now that we have WebIDL bindings, which aren't on this system, but we do go to some effort to sweep these tearoffs. I think this would be an XPConnect issue not a GC issue.

Second, am I right in thinking that having docshell implement nsIClassInfo would "fix" this by removing the need for all the QueryInterface calls in the first place? If I'm right I can file a follow-up for that as I'm pretty sure it'd simplify a bunch of JS in the frontend anyway.

The comments in nsIClassInfoImpl.h seem to suggest that, but I don't know much about class info. Peter would be a better person to ask about class info and whether converting docshell over to use it would be a good idea or not.

Flags: needinfo?(continuation) → needinfo?(peterv)

(In reply to :Gijs (he/him) from comment #15)

First, am I right in thinking that it's unexpected that previously-QI'd interfaces stop being exposed through xpconnect/wrappers/whatever-the-technical-term-is when a GC happens? If so can you perhaps file a follow-up bug in the right component (unsure - JS: GC or xpconnect or...?) to further investigate and/or address this?

If the object loses all JS references and then a GC happens, I wouldn't be surprised if the full XPCWrappedNative is discarded, along with its tearoffs, meaning that when it's re-fetched, they're all generated again.

Second, am I right in thinking that having docshell implement nsIClassInfo would "fix" this by removing the need for all the QueryInterface calls in the first place? If I'm right I can file a follow-up for that as I'm pretty sure it'd simplify a bunch of JS in the frontend anyway.

Adding a nsIClassInfo implementation could mean that the QI wouldn't be necessary for the type to implement the XPCOM interfaces from JS, yes. We eagerly create the interface objects for types which declare their interfaces in the nsIClassInfo when wrapping them into JS, and I've definitely considered adding a classinfo for e.g. nsDocShell before (though I used to have some grand ideas about making a more lightweight classinfo-like thing for xpcom interfaces, and having everything implement it automatically through the queryinterface macros - that's never happened naturally).

(In reply to Nika Layzell [:nika] (ni? for response) from comment #19)

(In reply to :Gijs (he/him) from comment #15)

First, am I right in thinking that it's unexpected that previously-QI'd interfaces stop being exposed through xpconnect/wrappers/whatever-the-technical-term-is when a GC happens? If so can you perhaps file a follow-up bug in the right component (unsure - JS: GC or xpconnect or...?) to further investigate and/or address this?

If the object loses all JS references and then a GC happens, I wouldn't be surprised if the full XPCWrappedNative is discarded, along with its tearoffs, meaning that when it's re-fetched, they're all generated again.

What does "JS reference" mean here? If a system privileged window global is alive and still going to run script then docShell is reachable from that script, so I wouldn't have expected anything to get GC'd. If that doesn't count, and it only works if there's some JS-side direct ref to the docShell object (rather than via other webidl/xpcom methods), then I guess that could explain this... but that feels very impractical to deal with from a frontend code PoV. I'm not sure what guarantees we have about the timing of the GC (e.g. is there any guarantee it doesn't happen while JS is on the stack? What about in the presence of nested event loops...?) but that feels like it could cause lots of unexpected behaviour depending on the timing of the GC...

Flags: needinfo?(nika)

(In reply to :Gijs (he/him) from comment #20)

(In reply to Nika Layzell [:nika] (ni? for response) from comment #19)

If the object loses all JS references and then a GC happens, I wouldn't be surprised if the full XPCWrappedNative is discarded, along with its tearoffs, meaning that when it's re-fetched, they're all generated again.

What does "JS reference" mean here? If a system privileged window global is alive and still going to run script then docShell is reachable from that script, so I wouldn't have expected anything to get GC'd. If that doesn't count, and it only works if there's some JS-side direct ref to the docShell object (rather than via other webidl/xpcom methods), then I guess that could explain this... but that feels very impractical to deal with from a frontend code PoV.

XPConnect has always worked like this (for the last 20+ years). Getting a property through XPConnect or WebIDL bindings always calls a getter (there are some exceptions for WebIDL bindings, but we'll ignore those for this), we don't really store the return value for the property getter anywhere in the JS object that you got it from. XPCWrappedNatives are stored in a hashtable, when their JS object is collected by the GC they're removed from that table. When you call the getter for a property it'll return an existing XPCWrappedNative from the hashtable, or create a new one and store and return it. Because the JS object for the property itself isn't stored, as soon as the last reference from JS goes away it is subject to collection by the GC. So in this case, if you get window.docshell and store it in a variable in JS, as soon as that variable goes out of scope the JS object that you got from window.docshell will be subject to collection by the GC, potentially destroying the XPCWrappedNative. If you then get window.docshell again you'll get a different object in JS, and we'll create a new XPCWrappedNative.

XPCWrappedNatives reflect the canonical nsISupports of an object, and if the native object they reflect doesn't implement nsIClassInfo they also create what are called 'tearoffs' per non-nsISupports interface. This is because in that case we don't have an actual set of interfaces that the object implements, and we have to rely on QueryInterface to find out if an interface is implemented or not. So if you QI from JS we'll create a tearoff for that interface. These are also cached and subject to GC (and IIRC can be collected before the XPCWrappedNative if there are no references to them from JS), and they add new properties to the JS object (from the interface that the tearoff represents). If an XPCWrappedNative is destroyed all its tearoffs are destroyed too. This means that in the case above when you get a different object in JS for window.docshell it will have lost a number of properties until you QI it to the relevant interfaces again.

nsIClassInfo allows us to 'flatten' the interfaces that are returned from classinfo's interfaces property onto the XPCWrappedNative's JS object, which means that if you want to access any property from those interfaces they will be exposed through the main XPCWrappedNative without having to call QI on its JS object. This doesn't prevent the XPCWrappedNative from being collected, it just means that after the XPCWrappedNative is destroyed and a new one created all the properties from those interfaces will be present.

Flags: needinfo?(peterv)
Depends on: 1804665
No longer depends on: 1804665
See Also: → 1804665

(In reply to Peter Van der Beken [:peterv] from comment #21)

Thanks for the extensive explanation, that's super helpful!

So in this case, if you get window.docshell and store it in a variable in JS, as soon as that variable goes out of scope the JS object that you got from window.docshell will be subject to collection by the GC, potentially destroying the XPCWrappedNative. If you then get window.docshell again you'll get a different object in JS, and we'll create a new XPCWrappedNative.

OK. What guarantees are there about when GCs happen that have this result? E.g.:

window.docShell.QueryInterface(Ci.nsILoadContext);
let x = window.docShell.usePrivateBrowsing; // provided by nsILoadContext

Can a GC happen between these two lines? For the sync case above, I am not sure about the answer, but I suppose in:

async function() {
  ...
  window.docShell.QueryInterface(Ci.nsILoadContext);
  await somePromise;
  let x = window.docShell.usePrivateBrowsing; // provided by nsILoadContext
}

There are definitely no guarantees that there hasn't been a GC?

nsIClassInfo allows us to 'flatten' the interfaces that are returned from classinfo's interfaces property onto the XPCWrappedNative's JS object, which means that if you want to access any property from those interfaces they will be exposed through the main XPCWrappedNative without having to call QI on its JS object. This doesn't prevent the XPCWrappedNative from being collected, it just means that after the XPCWrappedNative is destroyed and a new one created all the properties from those interfaces will be present.

Right. I filed bug 1804665 to do this for docshell.

Flags: needinfo?(nika) → needinfo?(peterv)

(In reply to :Gijs (he/him) from comment #22)

OK. What guarantees are there about when GCs happen that have this result? E.g.:

window.docShell.QueryInterface(Ci.nsILoadContext);
let x = window.docShell.usePrivateBrowsing; // provided by nsILoadContext

Can a GC happen between these two lines?

Technically I think it could, but I don't think it would in practice.

async function() {
  ...
  window.docShell.QueryInterface(Ci.nsILoadContext);
  await somePromise;
  let x = window.docShell.usePrivateBrowsing; // provided by nsILoadContext
}

There are definitely no guarantees that there hasn't been a GC?

I don't think you can rely on that, no. To be really correct I think you need to do

let loadContext = window.docShell.QueryInterface(Ci.nsILoadContext);
…
let x = loadContext.usePrivateBrowsing;

Relying on the side-effect of QueryInterface in terms of exposed properties has always been fragile.

Note that these are some of the 'contracts' from XPCOM, whether it's exposed through C++ or XPConnect. In C++ you also need to QI to nsILoadContext every time to access usePrivateBrowsing if you only have an nsIDocShell, or keep a strong reference to the nsILoadContext pointer. I think there was some effort to make XPConnect at least somewhat JS-friendly, but given the nature of XPCOM (eg QI not being idempotent) there are limits to how friendly it can be. I'm not sure that the side-effect of QI on exposed properties was a great idea (though I do think that the evolution of our GC might have made things more complicated than at the start).

Flags: needinfo?(peterv)

As a minor addendum, besides spinning a nested event loop, allocating something in JS can cause the GC to run.

I failed to notice that the docshell was getting grabbed off of a window. Somehow, I thought that it was being held alive by JS directly, which is why I had to dig for the XPCWN tearoff sweeping. But yeah, as it is, the entire XPCWN could be collected.

See Also: → 1811476
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: