Open Bug 1622718 Opened 4 years ago Updated 4 years ago

A loader that require()s a non-(pre)loaded script will leak its global (in XPCShell tests at least)

Categories

(DevTools :: Framework, defect, P3)

defect

Tracking

(Not tracked)

REOPENED

People

(Reporter: loganfsmyth, Unassigned)

References

Details

Attachments

(1 file, 2 obsolete files)

Attached file test_loader_destroy.js

I came across this issue while writing a test for https://bugzilla.mozilla.org/show_bug.cgi?id=1609201 though I'm not sure if it's causing any leaks that we've noticed in our codebase itself or the ones encountered in that issue. I also can't quite tell if this cache behavior is limited to files used during startup and if so, how that influences the normal devtools behavior vs the behavior during xpcshell tests.

Here's the problem:
https://searchfox.org/mozilla-central/rev/3f8c67d7fd836d559491e3fe497bc739f707c1a6/js/xpconnect/loader/mozJSSubScriptLoader.cpp#216-223

Since our loaders use loadSubscript in the context of a brand new global for each loader, whenever a resource:// file is loaded for the first time, whichever loader happens to do so will leak its global, and due to our lazyGlobal methods circling back to the loader itself to lazy-load a module, the global also keeps the loader alive.

I've attached a test that demonstrates the behavior. If you force-load the script in the global loader first, the test will pass, but if you load it only in the new non-global loader, the non-global loader will never be GCed.

If this is actually something that can affect non-test cases, seems like we'd potentially want to explore exposing the useCompilationScope flag used within the loader as an option on loadSubscriptWithOptions?

Assignee: nobody → loganfsmyth
Status: NEW → ASSIGNED

Could it be platform specific?
I tried running the xpcshell test and it always passes.
I tried commenting/uncommenting mod.require("devtools/shared/extend");, but it makes no difference.
Could it be that you would only need more forceGC/forceCC in the test because of the generational GC?

Otherwise, would it be easier to using the subscript loader option instead of exposing the compilation scope?
https://searchfox.org/mozilla-central/source/js/xpconnect/idl/mozIJSSubScriptLoader.idl#42

You may try using a browser-mochitest if you want to ensure this isn't specific to xpcshell.
While xpcshell is really special and can be significantly different from Firefox, browser-mochitest are really close to production codepaths.

Could it be platform specific?

Well that's frustrating, I'm on OSX, what are you on? It consistently reproduces for me.

Could it be that you would only need more forceGC/forceCC in the test because of the generational GC?

The loader explicitly holds a strong reference, I dumped the GC graph and the preloader explicitly has a GC root that holds the global live.

Otherwise, would it be easier to using the subscript loader option instead of exposing the compilation scope?

I considered it, but we really don't want to skip the cache since it could hru performance in cases where we start multiple servers in one process, it's just that we don't want the cached value to trigger leaking.

(In reply to Logan Smyth [:loganfsmyth] from comment #3)

Well that's frustrating, I'm on OSX, what are you on? It consistently reproduces for me.

I'm on linux.

Otherwise, would it be easier to using the subscript loader option instead of exposing the compilation scope?

I considered it, but we really don't want to skip the cache since it could hru performance in cases where we start multiple servers in one process, it's just that we don't want the cached value to trigger leaking.

The cache story is unclear. We think we do use it and benefit from it, but I'm not sure anyone asserted the actual behavior.
The current discussions around migrating to ES Modules were an attempt to clarify this situation and clean things up.
As you can see by reading the caching code in the subscript loader, it is pretty brittle.
Otherwise, spawing multiple servers in one process should, in theory, be restricted to the Browser Toolbox, which isn't the most important usecase to optimize. It happens that the current service worker implementation does that for the regular web toolboxes, but it is justified. Again, I would rather address this design flaw rather than optimize a pattern which we know we want to get rid of.

(In reply to Alexandre Poirot [:ochameau] from comment #4)

(In reply to Logan Smyth [:loganfsmyth] from comment #3)

Well that's frustrating, I'm on OSX, what are you on? It consistently reproduces for me.

I'm on linux.

And optimized artifact builds. In case this could be debug versus optimized or artifact versus local build...

I found why I got distinct results from you.
I didn't realized you were modifying an existing test.

Instead, to test this, I replaced devtools/server/tests/xpcshell/test_add_actors.js content with attachment 9133493 [details].
But... there is an head file in this folder, which most likely force loading the extend module!

I do reproduce the same behavior when using devtools/shared/tests/xpcshell/test_eventemitter_destroy.js.
It looks like, there is no difference between requiring extend (attachment 9133493 [details]) and event-emitter (test_eventemitter_destroy.js)?

Sorry for the confusion. Now, I'll try to follow your patch :)
Note that you should also get a review from somehow familiar with the loadSubScript implementation.

It looks like, there is no difference between requiring extend and event-emitter?

Yep, test_eventemitter_destroy is what motivated filing the bug, but the patch had not landed yet when I filed this so I attached a separate test. I didn't think about the effect of potential head.js changes though, woops!

Note that you should also get a review from somehow familiar with the loadSubScript implementation.

I wanted to make sure you thought this was worth pursuing first, but yep I totally agree.

Attachment #9133802 - Attachment description: Bug 1622718 - Avoid leaking devtools loaders except when opt-in. r=ochameau! → Bug 1622718 - Part 2: Avoid leaking devtools loaders except when opt-in. r=ochameau!

The priority flag is not set for this bug.
:ochameau, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(poirot.alex)
Pushed by loganfsmyth@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/016b86b83932
Part 1: Expose 'useCompilationScope' as an option from nsIJSSubScriptLoader. r=mccr8
https://hg.mozilla.org/integration/autoland/rev/eb632d0b3a7b
Part 2: Avoid leaking devtools loaders except when opt-in. r=ochameau
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 76
Regressions: 1626727

Backed out 2 changesets (Bug 1622718) as requested by dev for causing Bug 1626727.

Backout link: https://hg.mozilla.org/integration/autoland/rev/7f069649d6cf0d7d8cba14cfd987e835509b22fa

Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: Firefox 76 → ---

This ended up triggering an edge case that I absolutely didn't think about, which is what caused the regression seen in https://bugzilla.mozilla.org/show_bug.cgi?id=1626727.

The issue is that useCompilationScope does its job a little to well. It compiles its script in the compilation scope and then copies it into the target scope, but this has the effect of calling DebugAPI::onNewScript twice, once for the compilation-scope script, and once for the loader-cloned version, where before it would only have tried to dispatch once, for the loader-cloned version. This means that, if the debugger server trying to load the file has the compilation scope as a debuggee, like the Browser Toolbox, we end up in a situation where every script loaded into the debugger server also ends up notifying the server itself about the compilation-scoped script, and since there is logic in the onNewScript function that triggers a lazyRequireGetter (and this trying to load another file with another set of DebugAPI::onNewScript calls), we end up with an exception due to onNewScript being re-entrant, causing things to not yet have loaded by the time we try to use them.

This does seem like surprising behavior for the script cache, that seems less than ideal, and it is possible for it to be tweaked so that it would only notify the debugger for the script that is actually going to be executed, but that's really more than I want to bite off right now. This bug was really meant to be a low-priority fix for a small bug, and this definitely expands the scope so for now I'll probably just leave things as they were.

Attachment #9133802 - Attachment is obsolete: true
Attachment #9136216 - Attachment is obsolete: true
Flags: needinfo?(poirot.alex)

Because this bug's Severity has not been changed from the default since it was filed, and it's Priority is P3 (Backlog,) indicating it has been triaged, the bug's Severity is being updated to S3 (normal.)

Assignee: loganfsmyth → nobody
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: