Closed Bug 807205 Opened 12 years ago Closed 11 years ago

Ability to import JS modules into a shared compartment

Categories

(Core :: XPConnect, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: gps, Unassigned)

References

Details

(Whiteboard: [MemShrink])

Problem statement: compartments per global (per imported module) isn't a great model when you have lots of cross-global/module communication. e.g. in the case of Sync, we pass large strings across many modules, incurring a memory hit in each compartment because we don't atomize strings. See bug 806087 and bug 778993 comment 11. Large, modern JS applications often consist of many modules/globals acting as one large functional unit (Firefox Sync is currently ~30 modules/JS files). Splitting them into multiple compartments has proved to be suboptimal.

From the perspective of someone who writes large JS features in Firefox, I want to be able to write many small, loosely-coupled but high cohesion JS modules/files without the resource and performance penalties of compartment per global. Reducing memory size of compartments doesn't really help because it doesn't deal with cross-compartment "leaking." If there were a way for JS modules to share compartments, application developers could make intelligent choices about consolidating high cohesion modules.

After talking with khuey about the problem and possible workarounds, we arrived at a probable solution: a variation of Cu.import() that imports into the current compartment.

For features that wished for most of its modules to live in one compartment (like Sync or any other large add-on or service), it could Cu.importShared() (or whatever you want to name it) modules into the current compartment. This would have similar behavior of Cu.import() - mainly making EXPORTED_SYMBOLS available.

The devil is in the details. What happens when you have a module that is loaded via both Cu.import() and Cu.importShared()? Does the first load determine the compartment? Do you allow the module to be loaded into multiple compartments? What happens when you Cu.import() from within a file being loaded via Cu.importShared()? Does the inner Cu.import() go to a new compartment (current behavior) or is importShared() greedy and pull all imports into the current compartment? I don't have enough knowledge to answer these questions.

Alternative solutions include:

* Have developers use loadScript() manually to import a file into the current global/compartment. You get ultimate control but the barrier to entry seems high and error prone. I'd rather see an official API that does this properly (what I'm asking for in this bug).

* Add some mechanism to allow a file/module to declare its compartment. e.g. named compartments. Related modules (like Sync) could say "load me into the 'sync' component." In the absence of this explicit instruction, a loaded module would be imported into a new global (like today - except on B2G apparently). I'm not sure if you could do this with a special variable or preprocessor syntax or what.

Anyway, if I missed something important or said something wrong, I'm sure khuey will correct me. I can only retain a small fraction of the knowledge he utters.
Whiteboard: [MemShrink]
Without going so far as to stomp another component's priority fields, I'd like to throw in my 2¢ requesting urgency here.

It seems (from my perspective, at least) that assorted work is being done around the tree -- Bug 778993 being one example, the original compartments landing itself being another, and quite probably more besides -- that tickles this issue and causes awful side-effects in certain situations.

We simply don't have enough automation in place to warn us of regressions in Sync's memory usage, and it was only blind luck (getting a reporter in Bug 806087 who could find a narrow regression range for me to investigate) that prevented an OOM crasher from being released in Fx17. (That lack of tooling is a problem itself, of course, but it won't resolve this known mismatch, and thus would just lead to us blocking more improvements like Bug 778993's.)

So: my fingers are firmly crossed that perfect isn't the enemy of good here, and we can get something imperfect landed sooner rather than later.

CCing :billm, in case he wants to wait for this before relanding Bug 778993.
I just want to be clear here that "loading into the same compartment" means "sharing a global". There's no way to get around that.

(In reply to Gregory Szorc [:gps] from comment #0)

> * Add some mechanism to allow a file/module to declare its compartment. e.g.
> named compartments. Related modules (like Sync) could say "load me into the
> 'sync' component." In the absence of this explicit instruction, a loaded
> module would be imported into a new global (like today - except on B2G
> apparently). I'm not sure if you could do this with a special variable or
> preprocessor syntax or what.

IMO, this is _definitely_ the way to go. I don't think it should be up to the importer to declare that the compartment is shared, but that the implementation should explicitly declare that this is ok. This also solves the ordering issue (what happens if somebody else calls import() on both modules before the call to importShared()), which I think is a pretty serious hurdle otherwise.
Additional links:

  Bug 756549 — the original "zomg CPG memory boom" bug
  Bug 759585 — zones
(In reply to Bobby Holley (:bholley) from comment #2)

So if we want files/modules/jsms themselves declare the named compartment they should be imported into, how do we do that?

I'd argue we want the compartment to be defined inside the file so we don't have to involve a separate system. But, that seems to introduce a chicken and egg problem: you need to parse the file to extract what compartment to import it into. Are there parts of the JS file we have the ability to parse without incurring a full script load? Would this require a new feature from the JS engine? Something along the lines of "use strict" perhaps?
(In reply to Gregory Szorc [:gps] from comment #4)
> (In reply to Bobby Holley (:bholley) from comment #2)
> 
> So if we want files/modules/jsms themselves declare the named compartment
> they should be imported into, how do we do that?
> 
> I'd argue we want the compartment to be defined inside the file so we don't
> have to involve a separate system. But, that seems to introduce a chicken
> and egg problem: you need to parse the file to extract what compartment to
> import it into. Are there parts of the JS file we have the ability to parse
> without incurring a full script load? Would this require a new feature from
> the JS engine? Something along the lines of "use strict" perhaps?

A dumb/easy solution here would just be to run the script once, check for the magic symbol, and if it's there, throw it away and import it into the target compartment.

Another solution would be to require that the annotation be at the beginning of the file, and just parse it manually from the ASCII stream before even feeding anything to the JS engine.
Whiteboard: [MemShrink] → [MemShrink:P3]
Would it be possible to extend the resource alias (or add a new entry) in the component manifest? Presumably that gets processed and loaded prior to any code in the module.

Another approach would be to use the filesystem (e.g., sentinel file in the code dir).

I think it's acceptable to constrain this sharing ability to shipped, on-disk, manifested, chrome code, right?
Whiteboard: [MemShrink:P3] → [MemShrink]
(In reply to Richard Newman [:rnewman] from comment #6)
> I think it's acceptable to constrain this sharing ability to shipped,
> on-disk, manifested, chrome code, right?

We also have to consider the Jetpack use case.
(In reply to Gregory Szorc [:gps] from comment #7)

> We also have to consider the Jetpack use case.

Do they have to be the same technique?

If we can do something easy that covers the current big pain points, and later something a little more thorough to cover more things, then I'd take that tradeoff.

I'm not familiar with how Jetpack factors in here -- does Jetpack dynamically load JSMs without first parsing some kind of manifest that describes how to handle the code it's about to load?
(In reply to Richard Newman [:rnewman] from comment #8)
> (In reply to Gregory Szorc [:gps] from comment #7)
> 
> > We also have to consider the Jetpack use case.
> 
> Do they have to be the same technique?
> 
> If we can do something easy that covers the current big pain points, and
> later something a little more thorough to cover more things, then I'd take
> that tradeoff.
> 
> I'm not familiar with how Jetpack factors in here -- does Jetpack
> dynamically load JSMs without first parsing some kind of manifest that
> describes how to handle the code it's about to load?

Jetpack creates a lot of sandboxes. One per module per add-on. This is to keep modules isolated for each other. At first we got a lot of push-back on this because it was costing a lot in memory so we spent a lot of time implementing support for creating multiple sandboxes in the same compartment (bug 677294). Unfortunately when CPG landed it broke all this and now we're back to pretty much square one.
I wonder if we should wontfix this in favour of zones (bug 759585).
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.