Open Bug 894971 Opened 11 years ago Updated 2 years ago

Entrain fewer unnecessary variables in closures

Tracking

()

Status:

NEW

People

(Reporter: khuey, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [MemShrink:P2])

Attachments

(1 file)

Pseudocode for marking only the used slots of a CallObject 11 years ago Steve Fink [:sfink] [:s:] 6.62 KB, patch		Details \| Diff \| Splinter Review

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Reporter

Description

•

11 years ago

In a setup such as:

function outer() {
  var v_1;
  function f_1() {
    doSomething(v_1);
  }
  var v_2;
  function f_2() {
    doSomething(v_2);
  }
  ...
  var v_N;
  function f_N() {
    doSomething(v_N);
  }
}

Let C_n be the set of variables which f_n closes over.  Currently Spidermonkey creates a single call object which every f_n shares, and that call object references the union of each C_n for every n.  This means that keeping a single f_n alive will entrain v_1, v_2, ... v_N.

I'm led to believe that the gold standard in this area is something called "safe-for-space" closures which ensure that a function only entrains the set of variables it absolutely must, but it's possible to imagine less complex methods.  For instance, we could create separate call objects only for functions f_m where C_m is disjoint from the union of all C_n n != m.  That would fix the (admittedly contrived) example above as well as bug 894135.

Justin Lebar (not reading bugmail)

Updated

•

11 years ago

Whiteboard: [MemShrink]

Steve Fink [:sfink] [:s:]

Comment 1

•

11 years ago

Attached patch Pseudocode for marking only the used slots of a CallObject — Details — Splinter Review

Ok, I'm curious. It seems like we know what variables are captured, so I don't understand why we can't do the optimal thing and only hang onto the stuff that's accessible.

I'm attaching a "patch" with pseudocode for the obvious thing: don't mark anything in a CallObject unless it's used by something live. You'll see from the patch that I don't understand the data structures involved.

Anyway, I'm sure there's a fatal flaw here where we're not keeping the necessary information for some optimizationy reason, so let this serve as me asking what that reason is.

Steve Fink [:sfink] [:s:]

Updated

•

11 years ago

Attachment #778693 - Flags: feedback?(luke)

Luke Wagner [:luke]

Comment 2

•

11 years ago

Comment on attachment 778693 [details] [diff] [review]
Pseudocode for marking only the used slots of a CallObject

That's certainly an interesting idea.

It's a bit more complicated than iterating through bindingArray, though, since that is just the list of bindings for the containing JSScript.  Also, I think you'd have to do the markVarUsed starting from a JSFunction so that way you could find the right fun->environment() scope chain to mark.

On first consideration, it seems like what you'd need to need is for each JSScript to additionally record its free variables.  You wouldn't want to store the list of *names* of free variables, though, since that would involve hash-table lookups to find slots, but rather you'd want to store, for each free variable, the 32-bit ScopeCoordinate.  With this strategy, I worry about (1) the extra memory to store these free variables, (2) the time spent traversing the scope chain to mark each free variable.  In theory, you could sort the free variables so that you could do all the marking in a single linear pass over the scope chain, but it still seems like this could lead to much higher mark times for code with many live scopes/closures.  Going further, each CallObject could have a bitmask with a bit per nested function that has already marked this CallObject.  That way, each call object would get marked at most K times where K is the (static) number of nested scripts.

Also, the "what about dead variable slots" question you raised is a good one: if the Debugger jumps in and asks to see random variables, we could possibly observe these garbage values.

So, overall, this approach seems feasible (and much easier than safe-for-space) but we'd have to be careful to measure that we didn't regress ordinary closure mark speed or increase memory usage too much.  It'd also be interesting to have a dynamic measurement to see how often variables are unnecessarily entrained in the manner that is preventable in this manner (I was saying earlier that such an analysis would be useful for bug finding anyway).

Attachment #778693 - Flags: feedback?(luke)

Nicholas Nethercote [inactive]

Updated

•

11 years ago

Whiteboard: [MemShrink] → [MemShrink:P1]

David Bruant

Updated

•

11 years ago

Blocks: 896088

Nicholas Nethercote [inactive]

Updated

•

10 years ago

Whiteboard: [MemShrink:P1] → [MemShrink]

Nicholas Nethercote [inactive]

Updated

•

10 years ago

Whiteboard: [MemShrink] → [MemShrink:P2]

David Teller [:Yoric] - still alive but not very active

Comment 3

•

10 years ago

I have just witnessed an interesting issue that was apparently caused by this bug.

function() {
  let foo = someComponent();
  let cb = function() { dump("Callback called\n"); };
  foo.init(cb);

  // No cycle in the code, but `foo` keeps `cb` alive and `cb` keeps `foo` alive.
}

In the sample, this caused process freezes because garbage-collection of `foo` should have stopped a thread.

Luke Wagner [:luke]

Comment 4

•

10 years ago

The cycle collector should be able to collect such cycles.  Last time I saw something that looked like this, it ended up being a general bug in the platform (bug 731868 comment 35) that likely was causing other leaks (bug 757749).

Andrew McCreight [:mccr8]

Comment 5

•

10 years ago

Yeah, please file a bug on that and CC me, and include the actual component you are using.  The first step would be to look at a CC log.

Steve Fink [:sfink] [:s:]

Comment 6

•

10 years ago

(In reply to David Rajchenbach Teller [:Yoric] from comment #3)
> I have just witnessed an interesting issue that was apparently caused by
> this bug.
> 
> function() {
>   let foo = someComponent();
>   let cb = function() { dump("Callback called\n"); };
>   foo.init(cb);
> 
>   // No cycle in the code, but `foo` keeps `cb` alive and `cb` keeps `foo`
> alive.
> }
> 
> In the sample, this caused process freezes because garbage-collection of
> `foo` should have stopped a thread.

For my own edification, (1) why does cb keep foo alive? I didn't think we entrained variables that were never captured by any closure. (Unless the example is incomplete, and there's a dummy closure not shown that captures foo.) And (2) are we really using GC to decide when to stop threads? That seems dangerous, as with anything that uses GC to manage other types of resources.

Nobody; OK to take it and work on it

Assignee

Updated

•

10 years ago

Assignee: general → nobody

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Quick Search

Entrain fewer unnecessary variables in closures

Categories

(Core :: JavaScript Engine, defect)

Tracking

()

People

(Reporter: khuey, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [MemShrink:P2])

Crash Data

Security

(public)

User Story

Attachments

(1 file)

Description

Updated

Comment 1

Updated

Comment 2

Updated

Updated

Updated

Updated

Comment 3

Comment 4

Comment 5

Comment 6

Updated

Updated

Attachment

General

Description

File Name

Content Type