Open Bug 916748 Opened 11 years ago Updated 2 years ago

Report scope chain leaks.

Categories

(Core :: JavaScript Engine, defect)

defect

Tracking

()

People

(Reporter: nbp, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [MemShrink:P2])

I made an experiment[1] to report scope chain leaks which is due to the way we alias variable in the JavaScript engine. 

The prototype is a dirty hack which introduces new fields on the scope chain.  Each field is a Int32 which are reseted the first time the scope chain is marked.  These bit fields are used to mark the properties of the scope chain.  Every time we trace a function, we mark all the aliased var used by the function by flipping the corresponding flag in the bit mask.  Then, before sweeping we follow the same process to report any property which is holding memory when the last function holding the property is reclaimed.

Messages are reported if the leaked property is a value which is holding memory, such as an object / string / …

The prototype is not fully functional because at the moment the JSScript bindings do not include any of the variable aliased by the scope chain, but this should work fine with lazy functions.  One way to work around would be to iterate the script linearly to look for (GET/SET)ALIASEDVAR bytecodes.

Note that this would give different result than the dominator tree proposal, as this will report any scope chain object which is logically leaked, and not only big objects which are leaked.

[1] https://github.com/nbp/mozilla-central/compare/central...scope-chain-leak-reporter
This is interesting! Won't it report a lot of false positives, though?
In either case, it will be valuable to have some actual measurements of how common unused slots in "joined scopes" actually are.

Of course, we could also just use the script's data about which variables it (or its children) might use simply to do the marking in the first place.

It's worth keeping in mind that it's not possible to be perfect. Consider the following function:

function f(x) {
  return function g(y) { if (y) return x; else return 42; };
}

Now, if g is never called with a true value, it needn't retain x's value. But that's an undecidable property of the program, in general. So we will always be able to construct programs that retain things they shouldn't, no matter how clever we get.
nbp, what's the definition of a "scope chain leak"?
Whiteboard: [MemShrink] → [MemShrink:P2]
(In reply to Jim Blandy :jimb from comment #1)
> This is interesting! Won't it report a lot of false positives, though?

I don't think we would see any false positive, as the scope chain contains the minimal set of captured variables.
But we might have a lot of duplicates this way, as the current prototype is doing it per-scope chain.  So if we have lots of scope chain this will produce a message for all of them.

Also note that there is no point at reporting a property which is set to a number / null / undefined, as they are not holding more memory than their space on the scope chain.  Which means that there is an most of the time an easy way to fix these leaks by adding

  x = undefined;

in the right location.

(In reply to Jim Blandy :jimb from comment #2)
> It's worth keeping in mind that it's not possible to be perfect.

Indeed, we cannot capture all logical leaks, but reporting at least some of the category will raise the awareness of them.  To be honest I would not be surprised if we find a lot of these "leaks" in Gaia.

(In reply to Nicholas Nethercote [:njn] from comment #3)
> nbp, what's the definition of a "scope chain leak"?

By "scope chain leak", I mean that the scope chain is holding memory which cannot be used by any of the functions which are still alive.

See the example and the discussion on dev.gaia[1].

[1] https://mail.mozilla.org/pipermail/firefox-dev/2013-July/000552.html
In fact, what I made & describe in comment 0 is similar to what Luke suggested in Bug 894971 comment 2.
(In reply to Nicolas B. Pierron [:nbp] from comment #4)
> (In reply to Jim Blandy :jimb from comment #1)
> > This is interesting! Won't it report a lot of false positives, though?
> 
> I don't think we would see any false positive, as the scope chain contains
> the minimal set of captured variables.

"False positive" isn't the right term, but I'd worry about lots of noise. Holding onto small strings and small objects doesn't really matter, but might swamp the results. It'd be nice to be able to sort by total retained size for each one.

> (In reply to Jim Blandy :jimb from comment #2)
> > It's worth keeping in mind that it's not possible to be perfect.
> 
> Indeed, we cannot capture all logical leaks, but reporting at least some of
> the category will raise the awareness of them.  To be honest I would not be
> surprised if we find a lot of these "leaks" in Gaia.

Yeah, I wouldn't worry about the dynamically-unreachable ones (ie, where reachability is dependent on control flow.)

> 
> (In reply to Nicholas Nethercote [:njn] from comment #3)
> > nbp, what's the definition of a "scope chain leak"?
> 
> By "scope chain leak", I mean that the scope chain is holding memory which
> cannot be used by any of the functions which are still alive.

These are leaks due to closures keeping full scope objects alive, rather than just their captured variables, where the scope objects hold onto all variables captured by any closures. (At least, I *think* that's what we're talking about.)
(In reply to Steve Fink [:sfink] from comment #6)
> "False positive" isn't the right term, but I'd worry about lots of noise.
> Holding onto small strings and small objects doesn't really matter, but
> might swamp the results. It'd be nice to be able to sort by total retained
> size for each one.

The way I thought is that we can have some about:config pref for reporting these errors, so adding an extra pass to report the memory hold by each property sounds doable, in which case this would be a cheap implementation of the dominator tree.

I think that the work around is easy (x = null;) to put in place, that the size should not matter much and that people should seek to put the number of messages to 0, even if the object is tiny.
> > (In reply to Nicholas Nethercote [:njn] from comment #3)
> > > nbp, what's the definition of a "scope chain leak"?
> > 
> > By "scope chain leak", I mean that the scope chain is holding memory which
> > cannot be used by any of the functions which are still alive.
> 
> These are leaks due to closures keeping full scope objects alive, rather
> than just their captured variables, where the scope objects hold onto all
> variables captured by any closures. (At least, I *think* that's what we're
> talking about.)

This can also be something with 2 different callbacks capturing different set of variables.  When one callback is unregistered, we are leaking all the properties which are only used by this callback.
(In reply to Nicolas B. Pierron [:nbp] from comment #7)
> (In reply to Steve Fink [:sfink] from comment #6)
> > These are leaks due to closures keeping full scope objects alive, rather
> > than just their captured variables, where the scope objects hold onto all
> > variables captured by any closures. (At least, I *think* that's what we're
> > talking about.)
> 
> This can also be something with 2 different callbacks capturing different
> set of variables.  When one callback is unregistered, we are leaking all the
> properties which are only used by this callback.

Huh? Sorry, what's the difference? The callbacks are closures. AIUI, the reason why your example is problematic is because the surviving callback keeps the full scope object alive, and that scope object holds onto the variables captured by both callbacks.
Assignee: general → nobody
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.