667011 - WeakMap values can be incorrectly classified as garbage by cycle collector

Assignee

Description

•

14 years ago

Igor pointed out that the patch in Bug 653248 will probably lead to the cycle collector sometimes classifying reachable JS objects as unreachable, which in turn will lead to the CC incorrectly freeing DOM objects that are reachable from JS side. I'm not sure what exactly happens, but I'd imagine that in the worst case it could make freed memory reachable from JS, potentially leading to crashes and other badness. The scenario he came up with is this: a WeakMap m is reachable from JS roots, and m contains a key k and value v which are reachable only from DOM. After the patch in Bug 653248, the GC will mark m black, and k and v grey. When the cycle collector runs, its graph will have an edge from m to v, but not from k to v (which is wrong). Because m is marked black, the CC will not traverse it. If v is part of what looks like a garbage cycle through DOM, then the CC will mark v as garbage, as it doesn't see that k is holding on to v. The DOM part of the cycle involving v will be freed, but this is still reachable from JS via m and k. With the previous behavior, the GC would mark the value black if either the map or the key is reachable in any way (ie are marked black or grey). This leaks, as it will never free cycles that pass through a WeakMap value, but is safe because it ensures that anything reachable from a reachable value is never freed. The short term solution is to back out the patch in Bug 653248. Longer term, the cycle collector needs a more accurate picture of the weak map edges. It seems likely this will require making the cycle collector actually aware of WeakMaps. The basic problem is that most edges act like an "or": if an object is reachable from any incoming edge, it is reachable. For a weak map, the map and key act more like "and" wrt the value: both map and key must be reachable for the value to be reachable via the weak map. I'll try to come up with a test for this.

Andreas Gal :gal

Updated

•

14 years ago

status-firefox5: --- → unaffected

status-firefox6: --- → affected

status-firefox7: --- → affected

Whiteboard: [sg:crit]

Jesse Ruderman

Updated

•

14 years ago

Whiteboard: [sg:crit] → [sg:critical]

Andrew McCreight [:mccr8]

Assignee

Comment 1

•

14 years ago

I believe that the behavior before Bug 653248 landed is safe (but leaky), so this should only affect Firefox 7. If I want to back out that patch, should I put up a patch in here to do that for review or what?

Assignee: nobody → continuation

Andreas Gal :gal

Updated

•

14 years ago

status-firefox6: affected → unaffected

Andreas Gal :gal

Comment 2

•

14 years ago

FF7 hasn't branched yet, so we should try to fix it for real (without a leak).

Andrew McCreight [:mccr8]

Assignee

Comment 3

•

14 years ago

Attached file test behavior of weak map and GC — Details

As part of looking at fixing the interaction of WeakMaps and the cycle collector, I've started working on some exhaustive tests. To start with, I have this test, that tests the interaction of WeakMaps with the JS GC. It tests out various combinations of the WeakMap and key being alive or dead, and sees if the value is alive or not. The value is reachable only via the WeakMap. The only time the value should be live is when both the WeakMap and the key are alive. I do this by using finalizeCount(). Of the four combinations, most work as expected, but oddly when the WeakMap is dead but the key is live the value is not being collected. It looks like this is because the WeakMap itself is not being collected. I don't know if this is a problem with the handling of WeakMaps or (more likely...) some kind of problem with my test, as this is basically the first JS test I have written. I'm not really sure offhand how this could happen, as I'd think that if the map is unreachable that the WeakMap code wouldn't even be invoked.

Andrew McCreight [:mccr8]

Assignee

Comment 4

•

14 years ago

Oh, and I meant to add, is there any way to examine the JS GC behavior in a script that also has DOM nodes around? I want to observe the behavior of WeakMaps when grey objects are involved (eg reachable from XPCOM). It looks like finalizeCount is only available in jstests... can those interact with the DOM? At least when being tested with the browser, of course. In the short term, I can dump cycle collector graphs and examine those to figure out what is happening, but I wouldn't want to use those for automated regression testing.

Andreas Gal :gal

Comment 5

•

14 years ago

I am guessing the stack scanner is keeping your weak map alive.

Andrew McCreight [:mccr8]

Assignee

Comment 6

•

14 years ago

I return from the function where the weak map is defined, and overwrite the value in the local before I leave in an attempt to combat this, but I guess junk could still be hanging around somewhere the stack scanner would find it? I could try adding a random recursive loop to overwrite the stack if that might help.

Andrew McCreight [:mccr8]

Assignee

Comment 7

•

14 years ago

Hmm. Does look like it is something along those lines. I get this when I run that same function over and over again: counts: 0 --> 0 counts: 0 --> 1 counts: 1 --> 2 counts: 2 --> 3 counts: 3 --> 4 counts: 4 --> 5 counts: 5 --> 6 counts: 6 --> 8 First time, the value is not freed, then 1 value is freed every call, then two values are freed at the end. (This is also with a function that recurses to a depth of 1000 in an apparently vain attempt to clear out the stack.) So I guess it works out at the end.

Andrew McCreight [:mccr8]

Assignee

Comment 8

•

14 years ago

Hmm. I think that contrary to what I said earlier the original behavior was not safe either. Consider the following slight variation on Igor's original scenario: map is marked black, key is marked grey. Value will be (incorrectly) marked black. Value connects to an object that is part of a cycle that passes through DOM, but otherwise has no references. As in Igor's scenario, the CC will not trace through the value, because it is marked. Thus, the CC thinks that the cycle the value connects to is garbage, and collects the DOM part of it, and we end up with the same sort of dangling pointer into DOM-land.

status-firefox6: unaffected → affected

Andrew McCreight [:mccr8]

Assignee

Comment 9

•

14 years ago

Wait, no, I am dumb. In that case, the JS GC will mark the cycle that the value is connected to black, so the CC won't free it. Sorry for the bugspam.

status-firefox6: affected → unaffected

Andrew McCreight [:mccr8]

Assignee

Comment 10

•

14 years ago

I'm going to back out Bug 653248 today to fix this until we can get the CC-WeakMap interaction fixed.

Peter Van der Beken [:peterv]

Comment 11

•

14 years ago

I don't see a solution for this without actually recording the edges m->v and k->v (which is why we needed bug 548733, no?).

Andrew McCreight [:mccr8]

Assignee

Comment 12

•

14 years ago

Adding m->v and k->v edges to the graph doesn't actually give us the right results. That will make v black if either m or k are black, whereas we need v to be black only if both m and k are black. Igor had an idea that sounds reasonable to me. First, don't add any edges to the CC graph for WeakMaps. Then, after ScanRoots() has run, iterate over all of the WeakMaps. If a WeakMap and a key were marked black, flood black on the value (if it isn't black already). Otherwise, flood white (if the value is grey). Like the GC's WeakMap algorithm, if you end up doing anything during a scan, you must rescan the WeakMap list, because you may have changed the color of a WeakMap. The only modification to his idea that I think we need is that we need to add all of the WeakMap values (and anything reachable from there) to the CC graph while we are building it, because objects may be reachable only via the WeakMap, and we don't want to have to add them later. We don't have to add the map or key, but when we're scanning the WeakMap list we need to handle the case where the key or value are not in the graph, which is the same as them being marked white. We could also add another optimization that we only add a WeakMap value if both the map and key are in the graph (after we've otherwise finished building the graph), but this will require the same sort of doing another pass over the WeakMaps if something is changed.

Andrew McCreight [:mccr8]

Assignee

Comment 13

•

14 years ago

Backout has landed in moz-central, so I'm marking Firefox 7 as unaffected.

status-firefox7: affected → unaffected

Peter Van der Beken [:peterv]

Comment 14

•

14 years ago

(In reply to comment #12) > The only modification to his idea that I think we need is that we need to > add all of the WeakMap values (and anything reachable from there) to the CC > graph while we are building it, because objects may be reachable only via > the WeakMap, and we don't want to have to add them later. Only if they're not black though, right?

Andrew McCreight [:mccr8]

Assignee

Comment 15

•

14 years ago

Do you mean, we don't want to add the value if it has been marked black by the JS GC? That's a good point.

Andrew McCreight [:mccr8]

Assignee

Comment 16

•

14 years ago

Also, we'll need to update the CC graph dumping code to do WeakMaps correctly.

Daniel Veditz [:dveditz]

Comment 17

•

14 years ago

The backout "fixes" this bug (on 7/central). When bug 653248 finally lands it should incorporate fixes required to make this bug not happen again.

Blocks: 653248

Status: NEW → RESOLVED

Closed: 14 years ago

status-firefox7: unaffected → fixed

Resolution: --- → FIXED

Andrew McCreight [:mccr8]

Assignee

Comment 18

•

14 years ago

I filed Bug 668855 for general discussion of making the cycle collector work with WeakMaps. This bug is mostly about how half a fix causes dangling pointers.

Jesse Ruderman

Updated

•

14 years ago

Group: core-security

Daniel Veditz [:dveditz]

Updated

•

14 years ago

Keywords: regression

Target Milestone: --- → mozilla7

Mihaela Velimiroviciu (:mihaelav)

Comment 19

•

14 years ago

Can anyone please provide a test case or with STR / guidelines that I can use to verify this fix? Do I need to create any specific environment to test the fix? Thanks!

Andrew McCreight [:mccr8]

Assignee

Comment 20

•

14 years ago

We never had a test case. This was more of a theoretical problem.

Daniel Veditz [:dveditz]

Updated

•

14 years ago

status1.9.2: --- → unaffected

u279076

Comment 21

•

14 years ago

qa- as no QA verification needed

Whiteboard: [sg:critical] → [sg:critical][qa-]