Switch cx->enumerators from a stack to a weak list

RESOLVED FIXED in Firefox 18

Status

()

Core
JavaScript Engine
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: dvander, Assigned: dvander)

Tracking

(Depends on: 1 bug)

unspecified
mozilla21
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox18+ fixed, firefox19+ fixed, firefox20+ fixed, firefox21 fixed)

Details

Attachments

(1 attachment)

(Assignee)

Description

4 years ago
cx->enumerators is a balanced stack of active iterator objects. In practice, it is really difficult to keep it correctly balanced, because the JS engine has so many different ways of leaving execution modes, and the try-note-iter mechanism itself is fairly delicate.

Since the enumerator list is unrooted, when it becomes unbalanced, its items can become garbage collected and later crash. This is likely behind topcrash bug 806820.

Instead, cx->enumerators should just be a doubly-linked list. Adding and removing items would still be fast, but we wouldn't require any LIFO ordering. The GC could then automatically remove entries which were not live.

Comment 1

4 years ago
Tracking for FF19 given this may be a fix we uplift to resolve bug 806820.
tracking-firefox19: --- → +
(Assignee)

Comment 2

4 years ago
Created attachment 703713 [details] [diff] [review]
fix

Passes jit-tests, sending to try. The only concern I have is around generators, which are only used in chrome code. Generators had their own enumerator list, and it was swapped with cx->enumerators when the generator ran. That means deleted properties could have shown up in a resumed generator, if the property was deleted when the generator was yielded. Now, the deleted property will be suppressed.

It's not the semantic change that worries me too much, but that if a generator yields in a for-in loop, then never resumes, it'll never remove its entry off compartment->enumerators. So that could leave extra stuff dangling until the next GC, but it should be rare.
(Assignee)

Comment 3

4 years ago
Comment on attachment 703713 [details] [diff] [review]
fix

https://tbpl.mozilla.org/?tree=Try&rev=7172debda82e
Attachment #703713 - Flags: review?(wmccloskey)
Comment on attachment 703713 [details] [diff] [review]
fix

Review of attachment 703713 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsiter.cpp
@@ +1102,5 @@
>                      /*
> +                     * Root the iterobj. This loop can GC, so we want to make sure that if
> +                     * the GC removes any elements from the list, it won't remove this one.
> +                     */
> +                    RootedObject iterobj(cx, ni->iterObj());

I think it would be better to make this an AutoObjectRooter for now.

@@ +1103,5 @@
> +                     * Root the iterobj. This loop can GC, so we want to make sure that if
> +                     * the GC removes any elements from the list, it won't remove this one.
> +                     */
> +                    RootedObject iterobj(cx, ni->iterObj());
> +                    (void)iterobj;

In that case, this line should be unnecessary, I think.

::: js/src/jsiter.h
@@ +30,5 @@
>  
>  struct NativeIterator
>  {
> +    HeapPtrObject obj;                  // Object being iterated.
> +    HeapPtrObject iterObj_;             // Internal iterator object.

We never mark iterObj_, so I think it should just be a raw pointer.
Attachment #703713 - Flags: review?(wmccloskey) → review+

Comment 5

4 years ago
dvander, could we land this on trunk? Given that bug 806820 is still a notable concern even on 18, I'd like to verify on trunk that this fix helps it, and then see this uplifted to aurora and beta - there's even still the possibility out there that we might think about a 18.0.2 just for this one.

Comment 6

4 years ago
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #5)
> dvander, could we land this on trunk? Given that bug 806820 is still a
> notable concern even on 18, I'd like to verify on trunk that this fix helps
> it, and then see this uplifted to aurora and beta - there's even still the
> possibility out there that we might think about a 18.0.2 just for this one.

Yep, let's get this onto Nightly/Aurora this week, and land no later than next Tuesday (in preparation for b4).
Blocks: 830654
David's try run looked good, so I pushed this. And I just realized that the author isn't set properly. Whoops.

https://hg.mozilla.org/integration/mozilla-inbound/rev/09ab58c800a1
https://hg.mozilla.org/mozilla-central/rev/09ab58c800a1
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21

Comment 9

4 years ago
Great! There are no crashes in 21.0a1/20130125 and above.
It should be uplifted to Aurora and Beta.
status-firefox21: --- → fixed

Comment 10

4 years ago
and maybe a 18.0.2 considering its very high volume.
tracking-firefox18: --- → ?

Updated

4 years ago
tracking-firefox20: --- → ?
OS: Linux → All
Hardware: x86_64 → All

Updated

4 years ago
status-firefox18: --- → affected
status-firefox19: --- → affected
status-firefox20: --- → affected
tracking-firefox20: ? → +
Added Marcia as she was one of the few ones who experienced the related Facebook problem.
(Assignee)

Comment 12

4 years ago
Comment on attachment 703713 [details] [diff] [review]
fix

[Approval Request Comment]
Bug caused by (feature/regressing bug #): unknown
User impact if declined: random crashes with slow script dialog
Testing completed (on m-c, etc.): yes
Risk to taking this patch (and alternatives if risky): details in comment #2, but we haven't seen fallout so I'm no longer worried
String or UUID changes made by this patch:
Attachment #703713 - Flags: approval-mozilla-beta?
Attachment #703713 - Flags: approval-mozilla-aurora?
Comment on attachment 703713 [details] [diff] [review]
fix

Approving on Aurora/Beta as this crash is very high across all the channels. Lets hope the crash volume goes down with this and we get more testing .

Please land no later than EOD today/early tomorrow morning on beta to make sure this gets into 19.0b4 going to build tomorrow.Thank you.
Attachment #703713 - Flags: approval-mozilla-beta?
Attachment #703713 - Flags: approval-mozilla-beta+
Attachment #703713 - Flags: approval-mozilla-aurora?
Attachment #703713 - Flags: approval-mozilla-aurora+
(Assignee)

Comment 14

4 years ago
https://hg.mozilla.org/releases/mozilla-beta/rev/12cc07fac263

Updated

4 years ago
status-firefox19: affected → fixed
(Assignee)

Comment 15

4 years ago
https://hg.mozilla.org/releases/mozilla-aurora/rev/7b8f0863a844

Updated

4 years ago
status-firefox20: affected → fixed

Updated

4 years ago
tracking-firefox18: ? → +
(Assignee)

Comment 16

4 years ago
https://hg.mozilla.org/releases/mozilla-release/rev/92d3a5aa0c7d

Updated

4 years ago
status-firefox18: affected → fixed

Comment 17

4 years ago
As the original author of this code, I am wondering whether this fix really is a fix, or whether its papering over the real problem: a missing CloseIterator somewhere. With this change, along that path we won't mark the iterator as inactive, and thus we won't be able to reuse it any more, and we are potentially accumulating a long list of pretty expensive iterator allocations. David, what do you think?
:dvander,

There were no reliable STR in bug 806820(fixed by patch in this bug), so its hard for QA to verify anything other than looking at the crash-stats for volume going down.

Was wondering if you could please help QA with some hints regarding what area should they be checking for any possible regressions or any other pointers you may have for  testing/verifying this patch ?
(Assignee)

Comment 19

4 years ago
(In reply to Andreas Gal :gal from comment #17)
> As the original author of this code, I am wondering whether this fix really
> is a fix, or whether its papering over the real problem: a missing
> CloseIterator somewhere.

Yes, we're papering over a problem, but only in the sense that the problem was unnecessary. Given that (1) we had no reasonable leads (2) the crash was pervasive and (3) the invariants required were subtle and complicated with 3 execution modes, it made more sense to completely eliminate the crash risk.

> With this change, along that path we won't mark the
> iterator as inactive, and thus we won't be able to reuse it any more, and we
> are potentially accumulating a long list of pretty expensive iterator
> allocations. David, what do you think?

The long list was indeed a concern, but the underlying bug is rare, and the list should get wiped out by GC. We are still marking the iterators as inactive, and not doing so is still a bug, but only a performance one rather than a GC violation.
(Assignee)

Comment 20

4 years ago
(In reply to bhavana bajaj [:bajaj] from comment #18)
> :dvander,
> 
> There were no reliable STR in bug 806820(fixed by patch in this bug), so its
> hard for QA to verify anything other than looking at the crash-stats for
> volume going down.

That's probably our best bet, lacking STR. The goal of this bug was to eliminate the possibility of the crash ever happening, so if the crash reports stop, it's probably fixed.

> Was wondering if you could please help QA with some hints regarding what
> area should they be checking for any possible regressions or any other
> pointers you may have for  testing/verifying this patch ?

Jesse and Bill had a test case that would crash (involving the slow script dialog), but Bill said it only repro'd once and he estimated the reproducibility at 1/100.

Comment 21

4 years ago
I am ok with this approach, but we should have a follow-up bug on file for the actual bug that we still need to fix (CloseIterator missing). David, what do you think?

Comment 22

4 years ago
dvander?
(Assignee)

Comment 23

4 years ago
(In reply to Andreas Gal :gal from comment #21)
> I am ok with this approach, but we should have a follow-up bug on file for
> the actual bug that we still need to fix (CloseIterator missing). David,
> what do you think?

Sure, we could, but I don't see it being actively investigated as (1) it would be very low impact/priority (2) we never had any actionable leads in the first place.

I did take out the assert that would catch an unbalanced enumerator stack. I'll add that assert back in, to make sure we don't accidentally regress the state of things.

Updated

4 years ago
Depends on: 838835
(Assignee)

Comment 24

4 years ago
filed bug 839726 to get rid of SuppressDeletedProperty and compartment->enumerators, that's probably the best path forward
You need to log in before you can comment on or make changes to this bug.