Closed Bug 1510145 Opened 2 years ago Closed 2 years ago

Assertion failure: arena->bufferedCells()->isEmpty(), at js/src/gc/GC.cpp:2433 with recomputeWrappers

Categories

(Core :: JavaScript Engine, defect, P1)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla66
Tracking Status
firefox-esr60 --- unaffected
firefox63 --- wontfix
firefox64 --- wontfix
firefox65 + fixed
firefox66 + fixed

People

(Reporter: gkw, Assigned: jonco)

References

Details

(5 keywords, Whiteboard: [jsbugmon:][post-critsmash-triage][adv-main65+])

Attachments

(3 files, 1 obsolete file)

The following testcase crashes on mozilla-central revision a60b595747ad (build with --enable-debug, run with --fuzzing-safe --no-threads --no-baseline --no-ion --no-asmjs):

See attachment.

Backtrace:

#0  RelocateArena (arena=<optimized out>, sliceBudget=...) at js/src/gc/GC.cpp:2433
#1  js::gc::ArenaList::relocateArenas (this=0x7f5d6a652200, toRelocate=<optimized out>, relocated=0x0, sliceBudget=..., stats=...) at js/src/gc/GC.cpp:2489
#2  0x0000555c7cce8bf7 in js::gc::ArenaLists::relocateArenas (this=0x7f5d6a652098, relocatedListOut=@0x7fff0c9e6dc8: 0x0, reason=JS::gcreason::ALLOC_TRIGGER, sliceBudget=..., stats=...) at js/src/gc/GC.cpp:2573
#3  0x0000555c7cce9035 in js::gc::GCRuntime::relocateArenas (this=0x7f5d6a61c6e0, zone=0x7f5d6a652000, reason=JS::gcreason::ALLOC_TRIGGER, relocatedListOut=@0x7fff0c9e6dc8: 0x0, sliceBudget=...) at js/src/gc/GC.cpp:2592
#4  0x0000555c7cd06d0e in js::gc::GCRuntime::compactPhase (this=<optimized out>, reason=JS::gcreason::ALLOC_TRIGGER, sliceBudget=..., session=...) at js/src/gc/GC.cpp:7087
/snip

For detailed crash information, see attachment.

Unsure if this is s-s, as it seems to involve the shell function recomputeWrappers(). This is an intermittent testcase, you may have to change the path at the top of the testcase to your m-c repo.
autobisectjs shows this is probably related to the following changeset:

The first bad revision is:
changeset:   https://hg.mozilla.org/mozilla-central/rev/343c194c1a08
user:        Jan de Mooij
date:        Wed Sep 19 11:42:45 2018 +0200
summary:     Bug 1492406 - Add a recomputeWrappers function to the JS shell for js::RecomputeWrappers. r=jonco

Jan, is bug 1492406 a likely regressor?

(also, the more I reduce, the less reliable the testcase becomes, this is already reduced quite a bit)
Blocks: 1492406
Flags: needinfo?(jdemooij)
Summary: Assertion failure: arena->bufferedCells()->isEmpty(), at js/src/gc/GC.cpp:2433 involving recomputeWrappers() → Assertion failure: arena->bufferedCells()->isEmpty(), at js/src/gc/GC.cpp:2433 involving recomputeWrappers
Summary: Assertion failure: arena->bufferedCells()->isEmpty(), at js/src/gc/GC.cpp:2433 involving recomputeWrappers → Assertion failure: arena->bufferedCells()->isEmpty(), at js/src/gc/GC.cpp:2433 with recomputeWrappers
Hm we're relocating an arena that has a non-empty bufferedCells. I think the recomputeWrappers shell function is either innocent and without it the test just doesn't complete or it exposed a real GC-related bug.
Flags: needinfo?(jdemooij) → needinfo?(jcoppeard)
Assignee: nobody → jcoppeard
Flags: needinfo?(jcoppeard)
Attached patch bug1510145 (obsolete) — Splinter Review
This is due to the changes in bug 1407143.

It's possible for us to start an incremental GC slice in the Mark state with the nursery empty, but with entries in the whole cell store buffer.  (The reason for the latter is that JSObject::swap conservatively puts both argument objects into this store buffer in case either of them has children that are in the nursery without checking whether this is the case.)  Therefore the check at the end of the Mark state whether we need to return to gcCycle() to evict the nursery is not sufficient.

I've refactored gcCycle() to move the budgetIncrementalGC() call above the call to collect the nursery so that we can avoid the situation where we switch to a non-incremental GC after we decided not to collect the nursery.

This meant changing resetIncrementalGC() a little as that now gets called before evicting the nursery and without the AutoGCSession.
Attachment #9028303 - Flags: review?(pbone)
Depends on: 1407143
Comment on attachment 9028303 [details] [diff] [review]
bug1510145

Review of attachment 9028303 [details] [diff] [review]:
-----------------------------------------------------------------

My main concern is the splitting up of AutoGCSession's critical section.  Can you confirm that I've understood it correctly, that it won't be a problem?

The patch also refactors code a little more than is necessary to make the patch work.  I'd usually prefer that as seperate patches, but it's fairly trivial in this case and I'm probably just nit-picking.  Does our team have a policy about this?  This is probably just personal preference and I'm happy to accept the change in any case.

Thanks.

::: js/src/gc/GC.cpp
@@ +7207,5 @@
> +    }
> +
> +    minorGC(JS::gcreason::RESET, gcstats::PhaseKind::EVICT_NURSERY_FOR_MAJOR_GC);
> +
> +    AutoGCSession session(rt, JS::HeapState::MajorCollecting);

We can move this initialisation into just the places where it is needed (the calls to incrementalSlice()).  This'll avoid taking the lock when it's not required.

@@ +7313,5 @@
>          break;
>        }
> +
> +      default:
> +        MOZ_CRASH("Unexpected GC state");

Please use case State::NotActive here, then if we ever add new states we'll get a compiler error/warning for this code and know we need to revisit it.  Which will tell us sooner than a crash.

@@ +7461,5 @@
>      }
>  
>      switch (incrementalState) {
>        case State::NotActive:
> +        incMajorGcNumber();

nit: This seems like an unrelated change, I'd usually prefer a seperate patch.  I like this change though, it's neater.

@@ +7853,5 @@
> +    rt->mainContextFromOwnThread()->verifyIsSafeToGC();
> +
> +    // It's ok if threads other than the main thread have suppressGC set, as
> +    // they are operating on zones which will not be collected from here.
> +    MOZ_ASSERT(!rt->mainContextFromOwnThread()->suppressGC);

nit: This also seems like a kinda-related change.

@@ +7860,5 @@
>      AutoCallGCCallbacks callCallbacks(*this);
>  
>      gcstats::AutoGCSlice agc(stats(), scanZonesBeforeGC(), invocationKind, budget, reason);
>  
> +    auto result = budgetIncrementalGC(nonincrementalByAPI, reason, budget);

By moving this here, and it can call resetIncrementalGC which creates a GC session we've split a aut AutoGCSession into two seperate critical sections.  This was the main thing stopping me from writing this code like this in the first place, I was worried that that critical section might me protecting some things that needed to be atomic.  If you're confident that it doesn't than that's okay and we can land this.  I wasn't confident when I did Bug 1407143.

It looks like we only use the AutoGCSession to protect the calls to incrementalSlice() (within resetIncrmentalGC()): so think that when those execute we about this GC without taking this session, so it's probably okay.
Priority: -- → P1
Keywords: sec-high
Blocks: 1407143
No longer blocks: 1492406
No longer depends on: 1407143
(In reply to Paul Bone [:pbone] from comment #6)
> My main concern is the splitting up of AutoGCSession's critical section. 
> Can you confirm that I've understood it correctly, that it won't be a
> problem?

Yes, this is fine.  This is used to set the heap state to indicate that we're currently in the GC.  This isn't about protected from races with other threads.

> The patch also refactors code a little more than is necessary to make the
> patch work.  I'd usually prefer that as seperate patches, but it's fairly
> trivial in this case and I'm probably just nit-picking.  

Yes I did shuffle some things around.  Possibly I should have split that out.

> Does our team have a policy about this? 

I don't think there's an explicit policy, just to do whatever makes sense.

> We can move this initialisation into just the places where it is needed (the
> calls to incrementalSlice()).  This'll avoid taking the lock when it's not
> required.

This doesn't actually take a lock, it just sets the heap state.  I'd rather keep the current (fewer) uses if that's OK.

> Please use case State::NotActive here, then if we ever add new states we'll
> get a compiler error/warning for this code and know we need to revisit it. 

Ah yes, good point.  I'll do that.
Attached patch bug1510145 v2Splinter Review
Updated (and reformatted) patch.
Attachment #9028303 - Attachment is obsolete: true
Attachment #9028303 - Flags: review?(pbone)
Attachment #9029342 - Flags: review?(pbone)
On further thought I think it's possible to simplify the way resetIncrementalGC() works so that it doesn't call incrementalSlice() directly, but arranges for the following call in gcCycle to do the appropriate work (finishing the current GC).  collect would then call gcCycle again as it does now.  We would then only need to have one use of AutoGCSession.  I'll into this after this lands.
Blocks: 1512045
(In reply to Jon Coppeard (:jonco) from comment #7)
> (In reply to Paul Bone [:pbone] from comment #6)
> > My main concern is the splitting up of AutoGCSession's critical section. 
> > Can you confirm that I've understood it correctly, that it won't be a
> > problem?
> 
> Yes, this is fine.  This is used to set the heap state to indicate that
> we're currently in the GC.  This isn't about protected from races with other
> threads.

Okay.

> > The patch also refactors code a little more than is necessary to make the
> > patch work.  I'd usually prefer that as seperate patches, but it's fairly
> > trivial in this case and I'm probably just nit-picking.  
> 
> Yes I did sht's ouffle some things around.  Possibly I should have split that out.

it's fine in this case, I wouldn't worry about the effort for this time.

> > We can move this initialisation into just the places where it is needed (the
> > calls to incrementalSlice()).  This'll avoid taking the lock when it's not
> > required.
> 
> Ths doesn't actually take a lock, it just sets the heap state.  I'd rather
> keep the current (fewer) uses if that's OK.

In that case, I agree it's better to have less code.

r+
Comment on attachment 9029342 [details] [diff] [review]
bug1510145 v2

Review of attachment 9029342 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks Jon, all good.
Attachment #9029342 - Flags: review?(pbone) → review+
Comment on attachment 9029342 [details] [diff] [review]
bug1510145 v2

[Security Approval Request]

How easily could an exploit be constructed based on the patch?: Very difficult.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: No

Which older supported branches are affected by this flaw?: 63 and onwards are affectet

If not all supported branches, which bug introduced the flaw?: Bug 1407143

Do you have backports for the affected branches?: No

If not, how different, hard to create, and risky will they be?: Same patch should apply, or should be trivial to rebase.

How likely is this patch to cause regressions; how much testing does it need?: Unlikely to cause regressions.
Attachment #9029342 - Flags: sec-approval?
Comment on attachment 9029342 [details] [diff] [review]
bug1510145 v2

[Triage Comment]
sec-approval=dveditz

We also want this on beta 65, a=dveditz
Attachment #9029342 - Flags: sec-approval?
Attachment #9029342 - Flags: sec-approval+
Attachment #9029342 - Flags: approval-mozilla-beta+
https://hg.mozilla.org/mozilla-central/rev/aba3963bc765
Group: javascript-core-security → core-security-release
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Flags: qe-verify-
Whiteboard: [jsbugmon:] → [jsbugmon:][post-critsmash-triage]
Whiteboard: [jsbugmon:][post-critsmash-triage] → [jsbugmon:][post-critsmash-triage][adv-main65+]
Blocks: 1523816
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.