Closed Bug 1119579 Opened 5 years ago Closed 5 years ago

Assertion failure: !comp.ref().done(), at gc/Zone.h

Categories

(Core :: JavaScript Engine, defect, critical)

x86_64
macOS
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla37
Tracking Status
firefox35 --- wontfix
firefox36 --- fixed
firefox37 + fixed
firefox-esr31 36+ fixed
b2g-v1.4 --- fixed
b2g-v2.0 --- fixed
b2g-v2.0M --- fixed
b2g-v2.1 --- fixed
b2g-v2.1S --- fixed
b2g-v2.2 --- fixed

People

(Reporter: gkw, Assigned: shu)

References

(Blocks 2 open bugs)

Details

(4 keywords, Whiteboard: [adv-main36+][adv-esr31.5+])

Attachments

(4 files)

The upcoming testcase asserts js debug shell on m-c changeset 70de2960aa87 with  --fuzzing-safe --gc-zeal=14 --no-threads --ion-eager at Assertion failure: !comp.ref().done(), at gc/Zone.h

Debug configure options:

CC="clang -Qunused-arguments" CXX="clang++ -Qunused-arguments" AR=ar AUTOCONF=/usr/local/Cellar/autoconf213/2.13/bin/autoconf213 sh /Users/skywalker/trees/mozilla-central/js/src/configure --target=x86_64-apple-darwin12.5.0 --enable-debug --enable-optimize --enable-nspr-build --enable-more-deterministic --with-ccache --enable-gczeal --enable-debug-symbols --disable-tests

This was found by combining random js tests together with jsfunfuzz, the specific file(s) is/are:

http://hg.mozilla.org/mozilla-central/file/70de2960aa87/js/src/tests/ecma/LexicalConventions/7.4.3-14-n.js
http://hg.mozilla.org/mozilla-central/file/70de2960aa87/js/src/jit-test/tests/TypedObject/jit-read-unsized.js
http://hg.mozilla.org/mozilla-central/file/70de2960aa87/js/src/jit-test/tests/ion/bug725000.js
http://hg.mozilla.org/mozilla-central/file/70de2960aa87/js/src/jit-test/tests/debug/Debugger-onNewGlobalObject-02.js
http://hg.mozilla.org/mozilla-central/file/70de2960aa87/js/src/jit-test/tests/debug/Debugger-findAllGlobals-01.js

A bisection is coming up.

Setting s-s as a start since this is a scary assert, as Shu-yu puts it over IRC. Also setting needinfo? from him.
Flags: needinfo?(shu)
Attached file Archive.zip
Attached file stack
(lldb) bt 5
* thread #1: tid = 0x90ce7, 0x00000001004df744 js-dbg-opt-64-dm-nsprBuild-darwin-70de2960aa87`js::CompartmentsIterT<js::ZonesIter>::next(this=<unavailable>) + 468 at Zone.h:446, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x00000001004df744 js-dbg-opt-64-dm-nsprBuild-darwin-70de2960aa87`js::CompartmentsIterT<js::ZonesIter>::next(this=<unavailable>) + 468 at Zone.h:446
    frame #1: 0x0000000100641298 js-dbg-opt-64-dm-nsprBuild-darwin-70de2960aa87`js::Debugger::findAllGlobals(cx=0x0000000101f01cf0, argc=<unavailable>, vp=0x00007fff5fbfd488) + 536 at Debugger.cpp:3726
    frame #2: 0x000000010069e742 js-dbg-opt-64-dm-nsprBuild-darwin-70de2960aa87`js::CallJSNative(cx=0x0000000101f01cf0, native=0x0000000100641080, args=0x00007fff5fbfd3e0)(JSContext*, unsigned int, JS::Value*), JS::CallArgs const&) + 194 at jscntxtinlines.h:229
    frame #3: 0x00000001006495f4 js-dbg-opt-64-dm-nsprBuild-darwin-70de2960aa87`js::Invoke(cx=<unavailable>, args=<unavailable>, construct=<unavailable>) + 420 at Interpreter.cpp:495
    frame #4: 0x0000000100632bcf js-dbg-opt-64-dm-nsprBuild-darwin-70de2960aa87`js::Invoke(cx=0x0000000101f01cf0, thisv=<unavailable>, fval=<unavailable>, argc=<unavailable>, argv=<unavailable>, rval=<unavailable>) + 815 at Interpreter.cpp:558
(lldb)
autoBisect shows this is probably related to the following changeset:

The first bad revision is:
changeset:   https://hg.mozilla.org/mozilla-central/rev/b434e8ec434c
user:        Lars T Hansen
date:        Wed Jan 07 08:05:25 2015 +0100
summary:     Bug 1117764: disable PJS in Nightly builds. r=shu

Not sure if this is correct?
I don't see how that bisection can be right
I can't reproduce this on either OS X or Linux x64. What is the revision you reproduced this on?
Flags: needinfo?(shu)
(In reply to Shu-yu Guo [:shu] from comment #5)
> I can't reproduce this on either OS X or Linux x64. What is the revision you
> reproduced this on?

The revision is 70de2960aa87 and it's in comment 0. I also originally found this on Linux.
(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #6)
> (In reply to Shu-yu Guo [:shu] from comment #5)
> > I can't reproduce this on either OS X or Linux x64. What is the revision you
> > reproduced this on?
> 
> The revision is 70de2960aa87 and it's in comment 0. I also originally found
> this on Linux.

Okay, I reproduced with that revision. Sorry I missed it comment 0!

Sometime between that revision and current tip on m-c, it no longer crashes. Let me bisect real quick...
I have apparently fixed this somehow, but I have no idea how. I'll just look at the bug and see what it is.

The first good revision is:
changeset:   222596:7584b643e7e9
user:        Shu-yu Guo <shu@rfrn.org>
date:        Wed Jan 07 01:18:42 2015 -0800
summary:     Bug 1118038 - Remove JIT parts of PJS. (r=lth)
Existing bug. Yay, not my fault.

We're collecting a compartment from under ourselves in findAllGlobals. How bad
is this, Terrence? Seems at least kinda bad.
Attachment #8546363 - Flags: review?(terrence)
Comment on attachment 8546363 [details] [diff] [review]
Don't GC while iterating compartments in findAllGlobals.

Review of attachment 8546363 [details] [diff] [review]:
-----------------------------------------------------------------

The last compartment in a zone will always be preserved, but (1) the whole zone could die, and (2) that doesn't help anyway because you could still end up pointing past the end of the compartments.

I'm going to r+, but I think that in addition to this fix we need to lie to the analysis by telling it that compartment iterators are GC things. Then the hazard analysis will catch this even without the AutoCheckCannotGC. (I'm happy to leave the nogc in, though, as documentation and a dynamic check.) That might scare up some false positives, but it would be good to at least try.

Thanks for the fix.
Attachment #8546363 - Flags: review?(terrence) → review+
https://hg.mozilla.org/mozilla-central/rev/9d9610c574b4
Status: NEW → RESOLVED
Closed: 5 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
Target Milestone: --- → mozilla37
Does this affect older branches too? Since it sounds like the PJS removal didn't really cause this.

Speaking of which, this should probably have gotten a security rating and affected branches determined before landing.
https://wiki.mozilla.org/Security/Bug_Approval_Process
Assignee: nobody → shu
Keywords: sec-high
How far back is affected by this?
Flags: needinfo?(shu)
This crash is very hard to trigger, but goes all the way back to esr31.
Now that we've checked in a sec-high on 37, we'll need a patch created and nominated for 36 and ESR31.

Could you answer the questions in the sec-approval? template, as well? This should have gone through sec-approval before checkin.
Flags: needinfo?(shu)
Why should this have gone through sec-approval? It was central only at the time.
(In reply to Shu-yu Guo [:shu] from comment #17)
> Why should this have gone through sec-approval? It was central only at the
> time.

Oh doy, no it wasn't; the triggering mechanism was, but the bug was present. Sorry.
Comment on attachment 8546363 [details] [diff] [review]
Don't GC while iterating compartments in findAllGlobals.

[Security approval request comment]
How easily could an exploit be constructed based on the patch?

Seems to me to be very difficult. Unsure if it is exploitable, but triggering requires setting up globals *and* GC in such a way that seems difficult to achieve without the zeal parameter like we have in the testcase.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?

Kinda. Should omit the check-in comment.

Which older supported branches are affected by this flaw?

At least back to ESR31, most likely before. So all.

If not all supported branches, which bug introduced the flaw?

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?

In the process of getting an ESR31 clone to create the patch. I think ESR31 doesn't have AutoCheckCannotGC.

How likely is this patch to cause regressions; how much testing does it need?

Approval Request Comment
[Feature/regressing bug #]: No clue.
[User impact if declined]: Crashes in very rare circumstances when opening the debugger.
[Describe test coverage new/current, TreeHerder]: on m-c
[Risks and why]: Low, just bugfix.
[String/UUID change made/needed]: None
Flags: needinfo?(shu)
Attachment #8546363 - Flags: sec-approval?
Attachment #8546363 - Flags: approval-mozilla-beta?
Attachment #8546363 - Attachment is obsolete: true
Attachment #8546363 - Flags: sec-approval?
Attachment #8546363 - Flags: approval-mozilla-beta?
Comment on attachment 8546363 [details] [diff] [review]
Don't GC while iterating compartments in findAllGlobals.

Stupid bzexport auto-obsoleting...
Attachment #8546363 - Flags: sec-approval?
Attachment #8546363 - Flags: approval-mozilla-beta?
Attachment #8546363 - Attachment is obsolete: false
Attachment #8554954 - Flags: approval-mozilla-esr31?
Attachment #8546363 - Flags: sec-approval?
Attachment #8546363 - Flags: sec-approval+
Attachment #8546363 - Flags: approval-mozilla-beta?
Attachment #8546363 - Flags: approval-mozilla-beta+
Attachment #8554954 - Flags: approval-mozilla-esr31? → approval-mozilla-esr31+
Whiteboard: [adv-main36+][adv-esr31.5+]
Group: core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.