Assertion failure: [infer failure] Missing type for arg 1 in jsinfer.cpp

RESOLVED FIXED

Status

defect
8 years ago
8 years ago

People

(Reporter: dbaron, Assigned: bhackett)

Tracking

({assertion})

Trunk
x86_64
Linux

Firefox Tracking Flags

(firefox7- unaffected, firefox8- unaffected, firefox9+ fixed, firefox10+ fixed, status1.9.2 unaffected)

Details

(Whiteboard: [sg:critical][qa+])

Attachments

(3 attachments, 1 obsolete attachment)

I've crashed twice in the past 24 hours (on a self-built Linux debug build based on mozilla-central cf1ba8f0dbf71900bb24d77b5a63ad82c279b113) with the following:

Assertion failure: _PT_PTHREAD_MUTEX_IS_LOCKED(lock->mutex), at /home/dbaron/builds/ssd/mozilla-central/mozilla/nsprpub/pr/src/pthreads/ptsynch.c:227

The first of the two crashes I debugged a bit:

#6  0x00007f5f9781dab6 in abort () at abort.c:92
#7  0x00007f5f988f89aa in PR_Assert (
    s=0x7f5f9891df88 "_PT_PTHREAD_MUTEX_IS_LOCKED(lock->mutex)",
    file=0x7f5f9891de68 "/home/dbaron/builds/ssd/mozilla-central/mozilla/nsprpub/pr/src/pthreads/ptsynch.c", ln=227)
    at /home/dbaron/builds/ssd/mozilla-central/mozilla/nsprpub/pr/src/io/prlog.c:587
#8  0x00007f5f9890d4a7 in PR_Unlock (lock=0x13d5350)
    at /home/dbaron/builds/ssd/mozilla-central/mozilla/nsprpub/pr/src/pthreads/ptsynch.c:227
#9  0x00007f5f95ef8460 in AutoUnlockGC (cx=0x7f5ef501a110,
    fmt=<value optimized out>)
    at /home/dbaron/builds/ssd/mozilla-central/mozilla/js/src/jscntxt.h:1822
#10 print (cx=0x7f5ef501a110, fmt=<value optimized out>)
    at /home/dbaron/builds/ssd/mozilla-central/mozilla/js/src/jsinfer.cpp:2314
#11 js::types::TypeFailure (cx=0x7f5ef501a110, fmt=<value optimized out>)
    at /home/dbaron/builds/ssd/mozilla-central/mozilla/js/src/jsinfer.cpp:338
#12 0x00007f5f962a7efb in js::mjit::stubs::AssertArgumentTypes (f=...)
    at /home/dbaron/builds/ssd/mozilla-central/mozilla/js/src/methodjit/StubCalls.cpp:2470
#13 0x00007f5f71834b97 in ?? ()

This looks like we're doing AutoUnlockGC when the GC is not locked.
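
For anyone unfamiliar with the pattern, here is a rough Python sketch of what "auto-unlock when not locked" means. It is only an analogy: `auto_unlock` and `gc_lock` are hypothetical names, and this is not the NSPR or SpiderMonkey code. A scoped helper releases a lock on entry and retakes it on exit, which is only valid if the caller actually holds the lock.

```python
import threading
from contextlib import contextmanager


@contextmanager
def auto_unlock(lock):
    """Temporarily release a lock the caller is assumed to hold, then retake it."""
    lock.release()  # raises RuntimeError if the lock is not currently held
    try:
        yield
    finally:
        lock.acquire()


gc_lock = threading.Lock()  # hypothetical stand-in for the GC lock

# Correct use: the lock is held on entry, free inside, held again on exit.
gc_lock.acquire()
with auto_unlock(gc_lock):
    free_inside = not gc_lock.locked()
held_after = gc_lock.locked()
gc_lock.release()

# Misuse: the lock is not held, so the release inside auto_unlock fails.
try:
    with auto_unlock(gc_lock):
        pass
    misuse_error = None
except RuntimeError as exc:
    misuse_error = exc
```

Python reports the misuse as a `RuntimeError`; in the debug build above the equivalent check is the fatal `_PT_PTHREAD_MUTEX_IS_LOCKED` assertion.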

The script being executed was Google-plus related, though I think I may have actually been on https://www.google.com/reader/view/  (I'm not going to paste URLs of the script or the frame because they may have auth info...)
Hmm, this should not crash here anymore on m-c tip (since a merge from JM last night), but the reason we were trying to unlock the GC was to dump state because of a type checking failure (the TypeFailure on the stack).  If you hit this again can you paste the dump and JS stack, or do you have more ideas about how to get this to repro?
(In reply to Brian Hackett from comment #1)
> Hmm, this should not crash here anymore on m-c tip (since a merge from JM
> last night), but the reason we were trying to unlock the GC was to dump
> state because of a type checking failure (the TypeFailure on the stack).  If
> you hit this again can you paste the dump and JS stack, or do you have more
> ideas about how to get this to repro?

Is "the dump" something that's printed to stdout?  If it's not going to crash anymore, what should I look for?
Yeah, it will print to stdout.  It will crash immediately afterwards, with an "[infer failure]" message in TypeFailure.
Posted file dump
Here's what it printed out before aborting.
Blocks: 688165
Summary: fatal NSPR assertion failure doing AutoUnlockGC when not locked: _PT_PTHREAD_MUTEX_IS_LOCKED(lock->mutex), at ptsynch.c:227 → Assertion failure: [infer failure] Missing type for arg 1 in jsinfer.cpp
So with the new debugging patch applied, I now get:

Assertion failure: [infer failure] Missing type pushed 0: [0x7f13a34e2e80], at /home/dbaron/builds/ssd/mozilla-central/mozilla/js/src/jsinfer.cpp:341

at the stack:

#4  <signal handler called>
#5  0x00007f13d3ed5b3b in raise (sig=<value optimized out>)
    at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
#6  0x00007f13d189dd66 in js::types::TypeFailure (cx=<value optimized out>, 
    fmt=<value optimized out>)
    at /home/dbaron/builds/ssd/mozilla-central/mozilla/js/src/jsinfer.cpp:341
#7  0x00007f13d18a003e in js::types::TypeScript::CheckBytecode (
    cx=0x7f13ad28d1a0, script=0x7f13a39582e0, pc=0x734b35c "7T", 
    sp=0x7f13b965a728)
    at /home/dbaron/builds/ssd/mozilla-central/mozilla/js/src/jsinfer.cpp:5510
#8  0x00007f13b3a26bfe in ?? ()
And, for the record, the latest crash was following a link to:
https://plus.google.com/photos/114490712483753086051/albums/5655320612646245745/5655320611468053890
and letting the page sit there briefly.  (It was something @limi tweeted; still not sure what it's a picture of, though.)
It is a picture of a bag of gummi worms, labelled "fat free".
Posted file dump with new patch
The dump with https://hg.mozilla.org/mozilla-central/rev/29c8fccd95ba applied to my tree, plus the output of js_DumpPC(cx).
Any progress here?  Leaving debug builds crashy for long periods of time really isn't ok.
I haven't been able to reproduce this.
(In reply to Brian Hackett from comment #10)
> I haven't been able to reproduce this.

Try this:

go to plus.google.com and sign in.

load https://plus.google.com/108176814619778619437/posts/RmbumJ1HmLX

press ^r to reload.

that crashes for me today 100% of the time with the jsinfer assertion.

I'm working on spdy support, so I spend a lot of time trawling google for test cases with debug builds :)
Group: core-security
Posted patch patch (obsolete)
Thanks, with the latest instructions I was able to repro.  Sorry about the long delay here.

This is a reentrancy problem while solving type constraints.  Type barriers (a kind of runtime check for improving analysis precision) are added for JS opcodes during type inference, and can also be forcibly removed in some situations (when analyzing how properties are added when calling 'new' on a script).  The problem was in removing type barriers: if removing a barrier triggered new barriers at the same opcode, those new barriers could vanish, leaving an inconsistent state and incorrect type information (which was caught by the assert).
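
To make the failure mode concrete, here is a toy Python model of that kind of reentrancy bug. It is a sketch under assumptions, not the jsinfer.cpp code or the actual patch: `BarrierTable`, `propagate`, and the detach-first fix are all hypothetical illustrations. Removing a barrier runs a callback that may add a new barrier at the same opcode, and the buggy version then clears the slot and loses it.

```python
from collections import defaultdict


class BarrierTable:
    """Toy stand-in for per-opcode type barriers (hypothetical structure)."""

    def __init__(self):
        self.barriers = defaultdict(list)  # pc -> list of pending type names

    def add(self, pc, type_name):
        self.barriers[pc].append(type_name)

    def remove_all_buggy(self, pc, on_removed):
        # Process each barrier, then clear the slot. If on_removed() adds a
        # new barrier at the same pc, the final clear silently discards it.
        for type_name in list(self.barriers[pc]):
            on_removed(self, pc, type_name)
        self.barriers[pc] = []

    def remove_all_safe(self, pc, on_removed):
        # Detach the pending list first, so barriers added reentrantly land
        # in a fresh list that nothing clears afterwards.
        pending, self.barriers[pc] = self.barriers[pc], []
        for type_name in pending:
            on_removed(self, pc, type_name)


def propagate(table, pc, type_name):
    # Removing the "int" barrier reveals a new possible type at the same pc.
    if type_name == "int":
        table.add(pc, "string")


def run(remove_method_name):
    table = BarrierTable()
    table.add(7, "int")
    getattr(table, remove_method_name)(7, propagate)
    return table.barriers[7]


buggy_result = run("remove_all_buggy")
safe_result = run("remove_all_safe")
print("buggy:", buggy_result)
print("safe: ", safe_result)
```

In the buggy variant the "string" barrier added during removal is dropped, so the table no longer reflects a type that can actually appear at that opcode. That is the same shape of inconsistency the assertion detected.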

While this page doesn't produce a crash in non-debug builds, I think this should be fixed on aurora, because bad type information can lead to generating incorrect (possibly crashing) jitcode.
Assignee: general → bhackett1024
Attachment #564436 - Flags: review?(dvander)
Attachment #564436 - Flags: approval-mozilla-aurora?
Comment on attachment 564436 [details] [diff] [review]
patch

We won't approve until it has a review. Please renominate if you want when it has a review.
Attachment #564436 - Flags: approval-mozilla-aurora?
Even with the patch, I'm still crashing when leaving Google Plus pages.
Posted patch updated
Oops, I attached the wrong version of the patch.
Attachment #564436 - Attachment is obsolete: true
Attachment #564436 - Flags: review?(dvander)
Attachment #564889 - Flags: review?(dvander)
When you hand-edit patches, you should fiddle the chunk headers appropriately too. :-)
Attachment #564889 - Flags: review?(dvander) → review+
Attachment #564889 - Flags: approval-mozilla-aurora?
Marking sg:critical due to potential for generating bad jitcode. Feel free to revise.
Whiteboard: sg:critical
Attachment #564889 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Please let this bake on mozilla-central for a few days before landing on aurora.
Whiteboard: sg:critical → [sg:critical]
https://hg.mozilla.org/mozilla-central/rev/617182196237
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Duplicate of this bug: 688165
QA to verify fix using testcase in comment 11.
Whiteboard: [sg:critical] → [sg:critical][qa+]
Group: core-security