Closed
Bug 1114079
Opened 9 years ago
Closed 9 years ago
Multiple crash signatures in JS error reporting (js::ReportOverRecursed, js::ErrorToException)
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
People
(Reporter: johnp, Assigned: jandem)
References
Details
(Keywords: crash, reproducible)
Crash Data
Attachments
(2 files, 1 obsolete file)
7.63 MB,
text/plain
|
Details | |
5.34 KB,
patch
|
bzbarsky
:
review+
lizzard
:
approval-mozilla-aurora+
lizzard
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
Experienced this startup crash with todays nightly: bp-c0b8f51b-aeb9-4aef-a755-753222141220 I'm really not sure where to place this, as there's kind of "a bit of everything" in the stacktrace. Also there are a lot of different stack traces going back some time, most ending up with NewStringDeflated<1> or NewStringDeflated<0>.
Reporter | ||
Comment 1•9 years ago
|
||
This seems to reproduce on every startup with my profile, except when restoring my session from about:sessionrestore: bp-704b3cd0-238d-4455-b899-66c022141221 bp-5e01dc1e-32b6-4ae6-b264-e14712141221 Moving to "JavaScript Engine" as the top frames look like some JS error handling code and crashing there really doesn't look nice.
Reporter | ||
Comment 2•9 years ago
|
||
Blocking bug 1107777 as the top third frame shows code added there right before this crashes started to occur.
Blocks: 1107777
(In reply to Johannes Pfrang [:johnp] from comment #1) > This seems to reproduce on every startup with my profile, except when > restoring my session from about:sessionrestore: > > bp-704b3cd0-238d-4455-b899-66c022141221 > bp-5e01dc1e-32b6-4ae6-b264-e14712141221 > > Moving to "JavaScript Engine" as the top frames look like some JS error > handling code and crashing there really doesn't look nice. So if you startup after a crash, and see the about:sessionrestore page, and click "Restore Session", does it crash? Could you try isolating if this happens due to a particular website or something in the browser by trying in a blank session and individually closing tabs? Thanks!
Bug 1053999, which added the recursion check that seems to trigger this, mentions ask.com urls being a possible culprit (see comment 29 of that bug). Do you have the ask toolbar or any tabs on ask.com?
Sorry, I didn't see the extensions list in the crash stats before. Could you try with all extensions disabled? It would be great if you could find a reliable STR.
This is most definitely an OOM, and we were not hitting it previously because the JS_IsRunning() failure would prevent js_ErrorToException() from being called. bholley, I think this is related to the change in stack limits from Bug 1053999, the error string "too much recursion" probably isn't fitting in the small amount of memory we keep around in the conservative checker after all the copying that happens in the error handler bits. But I'm just speculating.
Flags: needinfo?(bobbyholley)
Reporter | ||
Comment 7•9 years ago
|
||
I'm not sure if I can still reproduce this. Fwiw I also had one crash from restoring from about:sessionrestore and I don't use ask.com or the ask toolbar. If I encounter it again I'll definitely try to get some STR.
Comment 8•9 years ago
|
||
So you're suggesting that js_ExpandErrorArguments is OOMing when allocating the string, but not returning false? http://hg.mozilla.org/mozilla-central/annotate/490f124d7dea/js/src/jscntxt.cpp#l819 That seems reasonable, but offhand I don't see where that would be happening.
Flags: needinfo?(bobbyholley)
Reporter | ||
Comment 9•9 years ago
|
||
One month has gone by, so I'll resolve this WORKSFORME, as long as no one has objections.
Status: UNCONFIRMED → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME
Reporter | ||
Comment 10•9 years ago
|
||
Wow... four bug 1125734 crashes today and just like normal checking the crash-reports after each crash... thought this would be just another one of these, but then I saw this bugs signature in the report. Somehow this now hit me on my desktop computer and not on my laptop like all the other crashes last month. bp-04a75a3d-38ec-4cda-8f31-7323b2150126
Status: RESOLVED → UNCONFIRMED
Resolution: WORKSFORME → ---
Reporter | ||
Comment 11•9 years ago
|
||
:nsm, just trying a shot in the dark, but this has somehow been starting to crash with the same nightly as bug 1125734 and both are practically indistinguishable for me until checking the about:crashes because both happen nearly every time I'm restoring my session... could bug 1125734 probably trigger this bug (as this is something about JS Errors/Exceptions and the other one a JS crash)? Unfortunately I don't have access to that bug so I just have to guess and wait for it to be fixed if I can still reproduce it then...
Flags: needinfo?(nsm.nikhil)
Johannes, if you can reproduce this in a debugger it would be good to know the value of message in https://dxr.mozilla.org/mozilla-central/source/js/src/jsexn.cpp#574 where this is a call to JS_NewStringCopyZ(cx, message). I'm unable to reproduce this. Bill, the stack length check was changed to conservative in Bug 1053999. Do you have any ideas?
Flags: needinfo?(nsm.nikhil) → needinfo?(wmccloskey)
The crashes are always on Windows https://crash-stats.mozilla.com/report/list?product=Firefox&signature=NewStringDeflated%3Cint%3E#tab-reports and the trace does not always come from SetNewWindow(). This could be related to 1125734, but I don't know enough about the JS engine to be convinced of anything.
Not sure what's happening here. It looks like there's plenty of virtual and physical memory available, so I'd be surprised if we're OOMing I would suggest waiting until bug 1125734 is fixed before digging into this one too much. It's possible they have the same root cause. There's work happening in bug 1125734, so hopefully we shouldn't have to wait too long.
Flags: needinfo?(wmccloskey)
Reporter | ||
Comment 15•9 years ago
|
||
Well, I reproduced a crash (not sure which one) with Visual Studio Debugger attached configured with the Mozilla symbol and source servers. Unfortunately the symbols for xul.dll won't load. I'm on the latest x64 nightly, is there anything wrong with symbol generation for 64-bit xul.dll?
Reporter | ||
Comment 16•9 years ago
|
||
Ok, manually downloaded the xul.pdb, now it's resolving fine but unfortunately it's bug 1125734. I'll keep the debugger attached for my next couple browsing sessions.
Bug 1125734 got fixed. Could you try now? Thanks a lot for all your effort!
I'm not sure that fix has hit nightly yet. Maybe better to wait until tomorrow.
Reporter | ||
Comment 19•9 years ago
|
||
Yeah, I just crashed with that signature again... No crash today with NewStringDeflated<int>, but I'll keep the debugger attached for the next couple days at least and hope I hit it again.
Reporter | ||
Comment 20•9 years ago
|
||
Ahem.... I just crashes with NewStringDeflated<int> and VS debugger attached, but instead of VS the crash-reporter showed up and my debugger disattached :( bp-ef696ab8-f2a8-42f8-9847-436502150128 Any idea why that should happen?
Reporter | ||
Comment 21•9 years ago
|
||
So, I set a tracepoint on the message variable and got a 27k line log, I'll probably have to clean up a bit or mail to one of you, because that's from loading my full browser session... sneak peak: js_ErrorToException message: 0x000000c74076ce40 "too much recursion" also, prior to this happening my firefox and visual studio both hanged while the log filled itself with js_ErrorToException message: 0x000000c6e9ed4310 "py6üÆ" Just so many messages the debugger wasn't fast enough it seems.
Reporter | ||
Comment 22•9 years ago
|
||
More of that: js_ErrorToException message: 0x000000c6e9ea1950 "@\tŸ\vÇ" js_ErrorToException message: 0x000000c6e9ea1950 "ð\nŸ\vÇ" js_ErrorToException message: 0x000000c6e9ea1930 "Ð\x13Æ\vÇ" js_ErrorToException message: 0x000000c6e9ea0cf0 "PÛÆ\vÇ" js_ErrorToException message: 0x000000c6e9ea2a40 "pÜÆ\vÇ" this doesn't look like valid messages...
Reporter | ||
Comment 23•9 years ago
|
||
Here are the last ~7k log lines starting from right before the dubious "py6üÆ" messages started with most non-file:// URLs removed. If you need the complete log or if I should try to set a tracepoint elsewhere just ask ;) And just fyi, the debugger still detached and the crash-reporter showed up.
Are all these traces after 1125734? If you are using Nightly that patch isn't in there yet.
Reporter | ||
Comment 25•9 years ago
|
||
No, they're this bug. On occurrences of bug 1125734 my debugger worked and I didn't get the crash-reporter, whereas on the NewStringDeflated<int> crashes my debugger deattached and the crash-reporter showed. The first time I got a NewStringDeflated<int> today this hit me unexpected, but for the second time I added a trace-point to dump the content of the message variable, so I'd have results even when the debugger should detach again. The attachment is from that debugging session. I made sure it was this crash indeed by submitting the crash-report and checking the stack trace: bp-b74213a4-b7fd-4060-90d9-36c102150128 Fwiw I still keep the debugger attached and once got such a dubious message spam again (2k+ times), but it didn't crash this time: js_ErrorToException message: 0x000000b1fcd95a50 "à\x15²±" Do these traces look like bug 1125734? As I said I don't have access to that so I was just guessing here they would be related to this one.
(In reply to Johannes Pfrang [:johnp] from comment #25) > No, they're this bug. On occurrences of bug 1125734 my debugger worked and I > didn't get the crash-reporter, whereas on the NewStringDeflated<int> crashes > my debugger deattached and the crash-reporter showed. The first time I got a > NewStringDeflated<int> today this hit me unexpected, but for the second time > I added a trace-point to dump the content of the message variable, so I'd > have results even when the debugger should detach again. The attachment is > from that debugging session. I made sure it was this crash indeed by > submitting the crash-report and checking the stack trace: > > bp-b74213a4-b7fd-4060-90d9-36c102150128 > > Fwiw I still keep the debugger attached and once got such a dubious message > spam again (2k+ times), but it didn't crash this time: > > js_ErrorToException message: 0x000000b1fcd95a50 "à\x15²±" > > Do these traces look like bug 1125734? As I said I don't have access to that > so I was just guessing here they would be related to this one. The expectation is that Bug 1125734 is also causing these crashes, so please wait until Bug 1125734 hits nightly (so tomorrow as Bill said above), and try with tomorrow's build. That may resolve this. I agree that the strings are odd.
Reporter | ||
Comment 27•9 years ago
|
||
Still crashing with today's nightly. Unfortunately there was an error when trying to send the crash-report and my tracepoint didn't work because the xul.pdb didn't load the first time. Now the tracepoint should work again and I'll try to get it crash again. Fwiw I got a log showing the same ~200 lines of [JavaScript Error: "too much recursion" {file: "resource://gre/modules/commonjs/toolkit/loader.js -> resource://gre/modules/commonjs/sdk/system/events.js" line: 68}] with the last 8 lines only showing [JavaScript Error: "too much recursion"] and then [JavaScript Warning: "ReferenceError: reference to undefined property params.charset" {file: "chrome://browser/content/tabbrowser.xml" line: 1366}] First-chance exception at 0x00007FFD205BD28C (xul.dll) in firefox.exe: 0xC0000005: Access violation reading location 0x0000000000000128.
Reporter | ||
Comment 28•9 years ago
|
||
Just as an update: I had it crash once again with yesterdays nightly, but unfortunately the debugger wasn't attached, as it didn't crash for two days before that, and before that the crash often happened multiple times with the same session, so I figured I'd wait for the first and then catch the second in the debugger. Unfortunately that didn't happen, but instead I crashed with bug 1128527.
Reporter | ||
Comment 29•9 years ago
|
||
So, I finally got a log of this crash again. Attached is the complete 10MB log with URLs removed using "sed -r 's/http(s)?:\/\/[^ ^"]+/(URL)/g'".
Reporter | ||
Comment 30•9 years ago
|
||
So, it could be that the signature changed somehow... Here's a new crash with an additional stack frame: bp-4c1c0fd4-c65d-4072-b893-5cbaf2150217 The new frame being js::gc::CheckAllocatorState<1> in http://hg.mozilla.org/mozilla-central/annotate/09f4968d5f42/js/src/jsgcinlines.h#l459. I'm adding the crash signature here for the moment. If this new signature is unrelated I'd open a separate bug.
Crash Signature: [@ NewStringDeflated<int> ] → [@ NewStringDeflated<int> ]
[@ js::gc::CheckAllocatorState<int> ]
Reporter | ||
Comment 31•9 years ago
|
||
I've added a comment to the bug that introduced the new code we're crashing in now: Bug 1130475
Assignee | ||
Comment 32•9 years ago
|
||
Bug 1137001 has a simple testcase and I can reliably repro this in opt and debug builds. We're crashing under nsGlobalWindow::SetNewDocument -> js::ReportOverRecursed (the same recursion check as the one mentioned in comment 4). It's crashing while allocating the "too much recursion" string, in GCIfNeeded: 435 if (rt->gc.isIncrementalGCInProgress() && -> 436 cx->zone()->usage.gcBytes() > cx->zone()->threshold.gcTriggerBytes()) because cx->zone() is nullptr... (cx->compartment() is also nullptr) NI billm because of bug 1053999.
Flags: needinfo?(wmccloskey)
Assignee | ||
Updated•9 years ago
|
Status: UNCONFIRMED → NEW
Ever confirmed: true
Assignee | ||
Comment 33•9 years ago
|
||
We should probably also make js::ReportOverRecursed, js::ErrorToException and/or GCIfNeeded assert we entered a compartment, to make it easier to find these bugs.
Bobby, could you take a look at this? I don't understand how this error reporting stuff works. Something changed in bug 1107777 so that now we allocate an exception object when reporting this error. However, we don't have a compartment to allocate it in. What do we want to do here? Should we try to figure out the right compartment to use, or should we avoid allocating the exception at all?
Flags: needinfo?(wmccloskey) → needinfo?(bobbyholley)
Comment 36•9 years ago
|
||
Sorry for the delay on this. SetNewDocument should really be doing the recursion check before the AutoJSAPI is even instantiated, and just returning NS_ERROR_FAILURE in that case (with an NS_WARNING). We should probably abstract out the guts of JS_CHECK_RECURSION_LIMIT into MFBT or something, though we'd still have the annoying complication that the native stack limits are stored on the JSRuntime. All solvable problems. Long story short - no reason to call js::ReportOverRecursed on the freshly-created AutoJSAPI.
Flags: needinfo?(bobbyholley)
Reporter | ||
Updated•9 years ago
|
Crash Signature: [@ NewStringDeflated<int> ]
[@ js::gc::CheckAllocatorState<int> ] → [@ NewStringDeflated<int> ]
[@ js::gc::CheckAllocatorState<int> ]
[@ GCIfNeeded ]
Comment 37•9 years ago
|
||
The primary signature of this bug is spiking in Firefox 37.0b7, up to the #4 or even #3 position with around 2% of all crashes.
Comment 38•9 years ago
|
||
Also, those are the top URLs for this signature: 168 http://www.dinakaran.com/ 161 https://www.facebook.com/ 147 about:newtab 138 about:blank 129 http://www.amarujala.com/ All others are way lower, at 35 hits or less.
Comment 39•9 years ago
|
||
Is https://crash-stats.mozilla.com/report/list?signature=js%3A%3ANewStringCopyN%3Cint%2C+wchar_t%3E%28js%3A%3AExclusiveContext%2A%2C+wchar_t+const%2A%2C+unsigned+int%29 also the same issue?
Reporter | ||
Comment 40•9 years ago
|
||
This does not always look like the other signatures (all coming from js_ReportOverRecursed() ), but I found at least 1 crash that also has that function in it's stack: bp-241a3b5f-e7ca-4194-a50f-1d21e2150326 Also, I just crashed with yet another signature: bp-da5ed725-44c9-40d3-98cf-7e9dc2150327
Crash Signature: [@ NewStringDeflated<int> ]
[@ js::gc::CheckAllocatorState<int> ]
[@ GCIfNeeded ] → [@ NewStringDeflated<int> ]
[@ js::gc::CheckAllocatorState<int> ]
[@ GCIfNeeded ]
[@ js::NewStringCopyN<int, wchar_t>(js::ExclusiveContext*, wchar_t const*, unsigned int) ]
[@ js::gc::GCRuntime::tryNewTenuredThing<JSFatInlineString, int>(js::Exclusive…
Reporter | ||
Comment 41•9 years ago
|
||
Changing summary and setting reproducible keyword per bug 1137001.
Keywords: reproducible
Summary: Crash @ NewStringDeflated<int> → Multiple crash signatures in JS error reporting (js::ReportOverRecursed, js::ErrorToException)
Updated•9 years ago
|
OS: Windows 8.1 → Windows
Updated•9 years ago
|
status-firefox39:
--- → affected
Version: Trunk → 39 Branch
Comment 42•9 years ago
|
||
Crashed two times FF while trying to restore a session with the page "Well, this is embarrassing." -> No SessionRestore possible ATM !!! [@ js::gc::GCRuntime::tryNewTenuredThing<JSFatInlineString, int>(js::ExclusiveContext*, js::gc::AllocKind, unsigned __int64) ] https://crash-stats.mozilla.com/report/index/adadb666-69b7-4941-934f-9928d2150423 https://crash-stats.mozilla.com/report/index/643575fd-3ee7-4dff-ad03-2315e2150423
Severity: critical → blocker
Comment 43•9 years ago
|
||
FYI: SessionRestore worked with 32bit. [@ js::gc::GCRuntime::checkAllocatorState<int>(JSContext*, js::gc::AllocKind) ] https://crash-stats.mozilla.com/report/index/4523ed18-12d6-47e7-a5e9-1fac42150424 Other signature - same problem.
Crash Signature: , int>(js::ExclusiveContext*, js::gc::AllocKind, unsigned __int64) ] → , int>(js::ExclusiveContext*, js::gc::AllocKind, unsigned __int64) ]
[@ js::gc::GCRuntime::checkAllocatorState<int>(JSContext*, js::gc::AllocKind) ]
Comment 44•9 years ago
|
||
[Tracking Requested - why for this release]: The NewStringDeflated<int> signature has high volume in 38b - currently at #17 in the "Top Crashers for Firefox 38.0b" list.
status-firefox38:
--- → affected
status-firefox38.0.5:
--- → affected
tracking-firefox38:
--- → ?
tracking-firefox38.0.5:
--- → ?
Comment 45•9 years ago
|
||
Too late for 38.* but tracking for 39.
Updated•9 years ago
|
Comment 46•9 years ago
|
||
The NewStringDeflated<int> signature, which is pretty high in Firefox 38 (and 38.0.5) data, has a huge amount of hits for http://www.amarujala.com/ - is that helpful to find out what's going on here?
Assignee | ||
Comment 47•9 years ago
|
||
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #46) > has a huge amount of hits for http://www.amarujala.com/ - > is that helpful to find out what's going on here? We already know what's going on here, see comments 32-36 (assuming it's the same issue). Bobby, do you have some spare cycles to take a stab at this? Else we can find somebody else maybe.
Flags: needinfo?(bobbyholley)
Assignee | ||
Comment 48•9 years ago
|
||
(In reply to Bobby Holley (:bholley) from comment #36) > SetNewDocument should really be doing the > recursion check before the AutoJSAPI is even instantiated, and just > returning NS_ERROR_FAILURE in that case (with an NS_WARNING). We should > probably abstract out the guts of JS_CHECK_RECURSION_LIMIT into MFBT or > something, though we'd still have the annoying complication that the native > stack limits are stored on the JSRuntime. All solvable problems. I think a simple fix is to add JS_CHECK_RECURSION_CONSERVATIVE_DONT_REPORT, a combination of JS_CHECK_RECURSION_CONSERVATIVE and JS_CHECK_RECURSION_DONT_REPORT. I can give this a try tomorrow.
Flags: needinfo?(bobbyholley) → needinfo?(jdemooij)
Comment 49•9 years ago
|
||
It's getting late in release cycle for Beta 39, we're about to go into Beta 4. Can still take a fix this week if it is verified first on nightly. Do we know if 40 or 41 are affected?
Updated•9 years ago
|
Comment 51•9 years ago
|
||
Yes; I found some on 41.
Crash Signature: , int>(js::ExclusiveContext*, js::gc::AllocKind, unsigned __int64) ]
[@ js::gc::GCRuntime::checkAllocatorState<int>(JSContext*, js::gc::AllocKind) ] → , int>(js::ExclusiveContext*, js::gc::AllocKind, unsigned __int64) ]
[@ js::gc::GCRuntime::checkAllocatorState<int>(JSContext*, js::gc::AllocKind) ]
[@ js::gc::GCRuntime::tryNewTenuredThing<JSFatInlineString, int>(js::ExclusiveContext*, js::gc::AllocKin…
Crash Signature: , js::gc::AllocKind, unsigned int) ] → , js::gc::AllocKind, unsigned int) ]
[@ js::Allocate<JSFatInlineString, int>(js::ExclusiveContext*) ]
Assignee | ||
Comment 52•9 years ago
|
||
This is basically what I described in comment 48: a new macro to check for the overrecursion that doesn't report the error. Then we can return NS_ERROR_FAILURE when that happens.
Assignee: nobody → jdemooij
Status: NEW → ASSIGNED
Flags: needinfo?(jdemooij)
Attachment #8620953 -
Flags: review?(bzbarsky)
Reporter | ||
Updated•9 years ago
|
Attachment #8556006 -
Attachment is obsolete: true
Comment 53•9 years ago
|
||
Comment on attachment 8620953 [details] [diff] [review] Patch r=me
Attachment #8620953 -
Flags: review?(bzbarsky) → review+
Updated•9 years ago
|
Crash Signature: [@ NewStringDeflated<int> ]
[@ js::gc::CheckAllocatorState<int> ]
[@ GCIfNeeded ]
[@ js::NewStringCopyN<int, wchar_t>(js::ExclusiveContext*, wchar_t const*, unsigned int) ]
[@ js::gc::GCRuntime::tryNewTenuredThing<JSFatInlineString, int>(js::Exclusive… → [@ NewStringDeflated<int> ]
[@ js::gc::CheckAllocatorState<int> ]
[@ GCIfNeeded ]
[@ js::NewStringCopyN<int, wchar_t>(js::ExclusiveContext*, wchar_t const*, unsigned int) ]
[@ js::NewStringCopyN<T>(js::ExclusiveContext*, wchar_t const*, unsigned int) …
Assignee | ||
Comment 55•9 years ago
|
||
Comment on attachment 8620953 [details] [diff] [review] Patch Approval Request Comment [Feature/regressing bug #]: Bug 1107777 or bug 1053999. [User impact if declined]: Topcrash. [Describe test coverage new/current, TreeHerder]: On m-i. [Risks and why]: Low risk. [String/UUID change made/needed]: None.
Attachment #8620953 -
Flags: approval-mozilla-beta?
Attachment #8620953 -
Flags: approval-mozilla-aurora?
Comment 56•9 years ago
|
||
Waiting for this to be green on m-c before uplift to 39 and 40.
Comment 57•9 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/4b42c4d4e4a7
Status: ASSIGNED → RESOLVED
Closed: 9 years ago → 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
Comment 58•9 years ago
|
||
Comment on attachment 8620953 [details] [diff] [review] Patch Approved for uplift to aurora and beta. May help fix a startup crash.
Attachment #8620953 -
Flags: approval-mozilla-beta?
Attachment #8620953 -
Flags: approval-mozilla-beta+
Attachment #8620953 -
Flags: approval-mozilla-aurora?
Attachment #8620953 -
Flags: approval-mozilla-aurora+
You need to log in
before you can comment on or make changes to this bug.
Description
•