Closed Bug 615041 Opened 15 years ago Closed 12 years ago

Crash [@ js::MarkRuntime ]

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME
Tracking Status
blocking2.0 --- final+

People

(Reporter: scoobidiver, Unassigned)

References

Details

(Keywords: crash, qawanted, regression, Whiteboard: softblocker)

Crash Data

Attachments

(1 file)

It is a new crash signature that exists in 4.0b7 and 4.0b8pre builds. It happens at startup on Linux. It is the #10 top crasher on Linux in 4.0b8pre for the last week.

Signature js::MarkRuntime
UUID e64d8b52-33dc-4b33-9d5d-eae4a2101126
Time 2010-11-26 04:55:01.387114
Uptime 3
Last Crash 6 seconds before submission
Install Age 10 seconds since version was first installed.
Product Firefox
Version 4.0b8pre
Build ID 20101126030319
Branch 2.0
OS Linux
OS Version 0.0.0 Linux 2.6.27-67vl5 #1 SMP Tue Nov 2 15:59:14 JST 2010 i686
CPU x86
CPU Info GenuineIntel family 6 model 14 stepping 8
Crash Reason SIGBUS
Crash Address 0xbfaae000
User Comments
Processor Notes
EMCheckCompatibility False

Crashing Thread
Frame Module Signature Source
0 libxul.so js::MarkRuntime jsgc.cpp:681
1 libxul.so js_GC jsgc.cpp:2139
2 libxul.so JS_GC jsapi.cpp:2503
3 libxul.so nsJSContext::ScriptEvaluated nsJSEnvironment.cpp:890
4 libxul.so nsJSContext::ExecuteScript nsJSEnvironment.cpp:1923
5 libxul.so nsXULDocument::ExecuteScript nsXULDocument.cpp:3622
6 libxul.so nsXULDocument::ExecuteScript nsXULDocument.cpp:3645
7 libxul.so nsXULDocument::ResumeWalk nsXULDocument.cpp:2995
8 libxul.so nsXULDocument::OnStreamComplete nsXULDocument.cpp:3576
9 libxul.so nsStreamLoader::OnStopRequest nsStreamLoader.cpp:125
10 libxul.so nsJARChannel::OnStopRequest nsJARChannel.cpp:906
11 libxul.so nsInputStreamPump::OnStateStop nsInputStreamPump.cpp:578
12 libxul.so nsInputStreamPump::OnInputStreamReady nsInputStreamPump.cpp:403
13 @0xb76cc26f
14 libxul.so nsThread::ProcessNextEvent nsThread.cpp:626
15 libxul.so NS_ProcessNextEvent_P nsThreadUtils.cpp:250
16 libxul.so mozilla::ipc::MessagePump::Run MessagePump.cpp:110
17 libxul.so MessageLoop::RunInternal message_loop.cc:219
18 libxul.so MessageLoop::Run message_loop.cc:202
19 libxul.so nsBaseAppShell::Run nsBaseAppShell.cpp:181
20 libxul.so nsAppStartup::Run nsAppStartup.cpp:191
21 @0xb76cc26f
22 libxul.so XRE_main nsAppRunner.cpp:3691
23 firefox-bin main browser/app/nsBrowserApp.cpp:158
24 libc-2.8.so libc-2.8.so@0x1659f
25 firefox-bin firefox-bin@0x1390
26 firefox-bin Output browser/app/nsBrowserApp.cpp:77
27 @0x0

More reports at: http://crash-stats.mozilla.com/report/list?range_value=4&range_unit=weeks&signature=js%3A%3AMarkRuntime

Severity: critical → blocker
blocking2.0: --- → ?
It is the #17 top crasher on Linux in 4.0b9 for the last week. Comments say:
"With WebSockets enabled and using noVNC (https://github.com/kanaka/noVNC), the second time connecting to a server caused an immediate crash."
"using keyboard to interact with webgl app"
Assignee: general → anygregor
seems to involve the conservative stack scanner during startup?
We should not block on this until we have STR.
Keywords: qawanted
Some comments from crash stats:
"With WebSockets enabled and using noVNC (https://github.com/kanaka/noVNC), the second time connecting to a server caused an immediate crash."
"using keyboard to interact with webgl app"
blocking2.0: ? → final+
Whiteboard: softblocker

I've seen this message in Valgrind at startup. It might be related.

==6063== Invalid read of size 8
==6063== at 0x72B000D: js::MarkRangeConservatively(JSTracer*, unsigned long const*, unsigned long const*) (jsgc.cpp:721)
==6063== by 0x72B00D1: js::MarkThreadDataConservatively(JSTracer*, JSThreadData*) (jsgc.cpp:738)
==6063== by 0x72B0205: js::MarkConservativeStackRoots(JSTracer*) (jsgc.cpp:771)
==6063== by 0x72B1BB8: js::MarkRuntime(JSTracer*) (jsgc.cpp:1604)
==6063== by 0x72B3016: MarkAndSweep(JSContext*, JSGCInvocationKind) (jsgc.cpp:2160)
==6063== by 0x72B3C0D: GCUntilDone(JSContext*, JSGCInvocationKind) (jsgc.cpp:2489)
==6063== by 0x72B3DAB: js_GC(JSContext*, JSGCInvocationKind) (jsgc.cpp:2554)
==6063== by 0x7227D3E: JS_GC (jsapi.cpp:2556)
==6063== by 0x6716E78: nsXPConnect::Collect() (nsXPConnect.cpp:407)
==6063== by 0x6716F00: nsXPConnect::GarbageCollect() (nsXPConnect.cpp:415)
==6063== by 0x617FCD0: nsJSContext::CC(nsICycleCollectorListener*) (nsJSEnvironment.cpp:3644)
==6063== by 0x617FEDD: nsJSContext::IntervalCC() (nsJSEnvironment.cpp:3749)
==6063== Address 0x7feff8e30 is not stack'd, malloc'd or (recently) free'd

(In reply to comment #5)
> I've seen this message in Valgrind at startup. It might be related.

With --enable-valgrind?

(In reply to comment #6)
> With --enable-valgrind?

Yes. Also, it's an invalid read, not an uninitialized read. So it's reading from unmapped memory. Also see bug 626064. It seems like, if we get unlucky, the invalid read could be to an unmapped page. And then we'd crash. I don't really understand how the stack scanner works. How does it decide where to start and stop scanning?

Summary: startup crash [@ js::MarkRuntime ] → Crash [@ js::MarkRuntime ]
Maybe that's related to bug 578233?

I see this with valgrind right now:

vex amd64->IR: unhandled instruction bytes: 0xF 0x2 0x4D 0x89 0xE7 0x4C
==12121== valgrind: Unrecognised instruction at address 0x254146cc.
==12121== Your program just tried to execute an instruction that Valgrind
==12121== did not recognise. There are two possible reasons for this.

(In reply to comment #9)
> vex amd64->IR: unhandled instruction bytes: 0xF 0x2 0x4D 0x89 0xE7 0x4C

Does --smc-check=all help?

(In reply to comment #10)
> (In reply to comment #9)
> > vex amd64->IR: unhandled instruction bytes: 0xF 0x2 0x4D 0x89 0xE7 0x4C
>
> Does --smc-check=all help?

Yes, works fine now. Thx!

(In reply to comment #0)
> More reports at:
> http://crash-stats.mozilla.com/report/list?range_value=4&range_unit=weeks&signature=js%3A%3AMarkRuntime

All of these failed with SIGBUS, which is relatively unusual. Also, most of the faulting addresses are a handful of pages below 0xC0000000 (3GB - epsilon), which is probably the main thread's stack.

(In reply to comment #7)
> I don't really understand how the stack scanner
> works. How does it decide where to start and stop scanning?

Yeah. It's a good question. On Linux and OSX we ask libpthread where the upper end of the stack is, in GetNativeStackBaseImpl (see bug 608526) and that seems OK to me. For the lower end we appear to use ConservativeGCThreadData::recordStackTop, which simply takes the address of a local variable. I'd feel more comfortable if there were some comments about how it accounts for / interacts with the x86_64-ELF mandated redzone area (-128(%rsp) .. -1(%rsp)).

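To make the two ends described above concrete, here is a minimal C++ sketch. It is not the SpiderMonkey code: it assumes Linux with glibc, where pthread_getattr_np and pthread_attr_getstack report the calling thread's stack region, and the names StackBounds, queryStackBounds, approximateStackPointer and withinBounds are hypothetical. The high end comes from libpthread; the low end is approximated by the address of a local variable, which is the technique the comment attributes to recordStackTop.

```cpp
#include <pthread.h>  // pthread_getattr_np is a glibc extension (assumption: Linux/glibc)
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper type: the stack region as reported by libpthread.
struct StackBounds {
    uintptr_t low;   // lowest address of the region
    uintptr_t high;  // one past the highest address (the "upper end")
    bool valid;      // false if libpthread could not report the region
};

// Ask libpthread for the calling thread's stack region.
StackBounds queryStackBounds() {
    StackBounds b = {0, 0, false};
    pthread_attr_t attr;
    if (pthread_getattr_np(pthread_self(), &attr) != 0)
        return b;
    void* addr = nullptr;
    size_t size = 0;
    if (pthread_attr_getstack(&attr, &addr, &size) == 0) {
        b.low = reinterpret_cast<uintptr_t>(addr);  // getstack returns the lowest address
        b.high = b.low + size;
        b.valid = true;
    }
    pthread_attr_destroy(&attr);
    return b;
}

// Approximate the current low end of the live stack by taking the
// address of a local variable. Anything the compiler keeps below this
// address (for example the x86_64 red zone) is NOT covered.
uintptr_t approximateStackPointer() {
    volatile char marker = 0;
    return reinterpret_cast<uintptr_t>(&marker);
}

// A conservative scanner should only touch words inside [low, high).
bool withinBounds(const StackBounds& b, uintptr_t p) {
    assert(b.low <= b.high);
    return b.valid && p >= b.low && p < b.high;
}
```

A scanner built on these helpers would walk from approximateStackPointer() up to queryStackBounds().high. The sketch says nothing about whether every page in that range is actually mapped, which is the failure discussed in this bug.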

(In reply to comment #12)
> Also, most of the faulting addresses are a handful of pages
> below 0xC0000000 (3GB - epsilon), which is probably the
> main thread's stack.

... in which case it would be useful to know the value of %esp or %rsp at the faulting instruction. Is that info available in the crash dumps? I couldn't see it.

Severity: blocker → major
Having looked at a couple of the crashes in more detail, it's clear something suspicious is happening at the top of the main thread's stack (high-addressed end). The stack scanner is running into a page which is unmapped, and beyond that is another mapped page, so the stack has been split. This is very similar, although not identical, to what happened in bug 608526.
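One way to check for the "split stack" condition described above is to look for a hole in the process's memory map between the scanner's start and end addresses. The following C++ sketch assumes the Linux /proc/<pid>/maps text format, where each line begins with "start-end" in hexadecimal. The names Mapping, parseMaps and findFirstUnmapped are hypothetical, and the addresses in the usage note are invented for illustration, not taken from any crash report.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <sstream>
#include <string>
#include <vector>

// One mapped region, [start, end).
struct Mapping {
    uint64_t start;
    uint64_t end;
};

// Parse text in /proc/<pid>/maps format; lines that do not begin with
// "hexstart-hexend" are ignored. The result is sorted by start address.
std::vector<Mapping> parseMaps(const std::string& text) {
    std::vector<Mapping> out;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        unsigned long long s = 0, e = 0;
        if (std::sscanf(line.c_str(), "%llx-%llx", &s, &e) == 2 && s < e)
            out.push_back({s, e});
    }
    std::sort(out.begin(), out.end(),
              [](const Mapping& a, const Mapping& b) { return a.start < b.start; });
    return out;
}

// Return true and store the first unmapped address in *hole if any byte
// of [lo, hi) is not covered by the (sorted) mappings.
bool findFirstUnmapped(const std::vector<Mapping>& maps,
                       uint64_t lo, uint64_t hi, uint64_t* hole) {
    assert(lo <= hi);
    uint64_t cursor = lo;
    for (const Mapping& m : maps) {
        if (cursor >= hi) break;         // whole range already covered
        if (m.end <= cursor) continue;   // region lies entirely below the cursor
        if (m.start > cursor) break;     // gap begins at the cursor
        cursor = m.end;                  // region covers the cursor; advance
    }
    if (cursor < hi) {
        *hole = cursor;
        return true;
    }
    return false;
}
```

For example, given two made-up regions 00010000-00012000 and 00013000-00015000, scanning the range 0x10000 to 0x15000 reports a hole at 0x12000: a mapped page, then an unmapped page, then a mapped page again.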

(In reply to comment #15)
> Having looked at a couple of the crashes in more detail, it's clear
> something suspicious is happening at the top of the main thread's
> stack (high-addressed end). The stack scanner is running into a page
> which is unmapped, and beyond that is another mapped page, so the
> stack has been split. This is very similar, although not identical,
> to what happened in bug 608526.

That sounds bad. Julian, do you think it would help to record the start address in the main function? I think you suggested that somewhere.

(In reply to comment #16)
> That sounds bad. Julian, do you think it would help to record the start address
> in the main function? I think you suggested that somewhere.

Post-4.0 I think that would be a good thing to do. However for the moment I'd rather figure out how to reproduce the failure, and work around it with the current code if possible.

Happens to me almost every time I start Fx4 at work. No idea what triggers it though, as I have almost the same config at home and it doesn't happen there :S

(In reply to comment #5)
> I've seen this message in Valgrind at startup. It might be related.

I don't think this is related. The message pertains to badness going on at the lower-addressed end of the stack, whereas all the failures here have to do with badness at the high-addressed end. I can't figure out how to reproduce the unmapped-page-at-top-of-stack phenomenon. What we need is someone who can reliably reproduce it and can strace the run, so we can at least confirm that the missing page is missing because something unmapped it.

Which commands and packages are needed to strace it? I could try to reproduce it, but I don't know how to use strace.
Sander, strace is easy to run. Just prefix your command with 'strace'.
Well, I seem to have some problems with strace. When I started Firefox with strace, it ran for some time (under 1 minute) and then the whole X session froze. I killed strace and copied the last few lines from it. There were a LOT of similar lines. Don't know if it helps, but that's the best I can get out of it. :/
Does this crash still happen with beta 11?
Yes. :(
Or maybe not; I was thinking it was the same bug. But now, looking at about:crashes, I see that the startup crash bug has been replaced with this one: https://bugzilla.mozilla.org/show_bug.cgi?id=635163 Maybe related?

(In reply to comment 23)
> Does this crash still happen with beta 11?

Only one crash in 4.0b11, on Mac OS X, with a different stack trace: bp-f2f4c340-77c6-4575-8ee5-56f752110213
I am closing it as WORKSFORME.

Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → WORKSFORME

I reopened it as it shows up as the #11 top crasher on Linux in 4.0. Stack traces now look like:

0 libxul.so js::MarkRuntime js/src/jsgc.cpp:710
1 libxul.so js_GC js/src/jsgc.cpp:2411
2 libxul.so JS_GC js/src/jsapi.cpp:2662
3 libxul.so nsXPConnect::Collect js/src/xpconnect/src/nsXPConnect.cpp:405
4 libxul.so nsXPConnect::GarbageCollect js/src/xpconnect/src/nsXPConnect.cpp:413
5 libxul.so nsJSContext::GarbageCollectNow dom/base/nsJSEnvironment.cpp:3271
6 libxul.so nsTimerImpl::Fire xpcom/threads/nsTimerImpl.cpp:425
7 libxul.so nsTimerEvent::Run xpcom/threads/nsTimerImpl.cpp:517
8 libxul.so nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:633

Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Crash Signature: [@ js::MarkRuntime ]

I've seen a number of these crashes with FF4.0.1 for no obvious reason, and not on any particular page or action. This is an example: https://crash-stats.mozilla.com/report/index/ad1ba701-b5f6-45ac-8e9f-b47842110616

Although this bug is not marked as solved, it has disappeared since I upgraded to Firefox 5.0 (latest release). Is the bug no longer applicable, or was the code changed for another reason? No crash after 5 days of the same pattern of Firefox use as before, under which Firefox 4.0.x crashed several times in the same day.

Mandriva Linux release 2010.2 (Official) for i586
Linux kernel: 2.6.36.2-desktop-2mnb

Here are the crash reports I submitted from FF 4.0 or 4.0.1 crashes (my computer was running Mandriva Linux 2010.1 with a custom 2.6.x kernel at that time):
bp-16348855-ba45-4227-a2f0-36a752110422
bp-e974857e-4b7b-4e55-9ff9-3ec6a2110422
bp-906ffa51-8537-4fe9-b69b-96b9a2110420
bp-ada51d70-9312-4704-a886-415c52110420
bp-54f6c6c3-b479-41c4-b67a-30b472110420
bp-8240af40-447f-4941-babe-466d22110417
bp-bf6bc5da-a4f0-4558-b4ed-97fbd2110416
bp-5b0ccb44-63bf-4904-9204-bb2b32110416
bp-70dcd4ed-d3bb-454b-a228-92ed52110416
bp-2aa0cbfe-f672-4694-b940-768a12110416
bp-d4a0fa4d-5cce-4a41-9ab4-a7a3c2110415
bp-7fe5ec92-08a6-494a-9559-356a02110415
bp-d4407f89-a353-4398-a36f-acb342110413
bp-064dc141-a2b0-481d-98c8-3ba742110412
bp-bb50a840-1690-47be-8639-02dcd2110408
bp-537c1b31-46c9-40b0-b394-2c4342110407
bp-960b9e53-8f7a-435c-b1f1-8186c2110406

I also had one crash in TB5. It happened right after upgrading from TB3.1.11, when TB5 was started for the very first time. https://crash-stats.mozilla.com/report/index/bp-b30ff970-ad63-483b-82c1-425522110628 It has not crashed since.

I don't have time to look into this right now. Dave, you might want to assign this to somebody else.
Assignee: anygregor → general

(In reply to comment #34)
> I've had seven crashes today within the space of 10 minutes
>
> http://crash-stats.mozilla.com/report/index/bp-9efb2d42-0f78-4a0b-9b5b-ff6712110717
> http://crash-stats.mozilla.com/report/index/bp-fe62288d-2bab-4f0d-85f5-8978f2110717
> http://crash-stats.mozilla.com/report/index/bp-9856eae2-28ae-426c-9fc1-048712110717
> http://crash-stats.mozilla.com/report/index/bp-34af8687-87ef-41a7-85ca-71d162110717
> http://crash-stats.mozilla.com/report/index/bp-35fa49f2-d563-4ec9-940c-224452110717
> http://crash-stats.mozilla.com/report/index/bp-85ddbce5-0420-413c-965a-6283e2110717
> http://crash-stats.mozilla.com/report/index/bp-8c1be289-05e5-4015-826e-7011b2110717
>
> All happened when I was asked for the master password to log-on to a site.
> If I can provide any more information let me know.

Can you still reproduce it? In comment 19, Julian said it would help to have the output from strace for a crashing run.

I just tried multiple times to reproduce it and can't get it to crash. It seems very intermittent, but when it does start, it crashes multiple times.

Has this bug been forgotten? If so, at least unmark it as blocking final.
Severity: major → critical
Crash Signature: [@ js::MarkRuntime ] → [@ js::MarkRuntime]
OS: Linux → All
Hardware: x86 → All
Chris, surely your other bugs - bug 679537, bug 680586, bug 693218 - are related or duplicates? Chris' most recent crash is bp-7b14eed8-bbf4-4b8a-9aa2-93aca2120802
Since they're for much older versions I would say they can be closed unless someone running that version is still experiencing the crash.

(In reply to Chris from comment #40)
> Duplicate of this bug: 679537

From bug 679537: http://crash-stats.mozilla.com/report/index/bp-5105fc45-a2d5-45f1-9d4b-3e0d82110818
"was in the new, empty profile in safe mode and trying to set a master password."

There have been no crashes for the last four weeks after 19.0.2.
Status: REOPENED → RESOLVED
Closed: 14 years ago → 12 years ago
Resolution: --- → WORKSFORME