Closed Bug 503772 Opened 15 years ago Closed 13 years ago

Crash [@ js_TraceObject]

Categories

(Core :: JavaScript Engine, defect)

1.9.1 Branch
defect
Not set
critical

Tracking

RESOLVED WORKSFORME
Tracking Status
blocking2.0 --- -
status1.9.2 --- ?
blocking1.9.1 --- -
status1.9.1 --- wanted

People

(Reporter: gkw, Unassigned)

References

Details

(Keywords: crash, topcrash, Whiteboard: [needs STR or input on whether the stack trace is useful in fixing the issue][crashkill])

Crash Data

Topcrash at #27 for the past week, for the query "Results within 1 weeks of now, and the product is one of Firefox, and the version is one of Firefox:3.5." It occurs on all platforms with the 3.5 final release.

http://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A3.5&query_search=signature&query_type=exact&query=&date=&range_value=1&range_unit=weeks&do_query=1&signature=js_TraceObject

It would be helpful to have a list of the URLs that crash at this location.
Flags: blocking1.9.1.1?
blocking1.9.1: --- → -
Flags: wanted1.9.1.x+
Flags: blocking1.9.1.1?
Flags: blocking1.9.1.1-
Group: core-security
Whiteboard: [needs STR or input on whether the stack trace is useful in fixing the issue]
Whiteboard: [needs STR or input on whether the stack trace is useful in fixing the issue] → [needs STR or input on whether the stack trace is useful in fixing the issue][sg:investigate]
Flags: wanted1.9.1.x+
I have a user in the German channel who hits this problem regularly. I will try to get some more information from him.
Group: core-security
(In reply to comment #2)
> I have a user in the German channel which has this problem regularly. I will
> try to get some more information from him.

Whimboo, any update on this with information that we could use to create a testcase?
Sadly not. I haven't written down his name or email address. I never got feedback.
Whiteboard: [needs STR or input on whether the stack trace is useful in fixing the issue][sg:investigate] → [needs STR or input on whether the stack trace is useful in fixing the issue][sg:investigate][crashkill]
(In reply to comment #4)
> Sadly not. I haven't written down his name or email address. I never got
> feedback.

That's OK. I will try to find steps to reproduce.
bc: do we need a new list here?
Depends on: 528325
Never nominated, but marking blocking1.9.2- to explicitly mark [CrashKill] bugs as either blocking or not. If we can get a patch before RC, we should really consider taking it.
blocking2.0: --- → ?
Flags: blocking1.9.2-
I just crashed in this stack a moment ago, but all I remember is that I was typing rapidly trying to login to a wiki page. http://crash-stats.mozilla.com/report/index/bp-303829d9-d51d-409b-97b7-9b8162091214 is my report ID and I will keep my eye out for another crash.
Ranked #18 in 3.5.6 and #11 in 3.6b5. This might be a good one for helping to calibrate progress of crashkill efforts. It's pretty uniform at 4.5-5.5 crashes per thousand reports across all releases.

checking --- 20091221-crashdata.csv js_TraceObject

release   total-crashes   js_TraceObject crashes   fraction of total
all       226457          1032                     0.00455716
3.0.15    6122            22                       0.0035936
3.0.16    33800           148                      0.0043787
3.5.5     16341           71                       0.0043449
3.5.6     113209          499                      0.00440778
3.6b5     18468           103                      0.00557721
3.6b4     4201            19                       0.00452273
3.6b3     614             3                        0.00488599
3.6b2     662             2                        0.00302115
3.6b1     2011            10                       0.00497265
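For anyone re-running these numbers against a newer crash-data CSV, the last column is simply signature crashes divided by total crashes for that release. A minimal sketch follows; the function names are hypothetical and are not part of any crash-stats tooling.

```cpp
#include <cassert>
#include <cmath>

// Fraction of all crash reports for a release that carry a given signature.
// Returns 0.0 when there are no reports, to avoid dividing by zero.
double signatureFraction(long signatureCrashes, long totalCrashes) {
    if (totalCrashes <= 0) return 0.0;
    return static_cast<double>(signatureCrashes) /
           static_cast<double>(totalCrashes);
}

// The same rate expressed as crashes per thousand reports.
double perThousand(long signatureCrashes, long totalCrashes) {
    return 1000.0 * signatureFraction(signatureCrashes, totalCrashes);
}
```

Called with the "all" row (1032 of 226457), signatureFraction gives roughly 0.00456, which is about 4.6 crashes per thousand reports.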
Flags: wanted1.9.0.x+
blocking2.0: ? → final
I was able to reproduce this crash 100% using the latest trunk build on Mac:

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.3a1pre) Gecko/20100202 Minefield/3.7a1pre

STR:
1. Visit http://my.ebay.it/ws/eBayISAPI.dll?MyEbay&gbh=1&CurrentPage=MyeBayWon
2. Crash

https://crash-stats.mozilla.com/report/index/0279bcbf-9ebb-4ccd-90a5-643592100202
If the page doesn't crash right away, click on the registration button and that will trigger the crash.
Assignee: general → dmandelin
Depends on: 544160
I spun off comment 10 as bug 544160. Solving that one gives a tiny bit more insight on this bug (js_TraceObject as a topcrash).

Bug 544160 is basically a memory corruption bug, where we write where we shouldn't, and then crash in GC later. That bug affects only tip, so it cannot be the cause of any of these crashes in 3.0/3.5/3.6. Thus, the topcrash with this signature is probably a collection of many invalid-write bugs that cause GC crashes, which then show up as a frequent crash with this signature.

I did get a couple of random ideas that might help us find these memory corruption bugs:
- Run browsers under valgrind a lot. (That's how I solved bug 544160.)
- Prepare a browser where we write-protect JS pages when we are not running in JS, and then test that a bunch. It should crash a lot faster, and also give a crash stack that is the original invalid write, rather than an uninformative GC stack.
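The write-protection idea can be sketched on POSIX with mmap and mprotect. This is only an illustration under stated assumptions, not SpiderMonkey code: GuardedPage, AutoEnterJS and strayWriteIsTrapped are hypothetical names, and a single anonymous page stands in for the GC heap. The page is writable only while an AutoEnterJS scope is alive, so a stray write from anywhere else faults at the offending instruction instead of surfacing later in GC.

```cpp
#include <cassert>
#include <csignal>
#include <cstddef>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for a GC heap page; starts out read-only.
class GuardedPage {
public:
    GuardedPage() : size_(static_cast<size_t>(sysconf(_SC_PAGESIZE))) {
        void* p = mmap(nullptr, size_, PROT_READ,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        base_ = (p == MAP_FAILED) ? nullptr : static_cast<unsigned char*>(p);
    }
    ~GuardedPage() { if (base_) munmap(base_, size_); }
    GuardedPage(const GuardedPage&) = delete;
    GuardedPage& operator=(const GuardedPage&) = delete;

    bool ok() const { return base_ != nullptr; }
    unsigned char* data() const { return base_; }
    bool allowWrites() { return mprotect(base_, size_, PROT_READ | PROT_WRITE) == 0; }
    bool forbidWrites() { return mprotect(base_, size_, PROT_READ) == 0; }

private:
    size_t size_;
    unsigned char* base_;
};

// Hypothetical RAII marker for "we are running JS": writes are legal
// only for the lifetime of this object.
class AutoEnterJS {
public:
    explicit AutoEnterJS(GuardedPage& page) : page_(page) { page_.allowWrites(); }
    ~AutoEnterJS() { page_.forbidWrites(); }
private:
    GuardedPage& page_;
};

// Performs a stray write in a forked child and reports whether the kernel
// stopped it with a fault signal (SIGSEGV on Linux, SIGBUS on some systems).
bool strayWriteIsTrapped(GuardedPage& page) {
    pid_t pid = fork();
    if (pid < 0) return false;
    if (pid == 0) {
        volatile unsigned char* p = page.data();
        p[0] = 0xFF;  // invalid write: no AutoEnterJS scope is active
        _exit(0);     // only reached if the write was not trapped
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return false;
    return WIFSIGNALED(status) &&
           (WTERMSIG(status) == SIGSEGV || WTERMSIG(status) == SIGBUS);
}
```

A real implementation would have to cover every heap page and every entry point into the engine, and the mprotect calls add cost on each transition, so this looks more suitable for a dedicated test build than for release builds.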
Assignee: dmandelin → general
Re: random ideas:
- It seems like having a suite of tests around garbage collection might help us diagnose a number of GC bugs as well. Do we have such tests, or could we construct a set that exercises GC under a number of conditions and states?
- Maybe a meta bug or some flagging of all the GC bugs, plus some code review, would help us figure out what we might build test cases around.

Just trying to run the URLs that people were on when GC kicked in seems to be unproductive for figuring out steps to reproduce.
Whiteboard: [needs STR or input on whether the stack trace is useful in fixing the issue][sg:investigate][crashkill] → [needs STR or input on whether the stack trace is useful in fixing the issue][crashkill]
My daughter can cause a failure within an hour. This is Firefox-3.6.3 on KUbuntu Lucid, latest bits as of 6/7/2010. I got it under gdb and looked around. From my analysis it looks like the object's classword is OK, but the local pointer is hosed. What it points to is junk, and the junk value (which should be a function pointer) is indeed where I found the PC when it died. So somehow in this line of instructions we went from a good class pointer in obj to a bad class pointer in the local variable.

Bad news is it smells like one of those hard-to-find bugs. Good news is we can cause it to happen here in pretty short order, and you have a willing hand here to field your debug requests.

    5677    /* No one runs while the GC is running, so we can use LOCKED_... here. */
    5678    JSClass *clasp = obj->getClass();
    5679    if (clasp->mark) {
    5680        if (clasp->flags & JSCLASS_MARK_IS_TRACE)
    5681            ((JSTraceOp) clasp->mark)(trc, obj);
    5682        else if (IS_GC_MARKING_TRACER(trc))
    5683            (void) clasp->mark(cx, obj, trc);
    5684    }
    5685

    (gdb) p *obj
    $34 = {map = 0xb53c9a80, classword = 3086503458, fslots = {-1491617056, -1384822528, -1443671952, 22, 22}, dslots = 0x0}
    (gdb) p/x 3086503458
    $35 = 0xb7f84e22
    (gdb) p *(JSClass *)0xb7f84e20
    $36 = {name = 0xb7d396f5 "XPCWrappedNative_NoHelper", flags = 590089, addProperty = 0xb73a3cf7 <XPC_WN_OnlyIWrite_PropertyStub>, delProperty = 0xb73a3ce0 <XPC_WN_CannotModifyPropertyStub>, getProperty = 0xb70818c0 <JS_PropertyStub>, setProperty = 0xb73a3cf7 <XPC_WN_OnlyIWrite_PropertyStub>, enumerate = 0xb73a5c2b <XPC_WN_Shared_Enumerate>, resolve = 0xb73a4945 <XPC_WN_NoHelper_Resolve>, convert = 0xb73a5a31 <XPC_WN_Shared_Convert>, finalize = 0xb73a59d5 <XPC_WN_NoHelper_Finalize>, getObjectOps = 0xb73a3b3c <XPC_WN_GetObjectOpsNoCall(JSContext*, JSClass*)>, checkAccess = 0, call = 0, construct = 0, xdrObject = 0, hasInstance = 0, mark = 0xb73a72e1 <XPC_WN_Shared_Trace>, reserveSlots = 0}
    (gdb) p clasp
    $37 = (JSClass *) 0xb7fc4e20
    (gdb) p *clasp
    $38 = {name = 0x18c48310 <Address 0x18c48310 out of bounds>, flags = 1583071281, addProperty = 0x90c35d5f, delProperty = 0x26748d, getProperty = 0x57e58955, setProperty = 0x5356d789, enumerate = 0x893cec83, resolve = 0x28bd445, convert = 0xffb402e8, finalize = 0xafc381ff, getObjectOps = 0x890000c1, checkAccess = 0x8b042444, call = 0x489d445, construct = 0xd826e824, xdrObject = 0xf883ffff, hasInstance = 0x89217604, mark = 0xff31243c, reserveSlots = 0xffe4f7e8}

Backtrace:

    #0  0xff31243c in ?? ()
    #1  0xb70ec52c in js_TraceObject (trc=0xbfffad6c, obj=0xa717bb00) at jsobj.cpp:5681
    #2  0xb70c7a1e in JS_TraceChildren (trc=0xa717bb00, thing=0xbfffad6c, kind=0) at jsgc.cpp:2384
    #3  0xb737dddb in nsXPConnect::Traverse (this=0xb5c1fb30, p=0xa717bb00, cb=...) at nsXPConnect.cpp:914
    #4  0xb7c3d8d0 in GCGraphBuilder::Traverse (this=0xbfffaee8, aPtrInfo=0xb2905714) at nsCycleCollector.cpp:1389
    #5  0xb7c3d948 in nsCycleCollector::MarkRoots (this=0xb5c2f000, builder=...) at nsCycleCollector.cpp:1611
    #6  0xb7c3e10f in nsCycleCollector::BeginCollection (this=0xb5c2f000) at nsCycleCollector.cpp:2554
    #7  0xb7c3e174 in nsCycleCollector_beginCollection () at nsCycleCollector.cpp:3141
    #8  0xb737e5ac in XPCCycleCollectGCCallback (cx=0xb5c13800, status=JSGC_MARK_END) at nsXPConnect.cpp:390
    #9  0xb70c9ca1 in js_GC (cx=0xb5c13800, gckind=GC_NORMAL) at jsgc.cpp:3537
    #10 0xb708525b in JS_GC (cx=0xb5c13800) at jsapi.cpp:2439
    #11 0xb737df10 in nsXPConnect::Collect (this=0xb5c1fb30) at nsXPConnect.cpp:477
    #12 0xb7c3e25e in nsCycleCollector::Collect (this=0xb5c2f000, aTryCollections=1) at nsCycleCollector.cpp:2434
    #13 0xb7c3e3bd in nsCycleCollector_collect () at nsCycleCollector.cpp:3129
    #14 0xb779fd82 in nsJSContext::CC () at nsJSEnvironment.cpp:3578
    #15 0xb779fdea in nsJSContext::IntervalCC () at nsJSEnvironment.cpp:3666
    #16 0xb769d6d2 in nsXMLHttpRequest::RequestCompleted (this=0xa9ca2d00) at nsXMLHttpRequest.cpp:2203
    #17 0xb769d8e2 in nsXMLHttpRequest::OnStopRequest (this=0xa9ca2d00, request=0xafc8519c, ctxt=0x0, status=0) at nsXMLHttpRequest.cpp:2146
    #18 0xb764b3f1 in nsCrossSiteListenerProxy::OnStopRequest (this=0xacc63880, aRequest=0xafc8519c, aContext=0x0, aStatusCode=0) at nsCrossSiteListenerProxy.cpp:334
    #19 0xb7402c6b in nsStreamListenerTee::OnStopRequest (this=0xa9b400c0, request=0xafc8519c, context=0x0, status=0) at nsStreamListenerTee.cpp:72
    #20 0xb745a545 in nsHttpChannel::OnStopRequest (this=0xafc85170, request=0xb2e271f0, ctxt=0x0, status=0) at nsHttpChannel.cpp:5309
    #21 0xb73ecdd2 in nsInputStreamPump::OnStateStop (this=0xb2e271f0) at nsInputStreamPump.cpp:576
    #22 0xb73ed08f in nsInputStreamPump::OnInputStreamReady (this=0xb2e271f0, stream=0xa4e873e8) at nsInputStreamPump.cpp:401
    #23 0xb7c1d37f in nsInputStreamReadyEvent::Run (this=0xa73e7d00) at nsStreamUtils.cpp:112
    #24 0xb7c3234c in nsThread::ProcessNextEvent (this=0xb5ca1790, mayWait=1, result=0xbffff29c) at nsThread.cpp:527
    #25 0xb7c00f0f in NS_ProcessNextEvent_P (thread=0xff31243c, mayWait=1) at nsThreadUtils.cpp:250
    #26 0xb7b65296 in nsBaseAppShell::Run (this=0xb537cc90) at nsBaseAppShell.cpp:170
    #27 0xb7a27a40 in nsAppStartup::Run (this=0xb53ba9a0) at nsAppStartup.cpp:183
    #28 0xb7371a84 in XRE_main (argc=1, argv=0xbffff824, aAppData=0xb5c18380) at nsAppRunner.cpp:3506
    #29 0xb7ff59c3 in main (argc=1, argv=0xbffff824) at nsBrowserApp.cpp:158
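The two pointer values in the gdb session above can be compared mechanically. The sketch below assumes, as the cast to 0xb7f84e20 in the session suggests, that the low two bits of classword are flag bits and the rest is the JSClass pointer; the mask value and both function names are assumptions for illustration, not taken from the SpiderMonkey source.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the low two bits of classword carry flags, not address bits.
const uint32_t kAssumedFlagMask = 0x3;

// Recover the class pointer from a classword by clearing the flag bits.
uint32_t classPointerFromWord(uint32_t classword) {
    return classword & ~kAssumedFlagMask;
}

// Count the bit positions in which two 32-bit values differ.
int differingBits(uint32_t a, uint32_t b) {
    uint32_t x = a ^ b;
    int n = 0;
    while (x != 0) {
        x &= x - 1;  // clear the lowest set bit
        ++n;
    }
    return n;
}
```

With the values from the session, classPointerFromWord(0xb7f84e22) yields 0xb7f84e20, and that differs from the local clasp value 0xb7fc4e20 in exactly one bit (0x00040000). This does not identify a cause on its own; it only narrows down what kind of corruption turned the good pointer into the bad one.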
With STR+reverse debugging, we might be able to figure this thing out. I don't have cycles to look at it right now; does someone else?
(In reply to comment #14) With apologies... this system started showing problems in its memory diagnostics. I'm afraid you'll have to consider this a spurious result until I can get back to you with confirmation that it is otherwise reproducible.
The reporter of bp-689213fd-892f-4f79-83e4-714a72100711 writes:

"I have had frequent crashes of all applications for about a month now. Mainly Firefox and Thunderbird, which are the applications I use the most, but also OpenOffice, Opera, gedit, log file viewer, etc. There are many pages of detail about it, including various error messages, at a thread I started on Ubuntu forums: http://ubuntuforums.org/showthread.php?s=21d9198b37a0818792b5509108542c28&p=9620548#post9620548"
We are not seeing this in the latest top crash lists for the trunk. Still seems to show up in FF 3.6.10, #15 crash (900 crashes/day). We are removing the final+ for 2.0 and back to nomination. Adding a nomination for 1.9.2.
blocking2.0: final+ → ?
status1.9.2: --- → ?
Still shows up in 3.0.x (removed from the list below) and 3.5.x, and is high volume on 3.6. It looks like it might be dramatically reduced in 4.0b6 and 4.0b8pre. At any rate it's not a volume increase on the trunk. I'd say this is a good one to get if we can, but not a regression since the last major release.

date crashes at js_TraceObject
20101017 1262 total all releases

885 3.6.10, 52 3.6.8, 50 3.5.13, 39 3.6.3, 35 3.0.19, 28 3.6.6, 28 3.6.11, 17 3.6, 10 3.6.2, 9 4.0b3, 9 3.6b4, 9 3.5.14, 9 3.1b3, 7 3.6.4, 6 4.0b5, 6 3.5.7, 5 4.0b2, 5 4.0b1, 5 3.6.9, 4 3.5.11,
*** 3 4.0b6, 3 3.6.7, 3 3.5.3, 3 3.5, 2 3.6b5, 2 3.6b2, 2 3.6b1, 2 3.5.9, 2 3.5.6, 2 3.5.5,
*** 1 4.0b8pre, 1 3.5.4, 1 3.5.2, 1 3.5.1, 1 3.1b1,
bc, any thoughts on how we could plug in dmandelin's ideas in comment 12?
Hit return too fast... The question is how we could plug some of the ideas in comment 12 into our crash automation. This might also help with the recent flare-up of trunk crashes, which might be the same scenario.
I thought we were going to remove the 2.0 nomination flag on this one? Sorry, I can't remember why we were going to leave it in this state.
blocking2.0: ? → beta8+
(In reply to comment #23)
> I thought we were going to remove the 2.0 nomination flag on this one? Sorry, I
> can't remember why we were going to leave it in this state.

Yeah, I don't think it should block either. In fact, I'd WONTFIX it:
- It's been around forever.
- It's about the hardest possible crash bug to solve.
- We will hopefully overhaul the GC next year, which might obviate any fix.
So now it's assigned to beta8 which isn't consistent with comment #24. I expect we will end up with some topcrashers after we release beta7 that are higher priority.
This needs an owner if we want to keep blocking b8 on it.
(In reply to comment #26)
> This needs an owner if we want to keep blocking b8 on it.

Good point. Unblocking per my reasons in comment 24.
blocking2.0: beta8+ → -
This signature has moved from around rank #13 in 3.6.13 down to #911 in 4.0beta9, so either some fixes are working or there has been a signature shift.
The testcase for bug 624286 can occasionally trigger this crash in 3.6.13. It only covers the write violations at 0x0, but it could still be of some use.
Crash Signature: [@ js_TraceObject]
I don't see any of these on a version of FF beyond 4.0.1 in the past 4 weeks. I don't see this on any version of Thunderbird beyond 3.1 either. Only 1 crash on Fennec 4.0b2. Resolving as works for me.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WORKSFORME