Closed Bug 588128 Opened 14 years ago Closed 13 years ago

Tracer Assertion and Slowness in V8 Benchmark - MessageManager related

Categories

(Core :: JavaScript Engine, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking


RESOLVED WONTFIX
Tracking Status
blocking2.0 --- -
fennec - ---

People

(Reporter: azakai, Unassigned)

References

Details

Currently on mozilla-central, running the V8 benchmark in a debug build crashes with an assertion failure (stack pasted below). In addition, the benchmark runs more slowly (perhaps because of leaving trace?).

Side issue: the slowness was noticed on Talos, which led to testing a debug build locally, which uncovered the crash - should we perhaps run debug builds on tinderbox to test for such things?

The crash is apparently related to the messageManager code: commenting out the loadFrameScript() call in the recently-pushed patch for bug 552828 makes it go away. Bug 550936 also uses loadFrameScript and has already been backed out because of this problem. Note that loading even an empty script with loadFrameScript also triggers this bug, so the content of the frame scripts is irrelevant.

(gdb) where
#0  0xb7fe3832 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0xb7fc4230 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
#2  0xb7515dfd in JS_Assert (s=0xb7c471e8 "oldInlineCallCount == inlineCallCount", file=0xb7c444d8 "/scratchbox/users/alon/home/alon/mozilla-central/js/src/jstracer.cpp", ln=6276) at /scratchbox/users/alon/home/alon/mozilla-central/js/src/jsutil.cpp:83
#3  0xb7549468 in js::TraceRecorder::attemptTreeCall (this=0xa4710800, f=0xa3107b34, inlineCallCount=@0xbfffe520) at /scratchbox/users/alon/home/alon/mozilla-central/js/src/jstracer.cpp:6276
#4  0xb7549146 in js::TraceRecorder::recordLoopEdge (cx=0xac9cd400, r=0xa4710800, inlineCallCount=@0xbfffe520) at /scratchbox/users/alon/home/alon/mozilla-central/js/src/jstracer.cpp:6188
#5  0xb754b535 in js::MonitorLoopEdge (cx=0xac9cd400, inlineCallCount=@0xbfffe520, reason=js::Record_Branch) at /scratchbox/users/alon/home/alon/mozilla-central/js/src/jstracer.cpp:7142
#6  0xb75f50b6 in js::Interpret (cx=0xac9cd400) at /scratchbox/users/alon/home/alon/mozilla-central/js/src/jsinterp.cpp:2725
#7  0xb747c142 in InvokeCommon<JSBool (*)(JSContext*, JSObject*, uintN, js::Value*, js::Value*)> (cx=0xac9cd400, fun=0xa8d49510, script=0xa905e800, native=0, argsRef=..., flags=0) at /scratchbox/users/alon/home/alon/mozilla-central/js/src/jsinterp.cpp:572
#8  0xb747965b in js::Invoke (cx=0xac9cd400, args=..., flags=0) at /scratchbox/users/alon/home/alon/mozilla-central/js/src/jsinterp.cpp:694
#9  0xb747987c in js::InternalInvoke (cx=0xac9cd400, thisv=..., fval=..., flags=0, argc=1, argv=0xad7cb990, rval=0xbfffe7d0) at /scratchbox/users/alon/home/alon/mozilla-central/js/src/jsinterp.cpp:734
#10 0xb73f148a in InternalCall (cx=0xac9cd400, obj=0xa8d35f90, fval=..., argc=1, argv=0xad7cb990, rval=0xbfffe7d0) at /scratchbox/users/alon/home/alon/mozilla-central/js/src/jsinterp.h:419
#11 0xb740b37b in JS_CallFunctionValue (cx=0xac9cd400, obj=0xa8d35f90, fval=..., argc=1, argv=0xad7cb990, rval=0xbfffe7d0) at /scratchbox/users/alon/home/alon/mozilla-central/js/src/jsapi.cpp:4852
#12 0xb652bf82 in nsJSContext::CallEventHandler (this=0xac83cf80, aTarget=0xa904e380, aScope=0xa8d35f90, aHandler=0xa0857d80, aargv=0xa0791f24, arv=0xbfffe928) at /scratchbox/users/alon/home/alon/mozilla-central/dom/base/nsJSEnvironment.cpp:2248
#13 0xb6565d7a in nsGlobalWindow::RunTimeout (this=0xa904e380, aTimeout=0xa9bbf600) at /scratchbox/users/alon/home/alon/mozilla-central/dom/base/nsGlobalWindow.cpp:8553
#14 0xb6566b1a in nsGlobalWindow::TimerCallback (aTimer=0xa9bbf650, aClosure=0xa9bbf600) at /scratchbox/users/alon/home/alon/mozilla-central/dom/base/nsGlobalWindow.cpp:8898
#15 0xb722fb59 in nsTimerImpl::Fire (this=0xa9bbf650) at /scratchbox/users/alon/home/alon/mozilla-central/xpcom/threads/nsTimerImpl.cpp:425
#16 0xb722fdd3 in nsTimerEvent::Run (this=0xa0791f40) at /scratchbox/users/alon/home/alon/mozilla-central/xpcom/threads/nsTimerImpl.cpp:517
#17 0xb7228a96 in nsThread::ProcessNextEvent (this=0xb44e0d30, mayWait=0, result=0xbfffeb7c) at /scratchbox/users/alon/home/alon/mozilla-central/xpcom/threads/nsThread.cpp:547
#18 0xb71b4735 in NS_ProcessNextEvent_P (thread=0xb44e0d30, mayWait=0) at nsThreadUtils.cpp:250
#19 0xb704d238 in mozilla::ipc::MessagePump::Run (this=0xb44d67c0, aDelegate=0xb4423980) at /scratchbox/users/alon/home/alon/mozilla-central/ipc/glue/MessagePump.cpp:118
#20 0xb728d1b7 in MessageLoop::RunInternal (this=0xb4423980) at /scratchbox/users/alon/home/alon/mozilla-central/ipc/chromium/src/base/message_loop.cc:219
#21 0xb728d137 in MessageLoop::RunHandler (this=0xb4423980) at /scratchbox/users/alon/home/alon/mozilla-central/ipc/chromium/src/base/message_loop.cc:202
#22 0xb728d0db in MessageLoop::Run (this=0xb4423980) at /scratchbox/users/alon/home/alon/mozilla-central/ipc/chromium/src/base/message_loop.cc:176
#23 0xb6ef0986 in nsBaseAppShell::Run (this=0xb14f53d0) at /scratchbox/users/alon/home/alon/mozilla-central/widget/src/xpwidgets/nsBaseAppShell.cpp:175
#24 0xb6c44d15 in nsAppStartup::Run (this=0xb13a03d0) at /scratchbox/users/alon/home/alon/mozilla-central/toolkit/components/startup/src/nsAppStartup.cpp:191
#25 0xb5c68ae2 in XRE_main (argc=3, argv=0xbffff324, aAppData=0xb4410380) at /scratchbox/users/alon/home/alon/mozilla-central/toolkit/xre/nsAppRunner.cpp:3659
#26 0x08049af7 in main (argc=3, argv=0xbffff324) at /scratchbox/users/alon/home/alon/mozilla-central/browser/app/nsBrowserApp.cpp:158
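For anyone trying to reproduce this, here is a minimal sketch of the trigger, assuming chrome privileges and a per-browser message manager; the data: URI and the gBrowser access are illustrative, not the actual code from bug 552828:

// Chrome-privileged JS (e.g. from a browser-chrome test or the Error Console).
// Loading even an empty frame script is enough; the script's content is irrelevant.
var mm = gBrowser.selectedBrowser.messageManager;
mm.loadFrameScript("data:,", true); // second argument: allowDelayedLoad
// Then run the V8 benchmark in a debug build; the
// "oldInlineCallCount == inlineCallCount" assertion in jstracer.cpp fires.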
blocking2.0: --- → ?
tracking-fennec: --- → ?
Blocks: 550936, 552828
tracking-fennec: ? → 2.0a1+
blocking2.0: ? → betaN+
The first user of the message manager (bug 552828) was form history. When we landed it, it caused a pretty huge regression in SunSpider (bug 588057). This bug may also have caused the perf regression that led us to back out the InstallTrigger patch (bug 550936).
We don't take performance regressions.
(In reply to comment #0)
> Side issue: The slowness was noticed on Talos, which led to testing a debug
> build locally, which uncovered the crash - should we perhaps run debug builds
> on tinderbox, to test for such things?

We do run debug builds on Tinderbox. We don't run Talos on debug builds.

Are dromaeo/ss only run as part of Talos? Seems like they should be part of the conformance suite too.
Here's a link to a V8 hit on Talos on the e10s tree: http://bit.ly/cbzHcY

You guys should have RelEng set up nagmail like the stuff that gets sent to mozilla.dev.tree-management. The TM tree has it, but it just gets sent to my mail account. It's not a good idea to have more than one tree reporting those things to the same mailing list, so you should probably find a dedicated address or newsgroup for it.
(In reply to comment #3)
> (In reply to comment #0)
> > Side issue: The slowness was noticed on Talos, which led to testing a debug
> > build locally, which uncovered the crash - should we perhaps run debug builds
> > on tinderbox, to test for such things?
>
> We do run debug builds on Tinderbox. We don't run Talos on debug builds.

Yes, I meant to say that if we had run the benchmarks on debug builds on try, this would have been noticed earlier. Obviously Talos on a debug build isn't useful for performance, but it can catch assertions.
blocking2.0: betaN+ → ?
tracking-fennec: 2.0a1+ → ?
Disabling the JIT in nsInProcessTabChildGlobal (where messageManager code creates a context) does not help with this.
blocking2.0: ? → betaN+
tracking-fennec: ? → 2.0a1+
Depends on: 588201
No longer depends on: 588201
The patch in bug 588201 greatly reduces the chance of this assertion appearing, so perhaps that can help here.
No longer blocks: 550936
Seems like we have a workaround in bug 588201; does this still need to block the alpha?
Not sure if this is a major problem. It looks like reducing the cache size stresses the JIT. I'd say we'd ship the alpha without any fix to this specific problem, assuming the workaround (not really a workaround per se) addresses the problem completely.
Clearing the blocking flag for Fennec 2.0a1.
tracking-fennec: 2.0a1+ → ?
tracking-fennec: ? → 2.0+
It isn't clear that this is a blocking bug. Please renom if you disagree.
blocking2.0: betaN+ → -
tracking-fennec: 2.0+ → 2.0-
Whiteboard: [fennec-4.1?]
Alon, is this still an issue? If so, can you estimate the perf impact and how long a fix would take?
tracking-fennec: - → ?
Whiteboard: [fennec-4.1?]
I see no noticeable problem here after the workaround. There might still be an underlying problem that the workaround covers up, but I really don't know.
tracking-fennec: ? → -
Obsolete with the removal of the tracing JIT.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WONTFIX