Last Comment Bug 699974 - scripts evaluated from plugins may prevent the slow script dialog from ever appearing
: scripts evaluated from plugins may prevent the slow script dialog from ever a...
Status: RESOLVED FIXED
[Snappy:P1]
: hang
Product: Core
Classification: Components
Component: DOM (show other bugs)
: Trunk
: x86_64 Linux
: -- normal (vote)
: mozilla11
Assigned To: Luke Wagner [:luke]
:
Mentors:
http://www.grooveshark.com
: 696018 (view as bug list)
Depends on: 670183 720289
Blocks: 705695
  Show dependency treegraph
 
Reported: 2011-11-04 17:13 PDT by Daniel Holbert [:dholbert]
Modified: 2012-02-20 16:48 PST (History)
19 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
remove DOMOperationCallback silliness (1.17 KB, patch)
2011-11-08 09:15 PST, Luke Wagner [:luke]
peterv: review+
Details | Diff | Splinter Review
naive fix (2.52 KB, patch)
2011-12-02 16:41 PST, Luke Wagner [:luke]
no flags Details | Diff | Splinter Review
don't drop non-throwing failures (4.23 KB, patch)
2011-12-14 14:53 PST, Luke Wagner [:luke]
jst: review+
Details | Diff | Splinter Review

Description Daniel Holbert [:dholbert] 2011-11-04 17:13:52 PDT
STEPS TO REPRODUCE:
 1. Visit http://noscript.net/getit and click green "Install" button.
 2. Click through NoScript extension install process; restart Firefox.
 3. Visit http://www.grooveshark.com/
 4. Click [Options] button on NoScript notification bar at the bottom, and click menu entries to allow scripts from gs-cdn.net and grooveshark.com
 5. Click "Login" link at the top
 6. Click the "Google" button to authenticate with your Google account.
 7. Continue through the (free) account-creation process, accepting defaults and providing as little information as it allows.
 8. When you get to the "All Finished" screen at the end of this process, click the [Options] in NoScript footer-bar (or click NoScript icon in toolbar) and choose "Forbid Grooveshark.com".  Or alternately, do Ctrl+Shift+\ (which does the same thing)

ACTUAL RESULTS:
Browser hangs, using up 99% CPU.

If I "kill -11" it, I get a crash report that usually has the signature "[@ js::StackFrame::pcQuadratic ]"

The crash report also often says
> This dump is too long and has triggered the automatic truncation routine
and has many nested instances of this snippet:

>32 	libxul.so 	js::mjit::stubs::UncachedCallHelper 	js/src/methodjit/InvokeHelpers.cpp:479
>33	libxul.so 	js::mjit::ic::Call 	js/src/methodjit/MonoIC.cpp:1099
>34 		@0x7f4b088fa38d 	
>35 	libxul.so 	js::mjit::JaegerShot 	js/src/methodjit/MethodJIT.cpp:1064
>36 	libxul.so 	js::Interpret 	js/src/jsinterp.cpp:3987
>37 	libxul.so 	UncachedInlineCall 	js/src/methodjit/InvokeHelpers.cpp:392
>38 	libxul.so 	js::mjit::stubs::UncachedCallHelper 	js/src/methodjit/InvokeHelpers.cpp:479
[...continues repeating...]

Sample crash report:
 bp-968bdc8d-baa0-4ba8-9acc-1db652111104


NOTE: After you've created a GrooveShark account, you should be able to reproduce this by simply logging into your account (which I do via the "Google" button in the login dialog) and then performing Step 8.
Comment 1 Daniel Holbert [:dholbert] 2011-11-04 17:22:32 PDT
I'm using today's nightly, btw:
Mozilla/5.0 (X11; Linux x86_64; rv:10.0a1) Gecko/20111104 Firefox/10.0a1

Some more crash report IDs:
bp-ce5ea10b-d1ca-439b-8427-4acc42111104
bp-80b71b8c-5d04-459d-85af-3d9f12111104
bp-ad9a0e6f-b57f-4f1d-8192-65c522111104
bp-574a54de-d8e1-4cc3-a566-46af12111104
bp-3ff5daef-9d6f-4c9c-991a-8ace32111104
bp-753acb56-2be9-420a-a004-223c92111104
bp-faba84d2-41fa-4b13-b042-279112111104
bp-83e36f9d-7d0c-4089-b17d-128412111104
bp-edeff872-a252-44d0-8567-15b982111104
bp-767cce32-6faf-443e-bfc2-065b42111104
bp-0f2f488d-0cb1-48df-a2a5-ef4272111104
bp-26f4f8f8-4589-46cb-aad7-fc36d2111104

To reiterate, I manually triggered these crashes with "kill -11" to stop the hang, so don't take the exact crash-location too seriously.  The places where I most frequently interrupt it seem to be [@ js::mjit::ExpandInlineFrames ] and [@ js::StackFrame::pcQuadratic ] though.

Also: I ran through the STR with a fresh profile and a freshly-created grooveshark account and was able to hit this, so hopefully it should be reproducible for others as well.  (at least, if they use a Google account -- I haven't tried with a non-Google-account-associated grooveshark account, so I can't speak to that)

This bug is 100% reproducible for me, on my 64-bit laptop running Ubuntu 11.10.
Comment 2 Daniel Holbert [:dholbert] 2011-11-04 17:24:43 PDT
(In reply to Daniel Holbert [:dholbert] from comment #0)
>  7. Continue through the (free) account-creation process, accepting defaults
> and providing as little information as it allows.

(for the record: I had to create a Grooveshark password and specify my (fake) name & birthday. I think that was all.)
Comment 3 Daniel Holbert [:dholbert] 2011-11-04 17:33:52 PDT
OK, I've confirmed that you don't need a Google account to trigger this.  Any Grooveshark account (created at http://grooveshark.com/#/signup , with e.g. anyemailaddress@mailinator.com ) will do.

(also, if it's not already clear, you do need to be logged in to Grooveshark account to trigger this.  otherwise there's no hang.)
Comment 4 Daniel Holbert [:dholbert] 2011-11-04 17:36:41 PDT
Doesn't appear to be a recent regression -- I also hit this in the current Firefox release:
Mozilla/5.0 (X11; Linux x86_64; rv:7.0.1) Gecko/20100101 Firefox/7.0.1
bp-71b105c9-29fe-498d-a389-dc5142111104
Comment 5 Luke Wagner [:luke] 2011-11-07 19:07:16 PST
Great STR, thanks dholbert!  I spent a little bit of time looking at this.  There is something very weird going on.

First of all, the exception handling code isn't what is causing the browser to hang: each individual call to the exception junk in dholbert's callstacks returns to its caller in JS quickly enough.

The problem seems to be that the JS operation callback (slow script dialog) is not getting triggered.  Here's where it gets weird: if I ctrl-c, break on nsJSContext::DOMOperationCallback, it gets hit immediately and returns (b/c callbackTime == 0).  Continuing several time shows an increasing 'duration' and eventually we do decide to throw up the prompt and the browser recovers when I click 'stop script'.  *BUT* this only happens once I ctrl-c; if I don't ctrl-c the browser just hangs.  That means something is preventing the slow script dialog and then stopping after the ctrl-c signal handler.

Two clues point to the flash plugin:
 1. when I ctrl-c (which allows the slow script dialog to appear) and 'stop script', I get the "The Adobe Flash plugin has crashed".
 2. when I disable all jits in about:config and ctrl-c, the backtrace is *always*:
    #0  pthread_cond_timedwait@@GLIBC_2.3.2 ()
    #1  pt_TimedWait
    #2  PR_WaitCondVar (cvar=0x7fffc0437600, timeout=45000)
    #3  Wait
    #5  mozilla::ipc::SyncChannel::WaitForNotify
    #6  mozilla::ipc::RPCChannel::Call
    #7  mozilla::plugins::PPluginScriptableObjectParent::CallInvoke
    #8  ScriptableInvoke
    #9  mozilla::plugins::PluginScriptableObjectParent::ScriptableInvoke
    #10 CallNPMethodInternal
    #11 CallJSNative
    #12 js::InvokeKernel
    #13 js::Interpret (script->filename = a grooveshark url)
  Continuing from this point always shows the "Adobe Flash has crashed" immediately and the browser recovers.

I saw that timeout = 45 seconds, so I changed dom.ipc.plugins.timeoutSecs = 2 seconds, but this didn't make any difference.

jst: any ideas or hints?

If nothing else, I can start using printf() debugging to see what is preventing DOMOperationCallback from being called (or reseting the timer) before I ctrl-c.  The STR are rather extenuating but I'm thinking perhaps this is a source of real flash-plugin hangs.
Comment 6 Luke Wagner [:luke] 2011-11-08 09:06:11 PST
Ah, so it seems that the DOMOperationCallback is getting fired at the right time, however the mOperationCallbackTime is getting reset immediately by a JS script evaluated by flash called from the outer ilooping script.  (This explains why, when flash died, the slow script dialog was allowed to emerge.)  I would have thought that we would only do this when the outermost executing script finished executing (mExecuteDepth == 0), but nsJSContext::EvaluateStringWithValue calls nsJSContext::ScriptEvaluated with aTerminated unconditionally true which resets the callback time always.

This could seriously be causing some hangs.  Moving over to DOM for more informed folk to attack.

For reference, here is the callstack of ScriptEvaluated:

#0  ScriptEvaluated
#1  nsJSContext::ScriptEvaluated (aTerminated=true)
#2  EvaluateStringWithValue (aURL="http://grooveshark.com/#/onetwothreefour32")
#3  nsJSContext::EvaluateStringWithValue
#4  mozilla::plugins::parent::_evaluate
#5  mozilla::plugins::PluginScriptableObjectParent::AnswerNPN_Evaluate
#6  mozilla::plugins::PPluginScriptableObjectParent::OnCallReceived
#7  mozilla::plugins::PPluginModuleParent::OnCallReceived
#8  mozilla::ipc::RPCChannel::DispatchIncall
#9  mozilla::ipc::RPCChannel::Call
#10 mozilla::plugins::PPluginScriptableObjectParent::CallInvoke
#11 ScriptableInvoke
#12 mozilla::plugins::PluginScriptableObjectParent::ScriptableInvoke
#13 CallNPMethodInternal
#14 CallJSNative
#15 InvokeKernel
#16 js::Interpret (<--- grooveshark content JS)
Comment 7 Luke Wagner [:luke] 2011-11-08 09:15:54 PST
Created attachment 572846 [details] [diff] [review]
remove DOMOperationCallback silliness

While I was in the vicinity...
Comment 8 Peter Van der Beken [:peterv] - away till Aug 1st 2011-11-08 11:24:37 PST
Comment on attachment 572846 [details] [diff] [review]
remove DOMOperationCallback silliness

I knew I could blame gal for this!
Comment 9 Luke Wagner [:luke] 2011-11-08 19:01:13 PST
So for fun I just tried this:

@@ -3086,17 +3086,17 @@ nsJSContext::ScriptEvaluated(bool aTermi
-  if (aTerminated) {
+  if (aTerminated && mExecuteDepth == 0) {
     mOperationCallbackTime = 0;
     mModalStateTime = 0;
   }

but it seems that mExecuteDepth is actually zero for some reason.  However, this works:

   if (aTerminated && mExecuteDepth == 0 && !JS_IsRunning(mContext)) {

and the timer advances DOMOperationCallback gets to GetPromptFromContext... which fails and  allows script execution to continue (bug 670183)!  Since we have exhausted the C stack (not VM stack, which has a buffer reserved for trusted JS) I suspect we fail a JS_CHECK_RECURSION.  This suggests adding the same "chrome JS gets extra stack space" exception to the C stack check as we do with the VM Stack.

Changing failure to return JS_FALSE (which I still think we should do!) finally kills the script.

Another weird thing I noticed is that, when content spends a lot of time in ConsoleAPI.js (which is happening here), the operation callback often happens in chrome JS and thus the chrome JS timeout is used instead of content.
Comment 10 Andrew McCreight [:mccr8] 2011-12-01 13:42:17 PST
Marking as [Snappy:P1] because it is blocking another [Snappy:P1], and assigning it to Luke because he already has a reviewed patch for it. ;)
Comment 11 Luke Wagner [:luke] 2011-12-02 14:39:12 PST
*** Bug 696018 has been marked as a duplicate of this bug. ***
Comment 12 Luke Wagner [:luke] 2011-12-02 14:40:34 PST
Another instance of this problem was reported in bug 696018 (click any video stream in http://www.regame.tv/live).
Comment 13 Luke Wagner [:luke] 2011-12-02 15:30:10 PST
regame.tv shows yet another problem: nsJSContext::EvaluateWithValue ignores the 'false' return value of JS_EvaluateUCScriptForPrincipalsVersion and returns NS_OK.  Returning NS_ERROR_FAILURE (and with the mExecuteDepth fix in comment 9) we recover after the slow-script dialog.
Comment 14 Luke Wagner [:luke] 2011-12-02 16:41:36 PST
Created attachment 578770 [details] [diff] [review]
naive fix

Well, I'll take a stab.  Here are the three problems I've identified.  It appears grooveshark has since change their site so that the STR no longer cause a hang w/o the patch, but if I read my previous comments correctly, this patch would have fixed it.  Pushing to try.
Comment 15 Luke Wagner [:luke] 2011-12-03 21:49:46 PST
Seems green on try.
Comment 16 Johnny Stenback (:jst, jst@mozilla.com) 2011-12-14 13:12:10 PST
This seems like the right thing to do, but I think we should make the change you made in EvaluateStringWithValue also in EvaluateString() and ExecuteScript(). If you don't agree let me know why and I'll review the patch as is, but it seems to me we want the other changes...
Comment 17 Luke Wagner [:luke] 2011-12-14 14:53:14 PST
Created attachment 581801 [details] [diff] [review]
don't drop non-throwing failures

Thanks jst!  I added the extra cases you mentioned.
Comment 18 Johnny Stenback (:jst, jst@mozilla.com) 2011-12-14 16:12:16 PST
Comment on attachment 581801 [details] [diff] [review]
don't drop non-throwing failures

Looks good, r=jst

Note You need to log in before you can comment on or make changes to this bug.