Closed Bug 699974 Opened 13 years ago Closed 13 years ago

scripts evaluated from plugins may prevent the slow script dialog from ever appearing

Categories

(Core :: DOM: Core & HTML, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla11

People

(Reporter: dholbert, Assigned: luke)

References

(Depends on 1 open bug, )

Details

(Keywords: hang, Whiteboard: [Snappy:P1])

Attachments

(2 files, 1 obsolete file)

STEPS TO REPRODUCE:
 1. Visit http://noscript.net/getit and click green "Install" button.
 2. Click through NoScript extension install process; restart Firefox.
 3. Visit http://www.grooveshark.com/
 4. Click [Options] button on NoScript notification bar at the bottom, and click menu entries to allow scripts from gs-cdn.net and grooveshark.com
 5. Click "Login" link at the top
 6. Click the "Google" button to authenticate with your Google account.
 7. Continue through the (free) account-creation process, accepting defaults and providing as little information as it allows.
 8. When you get to the "All Finished" screen at the end of this process, click the [Options] in NoScript footer-bar (or click NoScript icon in toolbar) and choose "Forbid Grooveshark.com".  Or alternately, do Ctrl+Shift+\ (which does the same thing)

ACTUAL RESULTS:
Browser hangs, using up 99% CPU.

If I "kill -11" it, I get a crash report that usually has the signature "[@ js::StackFrame::pcQuadratic ]"

The crash report also often says
> This dump is too long and has triggered the automatic truncation routine
and has many nested instances of this snippet:

>32 	libxul.so 	js::mjit::stubs::UncachedCallHelper 	js/src/methodjit/InvokeHelpers.cpp:479
>33	libxul.so 	js::mjit::ic::Call 	js/src/methodjit/MonoIC.cpp:1099
>34 		@0x7f4b088fa38d 	
>35 	libxul.so 	js::mjit::JaegerShot 	js/src/methodjit/MethodJIT.cpp:1064
>36 	libxul.so 	js::Interpret 	js/src/jsinterp.cpp:3987
>37 	libxul.so 	UncachedInlineCall 	js/src/methodjit/InvokeHelpers.cpp:392
>38 	libxul.so 	js::mjit::stubs::UncachedCallHelper 	js/src/methodjit/InvokeHelpers.cpp:479
[...continues repeating...]

Sample crash report:
 bp-968bdc8d-baa0-4ba8-9acc-1db652111104


NOTE: After you've created a GrooveShark account, you should be able to reproduce this by simply logging into your account (which I do via the "Google" button in the login dialog) and then performing Step 8.
I'm using today's nightly, btw:
Mozilla/5.0 (X11; Linux x86_64; rv:10.0a1) Gecko/20111104 Firefox/10.0a1

Some more crash report IDs:
bp-ce5ea10b-d1ca-439b-8427-4acc42111104
bp-80b71b8c-5d04-459d-85af-3d9f12111104
bp-ad9a0e6f-b57f-4f1d-8192-65c522111104
bp-574a54de-d8e1-4cc3-a566-46af12111104
bp-3ff5daef-9d6f-4c9c-991a-8ace32111104
bp-753acb56-2be9-420a-a004-223c92111104
bp-faba84d2-41fa-4b13-b042-279112111104
bp-83e36f9d-7d0c-4089-b17d-128412111104
bp-edeff872-a252-44d0-8567-15b982111104
bp-767cce32-6faf-443e-bfc2-065b42111104
bp-0f2f488d-0cb1-48df-a2a5-ef4272111104
bp-26f4f8f8-4589-46cb-aad7-fc36d2111104

To reiterate, I manually triggered these crashes with "kill -11" to stop the hang, so don't take the exact crash-location too seriously.  The places where I most frequently interrupt it seem to be [@ js::mjit::ExpandInlineFrames ] and [@ js::StackFrame::pcQuadratic ] though.

Also: I ran through the STR with a fresh profile and a freshly-created grooveshark account and was able to hit this, so hopefully it should be reproducible for others as well.  (at least, if they use a Google account -- I haven't tried with a non-Google-account-associated grooveshark account, so I can't speak to that)

This bug is 100% reproducible for me, on my 64-bit laptop running Ubuntu 11.10.
Crash Signature: [@ js::StackFrame::pcQuadratic ] [@ js::mjit::ExpandInlineFrames ]
Keywords: hang
(In reply to Daniel Holbert [:dholbert] from comment #0)
>  7. Continue through the (free) account-creation process, accepting defaults
> and providing as little information as it allows.

(for the record: I had to create a Grooveshark password and specify my (fake) name & birthday. I think that was all.)
OK, I've confirmed that you don't need a Google account to trigger this.  Any Grooveshark account (created at http://grooveshark.com/#/signup , with e.g. anyemailaddress@mailinator.com ) will do.

(also, if it's not already clear, you do need to be logged in to Grooveshark account to trigger this.  otherwise there's no hang.)
Doesn't appear to be a recent regression -- I also hit this in the current Firefox release:
Mozilla/5.0 (X11; Linux x86_64; rv:7.0.1) Gecko/20100101 Firefox/7.0.1
bp-71b105c9-29fe-498d-a389-dc5142111104
Great STR, thanks dholbert!  I spent a little bit of time looking at this.  There is something very weird going on.

First of all, the exception handling code isn't what is causing the browser to hang: each individual call to the exception junk in dholbert's callstacks returns to its caller in JS quickly enough.

The problem seems to be that the JS operation callback (slow script dialog) is not getting triggered.  Here's where it gets weird: if I ctrl-c, break on nsJSContext::DOMOperationCallback, it gets hit immediately and returns (b/c callbackTime == 0).  Continuing several time shows an increasing 'duration' and eventually we do decide to throw up the prompt and the browser recovers when I click 'stop script'.  *BUT* this only happens once I ctrl-c; if I don't ctrl-c the browser just hangs.  That means something is preventing the slow script dialog and then stopping after the ctrl-c signal handler.

Two clues point to the flash plugin:
 1. when I ctrl-c (which allows the slow script dialog to appear) and 'stop script', I get the "The Adobe Flash plugin has crashed".
 2. when I disable all jits in about:config and ctrl-c, the backtrace is *always*:
    #0  pthread_cond_timedwait@@GLIBC_2.3.2 ()
    #1  pt_TimedWait
    #2  PR_WaitCondVar (cvar=0x7fffc0437600, timeout=45000)
    #3  Wait
    #5  mozilla::ipc::SyncChannel::WaitForNotify
    #6  mozilla::ipc::RPCChannel::Call
    #7  mozilla::plugins::PPluginScriptableObjectParent::CallInvoke
    #8  ScriptableInvoke
    #9  mozilla::plugins::PluginScriptableObjectParent::ScriptableInvoke
    #10 CallNPMethodInternal
    #11 CallJSNative
    #12 js::InvokeKernel
    #13 js::Interpret (script->filename = a grooveshark url)
  Continuing from this point always shows the "Adobe Flash has crashed" immediately and the browser recovers.

I saw that timeout = 45 seconds, so I changed dom.ipc.plugins.timeoutSecs = 2 seconds, but this didn't make any difference.

jst: any ideas or hints?

If nothing else, I can start using printf() debugging to see what is preventing DOMOperationCallback from being called (or reseting the timer) before I ctrl-c.  The STR are rather extenuating but I'm thinking perhaps this is a source of real flash-plugin hangs.
Ah, so it seems that the DOMOperationCallback is getting fired at the right time, however the mOperationCallbackTime is getting reset immediately by a JS script evaluated by flash called from the outer ilooping script.  (This explains why, when flash died, the slow script dialog was allowed to emerge.)  I would have thought that we would only do this when the outermost executing script finished executing (mExecuteDepth == 0), but nsJSContext::EvaluateStringWithValue calls nsJSContext::ScriptEvaluated with aTerminated unconditionally true which resets the callback time always.

This could seriously be causing some hangs.  Moving over to DOM for more informed folk to attack.

For reference, here is the callstack of ScriptEvaluated:

#0  ScriptEvaluated
#1  nsJSContext::ScriptEvaluated (aTerminated=true)
#2  EvaluateStringWithValue (aURL="http://grooveshark.com/#/onetwothreefour32")
#3  nsJSContext::EvaluateStringWithValue
#4  mozilla::plugins::parent::_evaluate
#5  mozilla::plugins::PluginScriptableObjectParent::AnswerNPN_Evaluate
#6  mozilla::plugins::PPluginScriptableObjectParent::OnCallReceived
#7  mozilla::plugins::PPluginModuleParent::OnCallReceived
#8  mozilla::ipc::RPCChannel::DispatchIncall
#9  mozilla::ipc::RPCChannel::Call
#10 mozilla::plugins::PPluginScriptableObjectParent::CallInvoke
#11 ScriptableInvoke
#12 mozilla::plugins::PluginScriptableObjectParent::ScriptableInvoke
#13 CallNPMethodInternal
#14 CallJSNative
#15 InvokeKernel
#16 js::Interpret (<--- grooveshark content JS)
Assignee: general → nobody
Crash Signature: [@ js::StackFrame::pcQuadratic ] [@ js::mjit::ExpandInlineFrames ]
Component: JavaScript Engine → DOM
QA Contact: general → general
Summary: hang when disabling scripts at grooveshark.com with apparent infinite recursion inside of [@ js::StackFrame::pcQuadratic ] → scripts evaluated from plugins may prevent the slow script dialog from ever appearing
While I was in the vicinity...
Attachment #572846 - Flags: review?(peterv)
Comment on attachment 572846 [details] [diff] [review]
remove DOMOperationCallback silliness

I knew I could blame gal for this!
Attachment #572846 - Flags: review?(peterv) → review+
So for fun I just tried this:

@@ -3086,17 +3086,17 @@ nsJSContext::ScriptEvaluated(bool aTermi
-  if (aTerminated) {
+  if (aTerminated && mExecuteDepth == 0) {
     mOperationCallbackTime = 0;
     mModalStateTime = 0;
   }

but it seems that mExecuteDepth is actually zero for some reason.  However, this works:

   if (aTerminated && mExecuteDepth == 0 && !JS_IsRunning(mContext)) {

and the timer advances DOMOperationCallback gets to GetPromptFromContext... which fails and  allows script execution to continue (bug 670183)!  Since we have exhausted the C stack (not VM stack, which has a buffer reserved for trusted JS) I suspect we fail a JS_CHECK_RECURSION.  This suggests adding the same "chrome JS gets extra stack space" exception to the C stack check as we do with the VM Stack.

Changing failure to return JS_FALSE (which I still think we should do!) finally kills the script.

Another weird thing I noticed is that, when content spends a lot of time in ConsoleAPI.js (which is happening here), the operation callback often happens in chrome JS and thus the chrome JS timeout is used instead of content.
Depends on: 670183
Blocks: 705695
Whiteboard: [Snappy]
Marking as [Snappy:P1] because it is blocking another [Snappy:P1], and assigning it to Luke because he already has a reviewed patch for it. ;)
Assignee: nobody → luke
Whiteboard: [Snappy] → [Snappy:P1]
Another instance of this problem was reported in bug 696018 (click any video stream in http://www.regame.tv/live).
regame.tv shows yet another problem: nsJSContext::EvaluateWithValue ignores the 'false' return value of JS_EvaluateUCScriptForPrincipalsVersion and returns NS_OK.  Returning NS_ERROR_FAILURE (and with the mExecuteDepth fix in comment 9) we recover after the slow-script dialog.
Attached patch naive fix (obsolete) — Splinter Review
Well, I'll take a stab.  Here are the three problems I've identified.  It appears grooveshark has since change their site so that the STR no longer cause a hang w/o the patch, but if I read my previous comments correctly, this patch would have fixed it.  Pushing to try.
Attachment #578770 - Flags: review?(peterv)
Seems green on try.
Attachment #578770 - Flags: review?(peterv) → review?(jst)
This seems like the right thing to do, but I think we should make the change you made in EvaluateStringWithValue also in EvaluateString() and ExecuteScript(). If you don't agree let me know why and I'll review the patch as is, but it seems to me we want the other changes...
Thanks jst!  I added the extra cases you mentioned.
Attachment #578770 - Attachment is obsolete: true
Attachment #578770 - Flags: review?(jst)
Attachment #581801 - Flags: review?(jst)
Comment on attachment 581801 [details] [diff] [review]
don't drop non-throwing failures

Looks good, r=jst
Attachment #581801 - Flags: review?(jst) → review+
https://hg.mozilla.org/mozilla-central/rev/d737ffa1f92d
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Depends on: 720289
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: