Closed Bug 720624 Opened 12 years ago Closed 12 years ago

Grooveshark complains "Flash is no longer responding"

Categories

(Core Graveyard :: Plug-ins, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: philikon, Unassigned)

Details

Attachments

(2 files)

Grooveshark can start a Flash operation (e.g. playing music) just fine, but on the next interaction with Flash (e.g. skipping a song), it complains that "Flash is no longer responding". This is the 2012-01-22 nightly (buildid 20120122031050) running on Ubuntu 11.10.
Seeing this on Mac as well, using 9.0.1. I am using Flash Version: 11.2.202.96.
This happens with Flash V 11.2.202.19 on Mac 10.7 as well, using another machine and running the nightly.  

STR:
1. Load the Grooveshark site
2. Receive the error in the screenshot

Working to see if it happens on Windows as well and will test with different versions of flash on the machines I am seeing it on.
[:philikon]: Which version of Flash are you using?
I can reproduce the same behavior where I get the script and flash errors when I load the site using Shockwave Flash Version: 11.1.102.55 and Mozilla/5.0 (Windows NT 5.1; rv:9.0.1) Gecko/20100101 Firefox/9.0.1. Adding Sal to the bug in case he has any insight into why we are getting the Flash error.
OS: Linux → All
Hardware: x86_64 → All
(In reply to Marcia Knous [:marcia] from comment #3)
> [:philikon]: Which version of Flash are you using?

about:plugins says Shockwave Flash 11.1 r102. It's the version that comes with Ubuntu 11.10.
i'm not able to repro on my Ubuntu 11.10 11,1,102,55 64-bit box.  from the screenshot, i'm not familiar with that dialog box...  the error looks to be from the grooveshark client.
i should note that my current browser version is Firefox 7...
I'm only able to repro sometimes with a debug build.  However, breaking randomly with gdb I think I have a probable explanation for what is going on here.

What is definitely happening is that we have some infinite recursion caused by grooveshark code.  When a stack-overflow exception is thrown, grooveshark code catches the exception and, rather than rethrowing it, continues (which lets the infinite recursion continue).  This continues until the slow script dialog comes up.

So here is how I think we might be causing this:

This code (line 107 in http://static.a.gs-cdn.net/gs/core.js?20120124.01):

u=Array.prototype.slice.call(u,0);

calls into the flash plugin via the get-element hook when slice accesses the elements of u.  The flash plugin (before returning) calls a function (from file http://grooveshark.com):

try{__flash__toXML(GS.Controllers.PlayerController.instance().playerReady(({previousQueue:null,playPauseFade:false,crossfadeAmount:5000,isMuted:false,currentQueue:({shuffleEnabled:false,autoplayEnabled:false,currentAutoplayTagID:0,previousSong:null,repeatMode:0,nextSong:null,activeSong:null,songs:[],queueID:null}),volume:100,crossfadeEnabled:false})));}catch(e){"<undefined/>";}

which is a string constructed from the "flashvars" <param> tag in http://grooveshark.html.  This calls the 'playerReady' function in http://static.a.gs-cdn.net/gs/app.js?20120124.04.

So, it seems like calling 'playerReady' is how the flash plugin notifies client JS that it is ready.  Calling 'playerReady' from a random get-element seems like exactly the type of situation that could break grooveshark's assumptions and trigger an infinite regress.  It also explains the non-determinacy: this only happens if the plugin happens to finish loading at the right time.

To close the loop: 'playerReady' calls 'updateWithLocalSettings' which calls 'setCrossfadEnabled' which calls 'changeLocalSettings' which calls 'a.publish' (line 309 in http://static.a.gs-cdn.net/gs/core.js?20120124.01) which calls the function 'each' (line 21) which calls (via Function.prototype.call) this code:

function(){try{this.apply(a,c||[])}catch(e){console.error("pub/sub. topic: ",b,", error: ",e,"msg:",e.message,"stack:",e.stack,", func: ",this)}

on line 309.  This is where the exception is discarded as mentioned above.  This code calls (via Function.prototype.apply) this code:

function(){for(var l=j.concat(jQuery.makeArray(arguments)),o,q=f.length,t=0,C;t<q;t++)if(C=f[t]){if((o=typeof C=="string")&&h._set_called)h.called=C;l=(o?h[C]:C).apply(h,l||[]);if(t<q-1)l=!jQuery.isArray(l)||l._use_call?[l]:l}return l}

on line 190.  This calls updateWithLocalSettings (back in http://static.a.gs-cdn.net/gs/app.js?20120124.04) and closes the loop.

Now, I don't know anything about NPAPI and our IPC layer, but it seems like we are doing something wrong by letting 'playerReady' jump ahead of the get-element request.  I'll cc some NPAPI-knowing people.

For a quick-fix, perhaps grooveshark could rearrange their code so that all the get-element calls happened after 'playerReady' has fired?
Oops, my cc was dropped.  My question: perhaps one of you know if a fix is needed on our side from the issue described in the above comment?
This is not really something that we can fix: IPC messages can race, and our policy when there is a race is that the plugin always wins the race for various reasons related to how we audited our implementation and how the NPAPI expects to behave.
To be more precise, we *could* fix it by always blocking plugin execution while content JS is running, but that is not practical. Content JS can enter plugins at basically any time, and plugin execution can enter JS at also pretty much any time, so you'd have to deeply synchronize them to prevent any races.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #11)
Ok, so, just to confirm: if content JS does any object operation on a plugin object, it should expect to be reentered (via callbacks from the plugin) while that object operation is executing?

(OOC, it seems like, to fix this, we'd need to queue up messages, waiting for the response to the particular object operation; is this what you meant by "deeply synchronize them"?)
Yes, content JS should expect that calls made by the plugin can happen at any time.

When you say "queue up messages" I'm not sure what you mean. The state we're in is:

BROWSER PROCESS: running event loop, content JS calling into plugin
PLUGIN PROCESS: running event loop, plugin calling into content

In this case you have to interleave the messages in one direction or the other, and in our case the policy is "the plugin side always wins". The only way to fix this is not to run the plugin event loop while content JS is running in the browser, which would destroy performance.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #13)
> In this case you have to interleave the messages in one direction or the
> other, and in our case the policy is "the plugin side always wins". The only
> way to fix this is not to run the plugin event loop while content JS is
> running in the browser, which would destroy performance.

I am not talking here about the outermost browser event loop, I'm talking about a event loop that is spinning in response to a get-element operation on a plugin object (see the attached subset of the C++ stack).  What I meant by 'queue up messages' was that, when NPP_GetValue (in this case) is waiting for a particular response (so that it can return), it defers calls to _evaluate.  I don't know if this breaks some other invariant and I understand this might be necessary, I am just trying to make sure that you understand the particular situation I'm seeing.
Ok, bsmedberg explained on irc: _evaluate has to return a value for the plugin to progress enough to service the request NPP_GetValue is waiting for.  So this is totally by-design and grooveshark needs to fix the problem on their end (see comment 8 for proposal).
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → INVALID
Thanks, everybody, for tracking this down! I tweeted at @GroovesharkHelp, hopefully they'll fix it.
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: