Closed Bug 550322 Opened 14 years ago Closed 14 years ago

[OOPP] Hulu full screen deadlock protection doesn't trap all cases

Categories

(Core Graveyard :: Plug-ins, defect)

x86
Windows 7
defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: jimm, Assigned: jimm)

References

Details

Attachments

(5 files, 5 obsolete files)

This is a spin off from bug 547142. In that bug, we fixed a very common lock up problem where:

1) child send sync wm_activate message to parent
2) parent handles wm_activate via defwndproc
3) defwndproc processes wm_activate, generating ime event in parent
4) ime events end up in plugininstanceparent -> ipc -> child
child deadlocks

This wasn't general enough. This can also happen:

1) child sends sync wm_activate to parent via ShowWindow
2) parent handles wm_activate via defwndproc
3) defwndproc processes wm_activate, generating a sync event back to child
4) child handles event, lands in an ipc related call back to parent
parent deadlocks

One way to fix this is to use ReplyMessage in parent when we receive wm_activate, freeing the child before step 3 occurs. We did that in the first patch, but sometimes too late. If the child ends up making ipc related calls while processing sync messages from parent, we get another deadlock.   

I think we can back out the patch in bug 547142, and replace it with the patch I'm posting here.

There's a couple open questions here - this patch can either implement a check of the window loosing focus, so that ReplyMessage is only used in specific cases. (There might be additional problems though with other plugins.) Or we can apply this generally and deal with any side effects related to calling ReplyMessage generically.

Another possible fix - hook ShowWindow in child, and send it over ipc to parent. I've put together a basic hook routine that does this but there are some issues to work out related to single hook / multiple instances.
Attached patch replymessage patch (obsolete) — Splinter Review
Assignee: nobody → jmathies
Blocks: LorentzBeta1
Attached patch combined patchSplinter Review
Test patch that removed the original ime replymessage work and replaces it with the more generic use of replymessage for wm_activate.

try builds with this applied -
http://build.mozilla.org/tryserver-builds/jmathies@mozilla.com-try-4734c370776a
Attachment #430423 - Attachment is obsolete: true
After anywhere from to 30-40 time bouncing in/out of full-screen, I cannot repo the hang using the try-build. 

Others may want to also weigh in...
Works very nicely with the try build.
It looks like it works for me too.  I have not been able to reproduce yet.  Win7
Attachment #430639 - Flags: review?(bent.mozilla)
Comment on attachment 430639 [details] [diff] [review]
combined patch

Cool, let's do it!
Attachment #430639 - Flags: review?(bent.mozilla) → review+
http://hg.mozilla.org/mozilla-central/rev/8ca69988390f
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Verifying fixed using build:

20100306111944 28e3a378126d  <- latest hourly with fix.
Status: RESOLVED → VERIFIED
Jim, I think I found another bug.. if you go to Fullscreen on Hulu, click not on the video controls to pause the video, but somewhere in the upperleft corner of the fullscreen video itself, it will try and exit full screen and completely lockup.
(In reply to comment #9)
> Jim, I think I found another bug.. if you go to Fullscreen on Hulu, click not
> on the video controls to pause the video, but somewhere in the upperleft corner
> of the fullscreen video itself, it will try and exit full screen and completely
> lockup.

I cannot repo this using the latest hourly.  No matter where I click in the upper left corner, I get the video paused: with a big 'play' arrow in the middle of the screen.  I cannot detect, at least with my old eyes and glitch as if its trying to 'exit' full-screen mode.
(In reply to comment #10)
> (In reply to comment #9)
> > Jim, I think I found another bug.. if you go to Fullscreen on Hulu, click not
> > on the video controls to pause the video, but somewhere in the upperleft corner
> > of the fullscreen video itself, it will try and exit full screen and completely
> > lockup.
> 
> I cannot repo this using the latest hourly.  No matter where I click in the
> upper left corner, I get the video paused: with a big 'play' arrow in the
> middle of the screen.  I cannot detect, at least with my old eyes and glitch as
> if its trying to 'exit' full-screen mode.

For what we've been testing (User selected Esc or Video Control first), that is fixed, this is slightly different STR.  I can reliably trigger the event during the advertisement video (shortly after it starts to play).  

Normally Hulu's advertising in non full screen mode, clicking on the video area, but not controls, opens a webpage (in a new tab).  In fullscreen mode, clicking on the advertisement (again not the controls) will shut down fullscreen mode and probably attempt to open a webpage.  This is doesn't do, but I get the complete lockup (which closing the browser mentions slow java script).  

If I wait till after the normal video loads (after the advertisement), I get the expected behavior and I see the same thing you do.
I found another hang as well. If you have a fullscreen video running on one screen and you click on a Firefox window displayed on another screen, then the Flash video will exit Fullscreen mode but result in a hang. Clicking on a window other than Firefox or on the Windows background will not result in a hang upon exiting fullscreen.

This is probably related to what Dale is experiencing because in both cases it is the user's interaction with Firefox itself that causes the fullscreen video to lose focus and quit. In Dale's case, his interaction (clicking on the ad) causes a new tab to appear, which takes away focus. In my case, I click on another Firefox window which also takes away focus. Clicking on something that takes away focus but is not part of Firefox doesn't cause a hang.
Attached file Firefox.exe stack dump for new hang (obsolete) —
This is the stack dump for firefox.exe relating to the hang that I posted above. Could also be related to Dale's problem, not sure though.
This is the stack for the flash plugin process relating to the new hang problem.
And sorry to burst your bubble once again, but with the latest Flash Beta (Beta3) Hulu vids still hang upon exiting fullscreen the normal way (pressing esc). Stack dumps attached below.
(In reply to comment #11)
> Normally Hulu's advertising in non full screen mode, clicking on the video
> area, but not controls, opens a webpage (in a new tab).  In fullscreen mode,
> clicking on the advertisement (again not the controls) will shut down
> fullscreen mode and probably attempt to open a webpage.  This is doesn't do,
> but I get the complete lockup (which closing the browser mentions slow java
> script).  

Confirmed. Spun off as bug 550784.
(In reply to comment #17)
> Created an attachment (id=430908) [details]
> Mozilla runtime stack for Hulu hang (w/ Flash Beta)

(In reply to comment #16)
> Created an attachment (id=430907) [details]
> Firefox stack for Hulu hang (w/ Flash beta)

Hey Benjamin, what are you using to get these? Looks like your symbol config isn't set up correctly making these stacks worthless, which is unfortunate!
Attachment #430907 - Attachment mime type: application/octet-stream → text/plain
Attachment #430908 - Attachment mime type: application/octet-stream → text/plain
(In reply to comment #19)
> (In reply to comment #17)
> > Created an attachment (id=430908) [details] [details]
> > Mozilla runtime stack for Hulu hang (w/ Flash Beta)
> 
> (In reply to comment #16)
> > Created an attachment (id=430907) [details] [details]
> > Firefox stack for Hulu hang (w/ Flash beta)
> 
> Hey Benjamin, what are you using to get these? Looks like your symbol config
> isn't set up correctly making these stacks worthless, which is unfortunate!

Looks like windbg dumps? Check your symbols config, we have a wiki page on how to set it up - 

https://developer.mozilla.org/en/Using_the_Mozilla_symbol_server
Yes they are WinDbg dumps. It seems that I can not call .reload /f after I attach WinDbg to processes. I can only type .reload /f if I use the open executable option, which never starts up firefox properly for me. I did a .reload /f with the executable, then I quit the debug session, launched firefox, and attached the debugger to firefox.exe and mozilla-runtime.exe. I am not sure if this is correct. How do I even check if the stacks are worthless?

I've posted updated versions of the new crash stacks(not with Flash beta). Please tell me if they're useless or not. I'll post the Flash beta stacks after you've confirmed that these stacks have the correct symbols loaded.
Attachment #430891 - Attachment is obsolete: true
Attachment #430892 - Attachment is obsolete: true
(In reply to comment #21)
> Yes they are WinDbg dumps. It seems that I can not call .reload /f after I
> attach WinDbg to processes. I can only type .reload /f if I use the open
> executable option, which never starts up firefox properly for me. I did a
> .reload /f with the executable, then I quit the debug session, launched
> firefox, and attached the debugger to firefox.exe and mozilla-runtime.exe. I am
> not sure if this is correct. How do I even check if the stacks are worthless?

If you see warnings like this in fx code - 

WARNING: Stack unwind information not available. Following frames may be wrong.

something isn't loaded properly. For example: 

0030d188 759443f5 USER32!CallWindowProcAorW+0x99
0030d1a8 644f0815 USER32!CallWindowProcW+0x1b
WARNING: Stack unwind information not available. Following frames may be wrong.
0030d1e8 759486ef xul!gfxFont::RunMetrics::CombineWith+0x3f5
0030d214 75948876 USER32!InternalCallWinProc+0x23
0030d28c 759470f4 USER32!UserCallWinProcCheckWow+0x14b
0030d2e8 7594738f USER32!DispatchClientMessage+0xda
0030d310 7731642e USER32!__fnDWORD+0x24
0030d33c 7594914b ntdll!KiUserCallbackDispatcher+0x2e
0030d340 75949180 USER32!NtUserPeekMessage+0xc
0030d368 759492a9 USER32!_PeekMessage+0x73
0030d394 645450f5 USER32!PeekMessageW+0xfb
00000000 00000000 xul!NS_CycleCollectorSuspect2_P+0x925

'xul' is the main mozilla platform library.

(In reply to comment #23)
> Created an attachment (id=430986) [details]
> mozilla-runtime.exe stack dump for new hang v2

What ever you did, these new stacks look good, your child is wrapped up in eval, which I filed as bug 550784. Thanks!
Attachment #430986 - Attachment mime type: application/octet-stream → text/plain
Pfft man this WinDbg is really not intuitive to use. Anyways I am going to post up the Flash beta stacks now. I am not sure if its considered another bug or not, so you'll have to decide whether you want to spin it off.
Well I'll be damned. The flash beta bug no longer occurs with today's build.
Nvm finally caught the stupid Flash beta hang. Hope the stack is correct
Attachment #430907 - Attachment is obsolete: true
(In reply to comment #18)
> (In reply to comment #11)
> > Normally Hulu's advertising in non full screen mode, clicking on the video
> > area, but not controls, opens a webpage (in a new tab).  In fullscreen mode,
> > clicking on the advertisement (again not the controls) will shut down
> > fullscreen mode and probably attempt to open a webpage.  This is doesn't do,
> > but I get the complete lockup (which closing the browser mentions slow java
> > script).  
> 
> Confirmed. Spun off as bug 550784.

Thanks Jim!  I was kinda thinking I needed to do that since its slightly different.  At least we have a cookie trail for these even if they are tackled as new bugs.

(In reply to comment #25)
> Pfft man this WinDbg is really not intuitive to use. Anyways I am going to post
> up the Flash beta stacks now. I am not sure if its considered another bug or
> not, so you'll have to decide whether you want to spin it off.

Your not alone, as that's been my experience, I haven't been able to figure it all out yet. :)
with open executable, don't forget to check debug child processes also ....
@timeless
Yeah I tried that option. I still couldn't not start the browser. Kept giving me a Breakpoint 0 status and a nop.
Flags: in-litmus?
Filed bug 551627 for dual monitor issue.
Depends on: 626491
No longer depends on: 626491
Depends on: 537325
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.