Closed Bug 554262 Opened 10 years ago Closed 10 years ago

[OOPP] Silverlight context menu hangs the browser

Categories

(Firefox :: General, defect)

x86
Windows 7
defect
Not set

Tracking

()

RESOLVED FIXED
Tracking Status
status1.9.2 --- .4-fixed

People

(Reporter: mozbugz, Assigned: cjones)

References

()

Details

(Whiteboard: [testcase-wanted][fixed-lorentz])

Attachments

(2 files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.3a4pre) Gecko/20100322 Minefield/3.7a4pre (.NET CLR 3.5.30729)
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.3a4pre) Gecko/20100322 Minefield/3.7a4pre (.NET CLR 3.5.30729)

if you try to bring up the context menu on Silverlight, it currently hangs the browser and doesn't go away and I got our famous aero peek progress circles also with Win 7.  I had to kill Mozilla-Runtime to get back the browser.



Reproducible: Didn't try




Be interesting to find out if bug 547359 affects this.  Also see bug 545149.
Blocks: OOPP
Version: unspecified → Trunk
Found this testing Hourly: 20100322224355
http://hg.mozilla.org/mozilla-central/rev/7bd6238a54ac
Attached file stacks
Hmm, this isn't a windows ui message hang, it's something else. I'm only able to reproduce this on this win7 phone site. Not sure if this needs to hold up this weeks planned beta.
Interesting, both spin loop and the wait for notify loop in child are running.
The sequence of events here is:

parent main thread:
  npp_handleevent (mouseup)

child main thread
  npp_handleevent
  -> TrackPopupMenu: tells parent to spininternaleventloop
  -> NPN_GetProperty (RPC outcall)

parent I/O thread:
  -> receives the spininternalevenntloop notification
  -> receives the NPN_GetProperty message, queues it, posts gEventLoopMessage

parent main thread:
  starts processing nested events, WM_PAINT
  -> call NPP_HandleEvent(WM_PAINT)
  ** paint wins the race, the pending NPN_GetProperty message is not processed
  -> Inside RPCChannel::WaitForNotify we SpinInternalEventLoop again. This is a bug. This SpinInternalEventLoop slurps up the gEventLoopMessage

>	mozilla::ipc::RPCChannel::SpinInternalEventLoop() Line 645	C++
 	mozilla::ipc::RPCChannel::WaitForNotify() Line 819	C++
 	mozilla::ipc::RPCChannel::Call(msg=0x07d4f540, reply=0x0015c228) Line 203	C++
 	mozilla::plugins::PPluginInstanceParent::CallPaint(event={...}, handled=0x0015c268) Line 360	C++
 	mozilla::plugins::PluginInstanceParent::NPP_HandleEvent(event=0x0015c358) Line 546	C++
 	mozilla::plugins::PluginModuleParent::NPP_HandleEvent(instance=0x04961d8c, event=0x0015c358) Line 487	C++
 	nsNPAPIPluginInstance::HandleEvent(event=0x0015c358, handled=0x0015c36c) Line 1378	C++
 	nsPluginInstanceOwner::Paint(aDirty={...}, aDC=0x4f010a36) Line 4852	C++
 	nsObjectFrame::PaintPlugin(aRenderingContext={...}, aDirtyRect={...}, aPluginRect={...}) Line 1762	C++
 	nsDisplayPlugin::Paint(aBuilder=0x0015c850, aCtx=0x07d4b8c0) Line 1152	C++
 	nsDisplayList::PaintThebesLayers(aBuilder=0x0015c850, aLayers={...}) Line 851	C++
 	nsDisplayList::Paint(aBuilder=0x0015c850, aCtx=0x00000000, aFlags=0x00000001) Line 792	C++
 	nsLayoutUtils::PaintFrame(aRenderingContext=0x00000000, aFrame=0x00000000, aDirtyRegion={...}, aBackstop=0xffffffff, aFlags=0x00000004) Line 1241	C++
 	PresShell::Paint(aDisplayRoot=, aViewToPaint=, aWidgetToPaint=, aDirtyRegion=, aPaintDefaultBackground=) Line 5679	C++
 	_cairo_win32_save_initial_clip(hdc=0x0015cb88, surface=0x0000003c) Line 2281	C
 	nsRegion::Or(aRegion={...}, aRect={...}) Line 828	C++
 	nsViewManager::Refresh(aView=0x051bb700, aWidget=0x04d51e60, aRegion={...}, aUpdateFlags=0x0015cc5c) Line 428	C++
 	nsViewManager::DispatchEvent(aEvent=0x0015cd38, aView=0x051bb700, aStatus=0x0015cc8c) Line 877	C++
 	HandleEvent(aEvent=0x00000001) Line 161	C++
 	nsWindow::DispatchEvent(event=0x0015cd38, aStatus=nsEventStatus_eIgnore) Line 3134	C++
 	nsWindow::DispatchWindowEvent(event=0x0015cd38, aStatus=nsEventStatus_eIgnore) Line 3163	C++
 	nsWindow::OnPaint(aDC=0x00000000) Line 535	C++
 	nsWindow::ProcessMessage(msg=0x0000000f, wParam=0x00000000, lParam=0x00000000, aRetValue=0x0015cfd0) Line 4166	C++
 	nsWindow::WindowProc(hWnd=0x00000001, msg=0x0000000f, wParam=0x00000000, lParam=0x00000000) Line 3865	C++
 	_InternalCallWinProc@20()	
 	_UserCallWinProcCheckWow@32()	
 	_DispatchClientMessage@20()	
 	___fnDWORD@4()	
 	_KiUserCallbackDispatcher@12()	
 	_NtUserDispatchMessage@4()	
 	_DispatchMessageWorker@8()	
 	_DispatchMessageW@4()	
 	mozilla::ipc::RPCChannel::SpinInternalEventLoop() Line 641	C++
 	mozilla::ipc::RPCChannel::WaitForNotify() Line 819	C++
 	mozilla::ipc::RPCChannel::Call(msg=0x06bf9cc0, reply=0x0015d2f0) Line 203	C++
 	mozilla::plugins::PPluginInstanceParent::CallNPP_HandleEvent(event={...}, handled=0x0015d330) Line 328	C++
 	mozilla::plugins::PluginInstanceParent::NPP_HandleEvent(event=0x0015d858) Line 604	C++
 	mozilla::plugins::PluginModuleParent::NPP_HandleEvent(instance=0x04961d8c, event=0x0015d858) Line 487	C++
 	nsNPAPIPluginInstance::HandleEvent(event=0x0015d858, handled=0x0015d430) Line 1378	C++
 	nsPluginInstanceOwner::ProcessEvent(anEvent={...}) Line 4499	C++
 	nsPluginInstanceOwner::DispatchMouseToPlugin(aMouseEvent=0x07d33a68) Line 3893	C++
 	nsPluginInstanceOwner::MouseUp(aMouseEvent=0x07d33a60) Line 3875	C++

When control exits back out to the *outer* mozilla::ipc::RPCChannel::SpinInternalEventLoop, it doesn't break out to process the pending NPP_GetProperty RPC call and hangs.

This should be partly fixed by bug 553606, but jimm says it doesn't completely solve the problem.
Blocks: LorentzBeta1
Status: UNCONFIRMED → NEW
Depends on: 553606
Ever confirmed: true
If we don't have a pending event loop message that was posted when getproperty call was deferred, I think this hang can happen even if spin loop is limited to single depth. Couldn't the same thing happen if:

child main thread
  npp_handleevent mouse up
  -> TrackPopupMenu: tells parent to spininternaleventloop
  -> NPN_GetProperty (RPC outcall)

parent I/O thread:
  -> receives the spininternalevenntloop notification
  -> receives the NPN_GetProperty message, defers it, posts gEventLoopMessage

parent main thread:
  -> WaitForNotify on mouse up
  ** clears deferred messages, drops into spininternaleventloop
  ** starts processing nested events, WM_PAINT
  -> call NPP_HandleEvent(WM_PAINT)
  ** paint wins the race, the deferred NPN_GetProperty message is not processed
  -> WaitForNotify on paint
  -> paint finishes, stack pops
  ** we're back in spininternaleventloop, no gEventLoopMessage will come.

If posted gEventLoopMessage events matched messages to process, this wouldn't be an issue. But they don't. 

I'm wondering if a simple check of mDeferred in spininternaleventloop would fix this.
It took me a while to understand this problem, but I think I may have found a relatively simple solution.  Per comment 4, the bug is

  call NPP_HandleEvent
      WaitForNotify
          Peek(gSpinEventLoop)
          SpinInternalLoop
              WaitMessage()
              DispatchMessage(WM_PAINT)
              |   call Paint
              |   |   WaitForNotify
              |   |   |   SpinInternalLoop
              |   |   |   |   Peek(gEventMessage)  [<---GetProperty]
              |   |   |   |   return true
              |   |   |   return true
              |   |   [RPC race resolution, child loses]
              |   |   defer GetProperty
              |   |   WaitForNotify
              |   |   |   SpinInternalLoop
              |   |   |   |   Peek(gEventMessage)  [<---Reply_Paint]
              |   |   |   |   return true
              |   |   |   return true
              |   |   [got Paint reply]
              |   |   return true
              |   return
!!!BUG!!!====>WaitMessage()

Browser stays responsive, but plugin is blocked forever, and browser is stuck in modal loop forever.  More precisely, we never break out of SpinInternalLoop() for the deferred in-call we received.

However, there's a strong invariant here: SpinInternalEventLoop() always has a Call() below it on the stack, stuck in a while(!recvd reply) loop.  So, I think 

diff --git a/ipc/glue/WindowsMessageLoop.cpp b/ipc/glue/WindowsMessageLoop.cpp
--- a/ipc/glue/WindowsMessageLoop.cpp
+++ b/ipc/glue/WindowsMessageLoop.cpp
@@ -632,16 +632,17 @@ RPCChannel::SpinInternalEventLoop()
       } else {
           TranslateMessage(&msg);
           DispatchMessageW(&msg);
+          return true;
       }
     } else {

would fix WindowsMessageLoop.  You would break out of the first spinloop and back into RPCChannel::Call(NPP_HandleEvent), where the deferred message should be processed.  After that's done, you'd go back into WaitForNotify->SpinInternalLoop.  (Except that we'd blow up trying to pop an empty queue.  But that's just an RPCChannel bug.  I'll post a patch for both here in a tick.)
I found a race condition with this solution in which we can process the reply to NPP_HandleEvent before the deferred GetProperty.  That would be mucho bad, looking for a fix.
(In reply to comment #7)
> I found a race condition with this solution in which we can process the reply
> to NPP_HandleEvent before the deferred GetProperty.  That would be mucho bad,
> looking for a fix.

Nm, nonsensical ramblings of a person starting to become sleep deprived.  The plugin must receive a reply to GetProperty before it can reply to NPP_HandleEvent.  So I think this solution will work.  (It can do any random garbage before or after breaking the nested loop, but then we're back to the "base case".)
bent or jimm, would it be possible to cook up a windows-only IPDL/C++ testcase for this?
Whiteboard: [testcase-wanted]
Comment on attachment 434492 [details] [diff] [review]
Don't leave in-calls deferred during an iteration of the windows event loop in limbo

   DispatchMessageW(&msg);
+  return true;

should be 

   DispatchMessageW(&msg);
+  ExitSpinLoop();
+  return false;

after the patch in bug 553606 lands.
Attachment #434492 - Flags: review?(jmathies) → review+
(In reply to comment #11)
> (From update of attachment 434492 [details] [diff] [review])
>    DispatchMessageW(&msg);
> +  return true;
> 
> should be 
> 
>    DispatchMessageW(&msg);
> +  ExitSpinLoop();
> +  return false;
> 
> after the patch in bug 553606 lands.

The return code from SpinInternalEventLoop() isn't checked.  Is that a bug?
Comment on attachment 434492 [details] [diff] [review]
Don't leave in-calls deferred during an iteration of the windows event loop in limbo

Looks sound to me.
Attachment #434492 - Flags: review?(bent.mozilla) → review+
http://hg.mozilla.org/mozilla-central/rev/0c41bf595db4
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Using latest hourly, still hangs browser.. and once the Silverlight tag appears, it won't dismiss...eventually the plugin crashed.

Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.3a4pre) Gecko/20100324 Minefield/3.7a4pre Firefox/3.6 ID:20100324152307
(In reply to comment #15)
> Using latest hourly, still hangs browser.. and once the Silverlight tag
> appears, it won't dismiss...eventually the plugin crashed.
> 
> Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.3a4pre) Gecko/20100324
> Minefield/3.7a4pre Firefox/3.6 ID:20100324152307

I clicked on the context menu, and it opened up the menu, and I could dismiss it. I'm using Win 7 32bit. Sometimes Silverlight is not recognize the clicks (using wireless MS Arc mouse with Intellipoint.) but definitely didn't hang the browser or crash the plugin on me using version  3.0.50106.0.
http://hg.mozilla.org/projects/firefox-lorentz/rev/8c9eeda8a5dc
Flags: in-testsuite?
Flags: in-litmus?
Whiteboard: [testcase-wanted] → [testcase-wanted][fixed-lorentz]
Blanket approval for Lorentz merge to mozilla-1.9.2
a=beltzner for 1.9.2.4 - please make sure to mark status1.9.2:.4-fixed
https://litmus.mozilla.org/show_test.cgi?id=12012 added to Litmus in the BFTs.
Flags: in-litmus? → in-litmus+
Assignee: nobody → jones.chris.g
You need to log in before you can comment on or make changes to this bug.