Closed Bug 1185639 Opened 9 years ago Closed 9 years ago

Windows IPC message loop needs more frequent message pumping

Categories

(Core :: IPC, defect)

Unspecified
Windows
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla42
Tracking Status
firefox41 + verified
firefox42 --- fixed
relnote-firefox --- 41+

People

(Reporter: bugzilla, Assigned: bugzilla)

References

Details

Attachments

(1 file, 3 obsolete files)

Right now our Windows IPC loop only pumps messages when the loop is waiting for an incoming message. This is insufficient in certain circumstances. For example, if there are multiple async messages in the pending queue, we won't pump until the queue has been drained. We should also be pumping message in between consecutive event dispatches.
Attached patch Patch (obsolete) — Splinter Review
https://treeherder.mozilla.org/#/jobs?repo=try&revision=775633997ffa IPDL test suite looks good too. Right now I've added some RAII wrappers around neutering. I think in the long term this code could use some cleanup and refactoring but I want to minimize its footprint right now. Too bad bent isn't with us anymore to review this. Jim, are you comfortable reviewing Window neutering code?
Attachment #8637520 - Flags: review?(jmathies)
(Also I think this might help with some of your mysterious e10s hangs in user32 APIs)
Attached patch Patch (obsolete) — Splinter Review
Whoops, uploaded the wrong file.
Attachment #8637520 - Attachment is obsolete: true
Attachment #8637520 - Flags: review?(jmathies)
Attachment #8637522 - Flags: review?(jmathies)
Comment on attachment 8637522 [details] [diff] [review] Patch Review of attachment 8637522 [details] [diff] [review]: ----------------------------------------------------------------- ::: ipc/glue/MessageChannel.cpp @@ +1034,5 @@ > return false; > } > > +#ifdef OS_WIN > + neuteredRgn.PumpOnce(); comment me ::: ipc/glue/Neutering.h @@ +4,5 @@ > + * License, v. 2.0. If a copy of the MPL was not distributed with this > + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */ > + > +#ifndef mozilla_ipc_Neutering_h > +#define mozilla_ipc_Neutering_h new file / new class(s), lets add commenting in here for things like public methods, and the classes. What's a NeuteredWindowRegion? What the heck is a DeneuteredWindowRegion? :) ::: ipc/glue/WindowsMessageLoop.cpp @@ +1115,2 @@ > } > + DeneuteredWindowRegion deneuteredRgn; This one is confusing to me. If gWindowHook is set then neutering is active and mReneuter in DeneuteredWindowRegion is true. When we create this object we turn neutering off, and when we destroy it we will turn neutering behavior back on?
(In reply to Jim Mathies [:jimm] from comment #4) > Comment on attachment 8637522 [details] [diff] [review] > Patch > > Review of attachment 8637522 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: ipc/glue/MessageChannel.cpp > @@ +1034,5 @@ > > return false; > > } > > > > +#ifdef OS_WIN > > + neuteredRgn.PumpOnce(); > > comment me Will do > > ::: ipc/glue/Neutering.h > @@ +4,5 @@ > > + * License, v. 2.0. If a copy of the MPL was not distributed with this > > + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */ > > + > > +#ifndef mozilla_ipc_Neutering_h > > +#define mozilla_ipc_Neutering_h > > new file / new class(s), lets add commenting in here for things like public > methods, and the classes. What's a NeuteredWindowRegion? What the heck is a > DeneuteredWindowRegion? :) OK > > ::: ipc/glue/WindowsMessageLoop.cpp > @@ +1115,2 @@ > > } > > + DeneuteredWindowRegion deneuteredRgn; > > This one is confusing to me. If gWindowHook is set then neutering is active > and mReneuter in DeneuteredWindowRegion is true. > > When we create this object we turn neutering off, and when we destroy it we > will turn neutering behavior back on? Yeah, DeneuteredWindowRegion is supposed to be analogous to MutexAutoUnlock; we're in a region that already has neutering and we temporarily need to suspend it for the remaining duration of the block.
(In reply to Aaron Klotz [:aklotz] (please use needinfo) from comment #2) > (Also I think this might help with some of your mysterious e10s hangs in > user32 APIs) why is that?
(In reply to Aaron Klotz [:aklotz] (please use needinfo) from comment #5) > Yeah, DeneuteredWindowRegion is supposed to be analogous to MutexAutoUnlock; > we're in a region that already has neutering and we temporarily need to > suspend it for the remaining duration of the block. Ok so we set up the neutering much sooner now, at the top of MessageChannel::Send. I'm not sure what the side effects of that will be. For example there's a send on mLink that gets wrapped by this now. We should be on the lookout for regressions and not be aggressive in uplifting this work.
Comment on attachment 8637522 [details] [diff] [review] Patch Review of attachment 8637522 [details] [diff] [review]: ----------------------------------------------------------------- ::: ipc/glue/WindowsMessageLoop.cpp @@ +916,5 @@ > + if (gCOMWindow && ::PeekMessageW(&msg, gCOMWindow, 0, 0, PM_REMOVE)) { > + ::TranslateMessage(&msg); > + ::DispatchMessageW(&msg); > + } > + ::PeekMessageW(&msg, nullptr, 0, 0, PM_NOREMOVE); What's this PeekMessage call doing? Expunging non-queued messages?
Attachment #8637522 - Flags: review?(jmathies) → review+
(In reply to Jim Mathies [:jimm] from comment #6) > (In reply to Aaron Klotz [:aklotz] (please use needinfo) from comment #2) > > (Also I think this might help with some of your mysterious e10s hangs in > > user32 APIs) > > why is that? If one side of a connection has Recv* or Answer* handlers that call user32 APIs, and the other side's MessageChannel has multiple IPC messages pending, messages won't be pumped without this patch. (In reply to Jim Mathies [:jimm] from comment #8) > Comment on attachment 8637522 [details] [diff] [review] > Patch > > Review of attachment 8637522 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: ipc/glue/WindowsMessageLoop.cpp > @@ +916,5 @@ > > + if (gCOMWindow && ::PeekMessageW(&msg, gCOMWindow, 0, 0, PM_REMOVE)) { > > + ::TranslateMessage(&msg); > > + ::DispatchMessageW(&msg); > > + } > > + ::PeekMessageW(&msg, nullptr, 0, 0, PM_NOREMOVE); > > What's this PeekMessage call doing? Expunging non-queued messages? Yes. I've added comments for that.
Attached patch Patch (r2) (obsolete) — Splinter Review
Added comments. Carrying forward r+.
Attachment #8637522 - Attachment is obsolete: true
Attachment #8639391 - Flags: review+
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla42
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla42 → ---
Attached patch Patch (r3)Splinter Review
Whoops, something made it into patch r2 that wasn't supposed to be there. My bad.
Attachment #8639391 - Attachment is obsolete: true
Attachment #8639977 - Flags: review+
Status: REOPENED → RESOLVED
Closed: 9 years ago9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla42
Depends on: 1189709
This appears to have caused regression 1189709
Comment on attachment 8639977 [details] [diff] [review] Patch (r3) Approval Request Comment [Feature/regressing bug #]: NPAPI plugins [User impact if declined]: Hangs, see bug 1207766 and bug 1209235 [Describe test coverage new/current, TreeHerder]: Plugin test suite [Risks and why]: Low at this point, this has ridden the trains on 42 to beta. Note that the patch in bug 1189709 must be landed alongside this one. [String/UUID change made/needed]: None
Attachment #8639977 - Flags: approval-mozilla-release?
Depends on: 1209464
Comment on attachment 8639977 [details] [diff] [review] Patch (r3) Approving for uplift to moz-release to be included in 41.0.1 build2. This patch is needed for fix the recent flash hangs (bug 1207766).
Attachment #8639977 - Flags: approval-mozilla-release? → approval-mozilla-release+
During our Firefox 41.0.1 build 2 smoke testing we ran a number of plugin tests that included Flash across platforms (Windows 10 32bit, Windows 7 64bit, Mac OS X 10.10.5 and Ubuntu 14.04 32-bit) and we can say that we did not encounter any hangs or crashes. Marking this verified fixed on FF 41.
Added to the release notes with "Fix potential hangs with Flash plugins" as wording.
After some of our Windows computers updated to FireFox 40.0.1 we noticed that FireFox has started to hang when accessing BigBlueButton. We are able to reproduce the hang with the following configuration Windows 10 FireFox 40.0.1 Flash 19.0.0.185 Steps to reproduce on demo.bigbluebutton.org (our public demo server) 1. Open http://demo.bigbluebutton.org/demo/demo2.jsp 2. Enter any name 3. Click 'Join' 4. Wait until the BigBlueButton client loads. 5. Click 'Listen Only' ** FireFox may hang at this point ** 6. Refresh page go to Step 4 You might have to repeat steps (4)-(6) a few times to get FireFox to hang. I've uploaded a video to YouTube that demonstrates https://youtu.be/NzryyxQyvHo Most of the times when it hangs, FireFox locks up for good and I have to close it from the task bar. This bug is bad for us as it potentially prevents a large number of students and teachers from using BigBlueButton with the latest version of FireFox. Regards,... Fred P.S. BigBlueButton is an open source web conferencing system for on-line learning. We are one of the Mozilla WebFWD graduates. See: https://webfwd.org/portfolio/.
The correct configuration to comment 23 is Windows 10 FireFox 41.0.1 Flash 19.0.0.185 We just had our computers update to FireFox 41.0.1.
A bit more information. We downgraded to FireFox 41.0.0 and tested on same computer that was hanging with 41.0.1. After 20 minutes of aggressive testing I couldn't get FireFox to hang when joining a BigBlueButton session. Upgraded again to 41.0.1 and it hangs almost immediately. BigBlueButton is open source and localized into 25+ languages. The project has been under heavy development for over seven years. We have a very large community (tens of thousands) of teachers and students world-wide that use BigBlueButton every day for on-line classes. This is going to be painful for us as many of them use FireFox. FireFox 41.0.1 hangs -- hard. Is there anyway to back out this change or quickly triage? Let us know if there is anything we can do to help. Regards,... Fred
Let's take this discussion to bug 1210665.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: