Closed Bug 981951 Opened 10 years ago Closed 10 years ago

We should remove following char message immediately after we find it at handling keydown message

Categories

(Core :: Widget: Win32, defect)

x86_64
Windows 8.1
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla30
Tracking Status
firefox28 --- verified
firefox29 --- verified
firefox30 --- verified
b2g-v1.3 --- fixed
b2g-v1.3T --- fixed
b2g-v1.4 --- fixed

People

(Reporter: masayuki, Assigned: masayuki)

References

Details

(Keywords: crash, topcrash-metro)

Crash Data

The crash reason might be another thread or process post a key message between PeekMessage(PM_NOREMOVE) and PeekMessage(PM_REMOVE). If so, we may be confused by such unexpected message order.

We should remove found char message as soon as possible when we find a following char message and collect more details of the queue status when we call MOZ_CRASH().
This was fixed by attachment 8369857 [details] [diff] [review]. And the crash signature is changed to [@ mozilla::widget::NativeKey::GetFollowingCharMessage(tagMSG&)].

(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #25)
> Created attachment 8369457 [details] [diff] [review]
> Append unexpected message log to crash report
> 
> https://tbpl.mozilla.org/?tree=Try&usebuildbot=1&rev=d8399c123969
> 
> I hope that this would give us more hints.
> 
> I'm thinking the possibilities of this bug are:
> 
> Another thread/process post or send message to our main thread. Then, input
> message gives lower priority than it. If the found char message comes from
> device and somebody post/send a key message between IsFollowingCharMessage()
> and RemoveFollowingCharMessage(), this may cause this bug.
> 
> This sounds like not realistic scenario for me. However, if application such
> as a11y tools post multiple key messages to us sequentially, our thread may
> peek WM_(SYS)KEYDOWN message during the sequence. I guess that this might be
> possible. (I assume that WM_(SYS)(DEAD)CHAR messages are always queued as
> input message from a device.)
> 
> Anyway, I want to see more detail of what happens as far as possible.

(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #29)
> Created attachment 8369857 [details] [diff] [review]
> Append unexpected message log to crash report (landed on m-c, aurora and
> beta)

(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #30)
> https://hg.mozilla.org/integration/mozilla-inbound/rev/d8b63edf935b

(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #31)
> https://hg.mozilla.org/mozilla-central/rev/d8b63edf935b
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
Keywords: verifyme
Oops, attachment 8372896 [details] [diff] [review] actually fixes this bug.

(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #35)
> https://crash-stats.mozilla.com/report/index/a67fe373-0036-4d9f-83d4-
> db3ab2140206
> Handling message: WM_KEYDOWN (0x00000100), wParam: 0x00000020, lParam:
> 0x00390001, InSendMessageEx()=ISMEX_NOSEND,
> Next key message: Unknown (0x00000000), wParam: 0x00000000, lParam:
> 0x00000000
> 
> https://crash-stats.mozilla.com/report/index/734ad868-1822-4a73-b284-
> b8ad42140206
> Handling message: WM_KEYDOWN (0x00000100), wParam: 0x00000020, lParam:
> 0x00390001, InSendMessageEx()=ISMEX_NOSEND,
> Next key message: Unknown (0x00000000), wParam: 0x00000000, lParam:
> 0x00000000 
> 
> https://crash-stats.mozilla.com/report/index/1118149d-3df8-4101-a5b7-
> 02e372140206
> Handling message: WM_KEYDOWN (0x00000100), wParam: 0x00000020, lParam:
> 0x00390001, InSendMessageEx()=ISMEX_NOSEND, Next key message: Unknown
> (0x00000000), wParam: 0x00000000, lParam: 0x00000000 
> 
> These logs are odd. PeekMessage() succeeded but the got message is WM_NULL!

(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #36)
> Odd... I cannot emulate the situation...
> 
> https://tbpl.mozilla.org/?tree=Try&usebuildbot=1&rev=31bbff97c227

(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #37)
> Came from desktop:
> 
> https://crash-stats.mozilla.com/report/index/00db1124-68ff-49b3-8205-
> ca9b42140206
> Handling message: WM_KEYDOWN (0x00000100), wParam: 0x00000033, lParam:
> 0x40040001, InSendMessageEx()=ISMEX_NOSEND,
> Next key message: WM_KEYDOWN (0x00000100), wParam: 0x00000033, lParam:
> 0x40040001
> 
> https://crash-stats.mozilla.com/report/index/0953c2a9-c93e-45e8-a730-
> c14862140206
> Handling message: WM_KEYDOWN (0x00000100), wParam: 0x00000034, lParam:
> 0x00050001, InSendMessageEx()=ISMEX_NOSEND,
> Next key message: WM_KEYUP (0x00000101), wParam: 0x00000034, lParam:
> 0xC0050001 
> 
> https://crash-stats.mozilla.com/report/index/8fd97970-6cec-40c4-8495-
> fbcaa2140206
> Handling message: WM_KEYDOWN (0x00000100), wParam: 0x0000001B, lParam:
> 0x00010001, InSendMessageEx()=ISMEX_NOSEND,
> Next key message: WM_SYSKEYDOWN (0x00000104), wParam: 0x00000012, lParam:
> 0x20380001
> 
> These logs are not so odd to me because these are possible as I said in
> comment 25.
> 
> I'll create a patch for this, first. I'm still thinking the reason why
> PeekMessage() returns WM_NULL on Metrofox.

(In reply to Jim Mathies [:jimm] (away 4/1-4/21) from comment #38)
> > I'll create a patch for this, first. I'm still thinking the reason why
> > PeekMessage() returns WM_NULL on Metrofox.
> 
> winrt is really aggressive about checking app event loops for hangs. WM_NULL
> is what it uses to do this. Is it possible these null events are getting
> injected at a point our keyboard handling doesn't expect?

(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #39)
> (In reply to Jim Mathies [:jimm] from comment #38)
> > > I'll create a patch for this, first. I'm still thinking the reason why
> > > PeekMessage() returns WM_NULL on Metrofox.
> > 
> > winrt is really aggressive about checking app event loops for hangs. WM_NULL
> > is what it uses to do this. Is it possible these null events are getting
> > injected at a point our keyboard handling doesn't expect?
> 
> I guessed so. But I cannot reproduce the bug with this patch, see comment 36.
> https://hg.mozilla.org/try/rev/cd4740da9560
> 
> I don't understand why the PeekMessage() returns WM_NULL since we pass
> WM_KEYFIRST and WM_KEYLAST to it.

(In reply to Jim Mathies [:jimm] (away 4/1-4/21) from comment #40)
> (In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #39)
> > (In reply to Jim Mathies [:jimm] from comment #38)
> > > > I'll create a patch for this, first. I'm still thinking the reason why
> > > > PeekMessage() returns WM_NULL on Metrofox.
> > > 
> > > winrt is really aggressive about checking app event loops for hangs. WM_NULL
> > > is what it uses to do this. Is it possible these null events are getting
> > > injected at a point our keyboard handling doesn't expect?
> > 
> > I guessed so. But I cannot reproduce the bug with this patch, see comment 36.
> > https://hg.mozilla.org/try/rev/cd4740da9560
> > 
> > I don't understand why the PeekMessage() returns WM_NULL since we pass
> > WM_KEYFIRST and WM_KEYLAST to it.
> 
> Well PostMessage is a little different from SendMessage/SendMessageTimeout.
> Sent messages get priority in the event queue.

(In reply to Jim Mathies [:jimm] (away 4/1-4/21) from comment #41)
> I agree though, it's really odd that WM_NULL would be peeked as a keyboard
> event.

(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #42)
> Posted message has higher priority than input message from device. So, if
> PeekMessage() can return WM_NULL in that case, the test build should crash,
> but I cannot reproduce it.
> 
> We should confirm one more thing. That is, when PeekMessage() returns
> WM_NULL, whether the WM_*CHAR message is still in the queue or not. I guess
> that it's not removed. WM_NULL is inserted before WM_*CHAR. If so,
> PeekMessage() should retry until it returns WM_*CHAR.

(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #43)
> Created attachment 8372896 [details] [diff] [review]
> Remove following char message of keydown message with found message (landed
> on m-c, aurora and beta)
> 
> This patch tries to remove found char message and pass it to handling method.

(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #44)
> I don't understand well about the issue of WM_NULL. But I guess that
> Metrofox shouldn't use WM_NULL. It might better to use user or app defined
> message or registered message instead.

(In reply to Jim Mathies [:jimm] (away 4/1-4/21) from comment #45)
> (In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #44)
> > I don't understand well about the issue of WM_NULL. But I guess that
> > Metrofox shouldn't use WM_NULL. It might better to use user or app defined
> > message or registered message instead.
> 
> Windows uses WM_NULL and SendMessageTimeout to check if our event loop is
> processing messages in a timely fashion. If it isn't, explorer kills the
> browser.

(In reply to Jim Mathies [:jimm] (away 4/1-4/21) from comment #46)
> (In reply to Jim Mathies [:jimm] from comment #45)
> > (In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #44)
> > > I don't understand well about the issue of WM_NULL. But I guess that
> > > Metrofox shouldn't use WM_NULL. It might better to use user or app defined
> > > message or registered message instead.
> > 
> > Windows uses WM_NULL and SendMessageTimeout to check if our event loop is
> > processing messages in a timely fashion. If it isn't, explorer kills the
> > browser.
> 
> On Vista and up, this would result in a non-responding ghost window. I don't
> think this is specific to Metro.

(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #47)
> https://hg.mozilla.org/integration/mozilla-inbound/rev/ada4aec5053f

(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #48)
> https://hg.mozilla.org/mozilla-central/rev/ada4aec5053f
The patch also avoids crashing when we retrieve WM_NULL on MetroFirefox. It's really odd behavior of PeekMessage(). We use WM_KEYFIRST and WM_KEYLAST for filtering the message, but WM_NULL is out of the range... MSDN doesn't mention about this special behavior.
You need to log in before you can comment on or make changes to this bug.