Closed
Bug 531341
Opened 15 years ago
Closed 15 years ago
Stack overflow crash related to scrolling with trackpad and plugins [@ nsWindow::HandleScrollingPlugins] [@ _SEH_prolog4 ][@ nsWindow::ProcessMessage(unsigned int, unsigned int&, long&, long*)][@ FindNCHit ]
Categories
(Core :: Widget: Win32, defect)
Tracking
()
VERIFIED
FIXED
mozilla1.9.3a1
Tracking | Status | |
---|---|---|
blocking1.9.2 | --- | needed |
status1.9.2 | --- | .2-fixed |
status1.9.1 | --- | unaffected |
People
(Reporter: dncook, Assigned: masayuki)
References
Details
(4 keywords, Whiteboard: [crashkill])
Crash Data
Attachments
(2 files, 1 obsolete file)
66.38 KB,
text/plain
|
Details | |
4.83 KB,
patch
|
emk
:
review+
beltzner
:
approval1.9.2.2+
|
Details | Diff | Splinter Review |
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2b4) Gecko/20091124 Firefox/3.6b4
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2b4) Gecko/20091124 Firefox/3.6b4
I have experienced three crashes (see below, two on 3.6b3, one on 3.6b4) which all seem to stem from the same recursive calls. The crashes occurred while I was opening pages with flash, (last.fm, youtube, and a google search result) scrolling with my trackpad nub, (Lenovo) and possibly switching tabs. I have tried to replicate this, but thus far I haven't been able to.
Reproducible: Couldn't Reproduce
Steps to Reproduce:
Unable to reproduce, but I will keep trying
Actual Results:
Slight pause, (0.5-1s) then Firefox crashed, and the crash reporter opened up
Expected Results:
Not crashing
bp-69b5d4c8-995e-42bb-9965-df4ec2091126 (under 3.6b4)
bp-c287da93-c960-48b4-a03f-551002091123 (under 3.6b3)
bp-11b1e782-eb83-497c-b560-4e9ea2091120 (under 3.6b3)
The infinite loop is between the following three functions:
nsWindow::HandleScrollingPlugins
nsWindow::OnScroll
xul.dll@0x42e0e9
The line referenced in nsWindow::OnScroll is within an if statement mentioning "Scroll message generated by Thinkpad Trackpoint Driver or similar"
My computer is a Lenovo ThinkPad R61
Reporter | ||
Updated•15 years ago
|
Version: unspecified → 3.6 Branch
6253 xul.dll nsWindow::HandleScrollingPlugins widget/src/windows/nsWindow.cpp:5944
6254 xul.dll nsWindow::OnScroll widget/src/windows/nsWindow.cpp:5965
6255 xul.dll xul.dll@0x42cc36
6256 xul.dll nsWindow::HandleScrollingPlugins widget/src/windows/nsWindow.cpp:5944
6257 xul.dll nsWindow::OnScroll widget/src/windows/nsWindow.cpp:5965
6258 xul.dll xul.dll@0x42cc36
6259 xul.dll nsWindow::WindowProc widget/src/windows/nsWindow.cpp:3501
6260 user32.dll InternalCallWinProc
Severity: normal → critical
Component: General → Widget: Win32
Keywords: crash
Product: Firefox → Core
QA Contact: general → win32
Summary: Stack overflow crash related to scrolling with trackpad and plugins → Stack overflow crash related to scrolling with trackpad and plugins [@ nsWindow::HandleScrollingPlugins]
Version: 3.6 Branch → 1.9.2 Branch
Reporter | ||
Comment 2•15 years ago
|
||
I think I have figured out how to reproduce this bug:
1: Open Firefox, and open two empty tabs
2: Hold the scroll button down and hold the TrackPoint nub in any direction (This requires a ThinkPad, was not able to reproduce by scrolling with a normal mouse)
3: Press Ctrl-Tab, Firefox will crash
Reporter | ||
Comment 3•15 years ago
|
||
I bisected through the nightlies, and in 2009-11-14-05-mozilla-1.9.2 and earlier, the crash doesn't occur, and moreover, scrolling with the TrackPoint doesn't work at all. In 2009-11-15-04-mozilla-1.9.2 and later, scrolling with the TrackPoint works, and the crash occurs.
Comment 4•15 years ago
|
||
Topcrash #115 for 3.6b4
nsWindow::ProcessMessage(unsigned int, unsigned int&, long&, long*)
bp-2751f520-e4f6-46a1-a2aa-dbc352091205
Since this is a stack overflow, it probably appears with multiple signatures.
Keywords: topcrash
Indeed, bug 531856 would appear to be the same.
Assignee: nobody → me
Status: UNCONFIRMED → NEW
Ever confirmed: true
From bug 531856
> Stack overflow. Possibly nsWindow::HandleScrollingPlugins round-trips to
> itself?
>
> Note stack contains "xul.dll@0x42cc36" -- any idea what that call is?
>
> Example Beta4 stack:
> http://crash-stats.mozilla.com/report/index/7047c8c5-1a19-408a-bb8c-20f522091130
>
> Example: Trunk stack:
> http://crash-stats.mozilla.com/report/index/22b61f9a-55a5-47a2-9a82-01b802091125
I can't reproduce this, but judging by the fact that there aren't many ways to trigger a stack overflow I would imagine that the problem is that at http://hg.mozilla.org/mozilla-central/annotate/ae44b06ea876/widget/src/windows/nsWindow.cpp#l6013
> <snip>
> if (destWnd != mWnd) {
> if (destWindow) {
> aHandled = destWindow->ProcessMessage(aMsg, aWParam, aLParam, aRetValue);
> </snip>
either destWindow is the window currently processing the message or slightly more interestingly, destWindow and the window currently processing the message are sending it back and forth. The fact that this seems to happen when switching tabs (per comment 2 ) makes me think the second one is probable. The first should be impossible (I don't believe there's a situation in which GetNSWindowPtr can return the same nsWindow if the HWNDs differ)
I also don't know what to make of the random offset in xul.dll. If somebody more experienced than me with crashing ( :-D ) has ideas, I would love to hear them.
Comment 9•15 years ago
|
||
Ted, can you help with the "random offset in xul.dll" part?
This crash is new in 3.6 Beta 3. This is almost certainly related to turning on the Trackpoint scrolling for Alps drivers as that was the only change in this code between Beta 2 and Beta 3. The R61 has an Alps Touchpad.
David: What Thinkpad model do you have?
Blocks: 528267
(In reply to comment #10)
> This crash is new in 3.6 Beta 3. This is almost certainly related to turning
> on the Trackpoint scrolling for Alps drivers as that was the only change in
> this code between Beta 2 and Beta 3. The R61 has an Alps Touchpad.
Scratch that, it does show up earlier.
No longer blocks: 528267
Comment 12•15 years ago
|
||
(In reply to comment #9)
> Ted, can you help with the "random offset in xul.dll" part?
Probably a symptom of bug 526484, if I had to guess. (That patch is nominated for branch approval.) If you look at the frame above it:
20 xul.dll nsWindow::OnScroll widget/src/windows/nsWindow.cpp:5965
21 xul.dll xul.dll@0x42cc36
22 xul.dll nsWindow::HandleScrollingPlugins widget/src/windows/nsWindow.cpp:5944
It's clearly calling nsWindow::ProcessMessage:
http://hg.mozilla.org/releases/mozilla-1.9.2/annotate/4c488520d1bf/widget/src/windows/nsWindow.cpp#l5944
And the only place that calls OnScroll is here:
http://mxr.mozilla.org/mozilla-central/source/widget/src/windows/nsWindow.cpp#4166
So that's probably your missing frame.
Reporter | ||
Comment 13•15 years ago
|
||
I do indeed have an R61.
Reporter | ||
Comment 14•15 years ago
|
||
I've found a narrower window for the cause of this crash.
Changeset ff084019260e was the first changeset in which Trackpoint scrolling didn't work. Prior to this changeset, scrolling worked fine, and the crash described above did not occur.
I then cherrypicked the six changesets which make up the "Trackpoint hack" onto this revision, (hg update -C -r ff084019260e followed by hg transplant 40cc465fb9be e264b5f6b874 5f2a6c970a96 03b2895110e0 55e73a35126e) and, with this patched version, scrolling worked once again, and I was able to reproduce the above crash. So, the regression is somewhere in these seven changesets.
Hope this helps, let me know if there's anything else I can do.
Reporter | ||
Comment 15•15 years ago
|
||
I'm still experiencing the crash in 3.6rc1.
http://crash-stats.mozilla.com/report/index/3738bdfa-997c-464e-88bd-95ff62100109
Reporter | ||
Comment 16•15 years ago
|
||
I just got an update for my Alps driver via Windows Update, and before the update, 3.6rc2 still crashed under the above circumstances, while after installing the update, the crash was fixed.
Is this good enough to close the bug?
Comment 17•15 years ago
|
||
that's great to hear, the only thing i'd want to know is which files it updated because if it's a file that is loaded into our process, and the updated version doesn't crash, we'd like to consider blocking the old version. I looked through your report from comment 15 and couldn't find something that I'd recognize as the driver :(.
I don't think this has been fixed in general though, it appears to have just moved.
Summary: Stack overflow crash related to scrolling with trackpad and plugins [@ nsWindow::HandleScrollingPlugins] → Stack overflow crash related to scrolling with trackpad and plugins [@ nsWindow::HandleScrollingPlugins] [@ _SEH_prolog4 ]
Masayuki, you said in Bug 483136 that it was possible for HandleScrollingPlugins to enter an infinite loop. That's what's happening here, but I don't see how it's happening. Any help would be appreciated :-)
Reporter | ||
Comment 20•15 years ago
|
||
Here's a "fake" crash report with my current set of modules. I didn't see any suspicious dlls either.
http://crash-stats.mozilla.com/report/index/9e4f350e-4f97-4375-a31b-409942100119
Assignee | ||
Comment 21•15 years ago
|
||
(In reply to comment #19)
> Masayuki, you said in Bug 483136 that it was possible for
> HandleScrollingPlugins to enter an infinite loop. That's what's happening
> here, but I don't see how it's happening. Any help would be appreciated :-)
The reason is that mInScrollProcessing is an instance member. And the caller's one is set to TRUE. So, the called instance's one is FALSE...
But the code is going to be removed completely by next patch of bug 483136. So, I hope that this should be fixed only on branch.
Assignee | ||
Comment 22•15 years ago
|
||
(In reply to comment #21)
> So, the called instance's one is FALSE...
Er, the other instances' are FALSE.
Assignee | ||
Comment 23•15 years ago
|
||
I think that this patch fixes this bug. Can somebody test this if I prepare a test build?
Assignee | ||
Comment 24•15 years ago
|
||
ssssmemyself: Can you test the tryserver build Masayuki posted with the old drivers?
Reporter | ||
Comment 26•15 years ago
|
||
I might be able to; I'll look around for the old driver today.
Reporter | ||
Comment 27•15 years ago
|
||
I downgraded to the factory version of the driver, and the above tryserver build crashes.
Assignee | ||
Comment 28•15 years ago
|
||
(In reply to comment #27)
> I downgraded to the factory version of the driver, and the above tryserver
> build crashes.
Thank you, I'll be looking for another possibility.
And I hope that you test this test build but this is trunk build, be careful.
https://build.mozilla.org/tryserver-builds/masayuki@d-toybox.com-bug483136_v4.3/
Assignee | ||
Comment 29•15 years ago
|
||
If you sent the crash report with my test build, would you tell me the report ID?
Reporter | ||
Comment 30•15 years ago
|
||
When I crashed your build, I just got the standard Windows dialog box, not the Mozilla crash reporter.
With the trunk build and the old drivers, I am unable to reproduce the crash.
Assignee | ||
Comment 31•15 years ago
|
||
O.K. Thank you.
Comment 32•15 years ago
|
||
Assignee | ||
Comment 33•15 years ago
|
||
https://build.mozilla.org/tryserver-builds/masayuki@d-toybox.com-bug531341_v1.1_192/
Would you test this build? I found a mistake of my patch.
Reporter | ||
Comment 34•15 years ago
|
||
The v1.1 build crashes too, see attached WinDbg log.
Attachment #423245 -
Attachment mime type: application/octet-stream → text/plain
Assignee | ||
Comment 35•15 years ago
|
||
Building test builds for this patch...
Attachment #422756 -
Attachment is obsolete: true
Assignee | ||
Comment 36•15 years ago
|
||
Reporter | ||
Comment 37•15 years ago
|
||
I can't reproduce the crash with this latest build. Hooray!
Assignee | ||
Comment 38•15 years ago
|
||
(In reply to comment #37)
> I can't reproduce the crash with this latest build. Hooray!
Thank you very much!
Assignee | ||
Comment 39•15 years ago
|
||
Comment on attachment 423306 [details] [diff] [review]
Patch v2.0
The mInScrollProcessing of the *caller* is set to TRUE. So, it's strange. We can move it to static variable and all instances should refer it.
When it's processing, we don't need to retargetting processing window, so, we can just return from the method at that time.
Attachment #423306 -
Flags: review?(VYV03354)
Assignee | ||
Updated•15 years ago
|
Assignee: me → masayuki
Status: NEW → ASSIGNED
Comment 40•15 years ago
|
||
This is the kind of thing that could really use a testcase. Can we synthesize native scrolling events to test this the way we synthesize mousemove events?
Flags: in-testsuite?
Asking for blocking per http://groups.google.com/group/mozilla.dev.planning/browse_thread/thread/71972f1a7c5eb6f1 since Masayuki has a patch in hand.
Updated•15 years ago
|
Attachment #423306 -
Flags: review?(VYV03354) → review+
Assignee | ||
Comment 42•15 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/57f9a74ab7d3
O.K. I pushed to trunk, we can watch the regression if there is.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 43•15 years ago
|
||
(In reply to comment #40)
> This is the kind of thing that could really use a testcase. Can we synthesize
> native scrolling events to test this the way we synthesize mousemove events?
Currently no. Roc hates that we add APIs only for testing, see bug 528396.
I hate adding large APIs that get shipped in our product but are only useful for testing.
Assignee | ||
Updated•15 years ago
|
Target Milestone: --- → mozilla1.9.3a1
Reporter | ||
Comment 45•15 years ago
|
||
I think the Alps driver sends scrolling events from the apoint.exe process. Would it be possible to write a unit test that compiles an external program, runs it alongside Firefox, and sends the scrolling events to it? This still sounds like a lot of code, but this way it wouldn't need to ship.
Assignee | ||
Comment 46•15 years ago
|
||
Roc:
I cannot read the discussion of all on Google (http://groups.google.ca/group/mozilla.dev.platform/browse_frm/thread/f6e625cdf06246ed#). Can I read the other emails somewhere else? I think that they are same things that we add a few APIs each new testcase and many APIs only for one testcase. I hope we implement new APIs which are not compiled on release build.
ssssmemyself:
We should be able to do because we can send the message ourselves. However, I don't have the idea how to create the message loop actually.
This is currently the #51 and #56 topcrashes for 3.6 accounting for 0.42% of all crashes. I haven't seen a crash on trunk since Masayuki's patch landed but there's a very low volume there (only 12 or so over the last two weeks) so it's hard to say for sure that we've squashed all of them. This is definitely worth taking in 3.6.2 or whatever the first point release is going to be.
Assignee | ||
Comment 48•15 years ago
|
||
Comment on attachment 423306 [details] [diff] [review]
Patch v2.0
We don't have any regression reports. This should be safe.
This fixes a crash and a regression of bug 487245 which causes bug 531341 and a part of bug 483136.
Attachment #423306 -
Flags: approval1.9.2.1?
Assignee | ||
Comment 49•15 years ago
|
||
(In reply to comment #48)
> This fixes a crash and a regression of bug 487245 which causes bug 531341 and a
> part of bug 483136.
s/bug 531341/bug 541309
And this is now the #49, #55 and #92 windows topcrashes which puts it at #11 if all the signatures are combined.
Summary: Stack overflow crash related to scrolling with trackpad and plugins [@ nsWindow::HandleScrollingPlugins] [@ _SEH_prolog4 ] → Stack overflow crash related to scrolling with trackpad and plugins [@ nsWindow::HandleScrollingPlugins] [@ _SEH_prolog4 ][@ nsWindow::ProcessMessage(unsigned int, unsigned int&, long&, long*)][@ FindNCHit ]
Updated•15 years ago
|
blocking1.9.2: ? → needed
Comment 55•15 years ago
|
||
Considering the number of dupes, shouldn't this be pushed to the 1.9.2 branch too? Is it risky?
Updated•15 years ago
|
Keywords: regression
Comment 56•15 years ago
|
||
Comment on attachment 423306 [details] [diff] [review]
Patch v2.0
a1922=beltzner
Attachment #423306 -
Flags: approval1.9.2.2? → approval1.9.2.2+
Assignee | ||
Comment 57•15 years ago
|
||
Comment 58•15 years ago
|
||
I take it that we still have no way to reproduce this issue for testing?
Comment 59•15 years ago
|
||
mine is reproducible https://bugzilla.mozilla.org/show_bug.cgi?id=546624
Comment 60•15 years ago
|
||
(In reply to comment #59)
> mine is reproducible https://bugzilla.mozilla.org/show_bug.cgi?id=546624
I don't have access to a Lenovo laptop. Can you try with a nightly 3.6 build from http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mozilla-1.9.2/ and see if it fixes the problem for you?
Comment 61•15 years ago
|
||
yep seems fine for me, no crash at all and working fine
Comment 62•15 years ago
|
||
Build 20100304 3.6.2pre from latest-mozilla-1.9.2 is working well for me too. No crashes yet, and they are easy to reproduce with 3.6.1 on my Thinkpad T40p w/ a Synaptics touchpad.
Verifying per comment 62
Status: RESOLVED → VERIFIED
Assignee | ||
Comment 64•15 years ago
|
||
Thank you for the verifying!
Updated•15 years ago
|
Keywords: verified1.9.2
Updated•13 years ago
|
Crash Signature: [@ nsWindow::HandleScrollingPlugins]
[@ _SEH_prolog4 ]
[@ nsWindow::ProcessMessage(unsigned int, unsigned int&, long&, long*)]
[@ FindNCHit ]
You need to log in
before you can comment on or make changes to this bug.
Description
•