Closed Bug 723133 Opened 12 years ago Closed 12 years ago

Crash Report [@ PluginWndProcInternal ] stack overflow from infinite recursion

Categories

(Core Graveyard :: Plug-ins, defect)

9 Branch
All
Windows 7
defect
Not set
critical

Tracking

(firefox13+, firefox14+ verified, firefox15 verified, firefox-esr1015+ verified)

VERIFIED FIXED
mozilla16
Tracking Status
firefox13 + ---
firefox14 + verified
firefox15 --- verified
firefox-esr10 15+ verified

People

(Reporter: divjot94, Assigned: jimm)

References

()

Details

(5 keywords, Whiteboard: [flash-11.3])

Crash Data

Attachments

(2 files)

User Agent: Mozilla/5.0 (Windows NT 6.2; rv:13.0a1) Gecko/20120201 Firefox/13.0a1
Build ID: 20120201031146

Steps to reproduce:

Opened firefox and searched for a video on youtube 


Actual results:

Firefox gave 2 "Not Responding" hangs and then crashed 


Expected results:

not the above...
https://crash-stats.mozilla.com/report/index/bp-34007b90-4622-497f-9a55-8999f2120201

https://crash-stats.mozilla.com/report/index/564ee344-899f-4618-9e94-91e742120130
"Reloading a web page that contains a java applet and a flash control."

I just cleaned my profile this morning (prior to the crash) so i don't think its bad profile problem... 

Idk if i am right , but I suspect Bug 90268 to be the cause (no technical basis but seems related)
Summary: Firefox 12.0a1 Crash Report [@ PluginWndProcInternal ] → Firefox 13.0a1 Crash Report [@ PluginWndProcInternal ]
https://crash-stats.mozilla.com/report/index/bp-34007b90-4622-497f-9a55-8999f2120201

https://crash-stats.mozilla.com/report/index/564ee344-899f-4618-9e94-91e742120130
"Reloading a web page that contains a java applet and a flash control."

I just cleaned my profile this morning (prior to the crash) so i don't think its bad profile problem... 

Idk if i am right , but I suspect Bug 90268 to be the cause (no technical basis but seems related)
Crash Signature: PluginWndProcInternal
OS: Windows NT → Windows 8
Group: core-security
Crash Signature: PluginWndProcInternal → [@ PluginWndProcInternal ]
Severity: normal → critical
Status: UNCONFIRMED → NEW
Component: Untriaged → Plug-ins
Ever confirmed: true
Keywords: crash
Product: Firefox → Core
QA Contact: untriaged → plugins
(In reply to bogas04 from comment #2)
> Idk if i am right , but I suspect Bug 90268 to be the cause (no technical
> basis but seems related)
It was existing before bug 90268 landed but there's a spike in crashes from its landing.

It happens on 64-bit Windows 7 and 32-bit Windows 8 (not officially supported).
It's #7 top crasher in 13.0a1.

The stack is basic:
Frame 	Module 	Signature 	Source
0 	xul.dll 	PluginWndProcInternal 	dom/plugins/base/nsPluginNativeWindowWin.cpp:227

More reports at:
https://crash-stats.mozilla.com/report/list?signature=PluginWndProcInternal
Blocks: 90268
Keywords: regression
OS: Windows 8 → Windows 7
Hardware: x86 → x86_64
Here is the stack on 32-bit Windows:
Frame 	Module 	Signature [Expand] 	Source
0 	xul.dll 	PluginWndProcInternal 	dom/plugins/base/nsPluginNativeWindowWin.cpp:226
1 	xul.dll 	PluginWndProc 	dom/plugins/base/nsPluginNativeWindowWin.cpp:383
2 	xul.dll 	nsWyciwygChannel::AsyncOpen 	netwerk/protocol/wyciwyg/nsWyciwygChannel.cpp:418
3 	user32.dll 	UserCallWinProcCheckWow 	
4 	user32.dll 	CallWindowProcAorW 	
5 	user32.dll 	CallWindowProcW 	
6 	xul.dll 	PluginWndProcInternal 	dom/plugins/base/nsPluginNativeWindowWin.cpp:354
7 	xul.dll 	CallWindowProcCrashProtected 	xpcom/base/nsCrashOnException.cpp:65
8 	xul.dll 	PluginWndProc 	dom/plugins/base/nsPluginNativeWindowWin.cpp:383
9 	xul.dll 	nsWyciwygChannel::AsyncOpen 	netwerk/protocol/wyciwyg/nsWyciwygChannel.cpp:418
10 	user32.dll 	UserCallWinProcCheckWow 	
11 	user32.dll 	CallWindowProcAorW 	
12 	user32.dll 	CallWindowProcW 	
and so on.
Hardware: x86_64 → All
All crash reports I checked have Adblock Plus.
Summary: Firefox 13.0a1 Crash Report [@ PluginWndProcInternal ] → Firefox 13.0a1 Crash Report [@ PluginWndProcInternal ] with Adblock Plus
Crash Signature: [@ PluginWndProcInternal ] → [@ PluginWndProcInternal] [@ NtUserEnumDisplayMonitors] [@ InternalFindAtom] [@ _SEH_prolog4]
Crash Signature: [@ PluginWndProcInternal] [@ NtUserEnumDisplayMonitors] [@ InternalFindAtom] [@ _SEH_prolog4] → [@ PluginWndProcInternal] [@ NtUserEnumDisplayMonitors] [@ InternalFindAtom] [@ _SEH_prolog4] [@ CallWindowProcCrashProtected]
Crash Signature: [@ PluginWndProcInternal] [@ NtUserEnumDisplayMonitors] [@ InternalFindAtom] [@ _SEH_prolog4] [@ CallWindowProcCrashProtected] → [@ PluginWndProcInternal] [@ NtUserEnumDisplayMonitors] [@ InternalFindAtom] [@ _SEH_prolog4] [@ CallWindowProcCrashProtected] [@ PluginWndProc]
It's #37 top browser crasher in 13.0.1 and 14.0b7, #51 in 15.0a2 and #58 in 16.0a1.

According to comments, it's related to Flash.
Keywords: topcrash
Crash Signature: [@ PluginWndProcInternal] [@ NtUserEnumDisplayMonitors] [@ InternalFindAtom] [@ _SEH_prolog4] [@ CallWindowProcCrashProtected] [@ PluginWndProc] → [@ PluginWndProcInternal] [@ RtlActivateActivationContextUnsafeFast | UserCallWinProcCheckWow ] [@ NtUserEnumDisplayMonitors] [@ InternalFindAtom] [@ _SEH_prolog4] [@ CallWindowProcCrashProtected] [@ PluginWndProc]
Let's grab some URLs, and QA ABP/Win7.

Also including Wladimir so that he's aware of the issue.
(In reply to bogas04 from comment #2)
> https://crash-stats.mozilla.com/report/index/bp-34007b90-4622-497f-9a55-
> 8999f2120201
> 
> https://crash-stats.mozilla.com/report/index/564ee344-899f-4618-9e94-
> 91e742120130
> "Reloading a web page that contains a java applet and a flash control."
> 
> I just cleaned my profile this morning (prior to the crash) so i don't think
> its bad profile problem... 
> 
> Idk if i am right , but I suspect Bug 90268 to be the cause (no technical
> basis but seems related)

Bogas - are you still able to reproduce this issue? If so, we may have some follow up questions.
https://www.xforex.com/trade/goToLoginPage.action

Beta/14

bp-edbaa716-133a-440f-b28a-60a5f2120622
  Crash Report [@ _SEH_prolog4 ] 
bp-ed089457-93b0-4e35-8c40-ce2752120622
  Crash Report [@ RtlActivateActivationContextUnsafeFast | UserCallWinProcCheckWow ] 

Aurora/15
bp-25bfeacc-b30a-4e44-b8cb-79d162120622
  Crash Report [@ RtlActivateActivationContextUnsafeFast | UserCallWinProcCheckWow ] 

Nightly/16
bp-19068b3b-22b4-48ef-807b-db04b2120622
  Crash Report [@ CallWindowProcCrashProtected ] 

Adblock not necessary.
Summary: Firefox 13.0a1 Crash Report [@ PluginWndProcInternal ] with Adblock Plus → Crash Report [@ PluginWndProcInternal ]
(In reply to Bob Clary [:bc:] from comment #9)
> https://www.xforex.com/trade/goToLoginPage.action
> 
> Beta/14
> 
> bp-edbaa716-133a-440f-b28a-60a5f2120622
>   Crash Report [@ _SEH_prolog4 ] 
> bp-ed089457-93b0-4e35-8c40-ce2752120622
>   Crash Report [@ RtlActivateActivationContextUnsafeFast |
> UserCallWinProcCheckWow ] 
> 
> Aurora/15
> bp-25bfeacc-b30a-4e44-b8cb-79d162120622
>   Crash Report [@ RtlActivateActivationContextUnsafeFast |
> UserCallWinProcCheckWow ] 
> 
> Nightly/16
> bp-19068b3b-22b4-48ef-807b-db04b2120622
>   Crash Report [@ CallWindowProcCrashProtected ] 
> 
> Adblock not necessary.

Thanks Bob! Given that, adding back qawanted and adding regressionwindow-wanted so that we can find the culprit and consider backing it out.

Already sending to Josh for investigation in the meantime, since bug 90268 is suspected as the culprit.
Assignee: nobody → joshmoz
Found regression between 20110914030906-20110915030845
Pushlog: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=cfd1e3533f0f&tochange=f2a2adaaacba
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2011/09/2011-09-14-03-09-06-mozilla-central/firefox-9.0a1.en-US.win32.zip
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2011/09/2011-09-15-03-08-45-mozilla-central/firefox-9.0a1.en-US.win32.zip

I don't think  bug 90268 is the regressor

For the Firefox 9 20110915030845 build

bp-3ea65c63-f624-403c-be11-7a9f82120625
0 	user32.dll 	_SEH_prolog4 	
1 	user32.dll 	CallWindowProcAorW 	
2 	user32.dll 	CallWindowProcW 	
3 	xul.dll 	xul.dll@0x797ed1 	
4 	xul.dll 	xul.dll@0x44b047 	
5 	xul.dll 	xul.dll@0xa5c92a 	
6 	user32.dll 	UserCallWinProcCheckWow 	
7 	user32.dll 	CallWindowProcAorW 	
8 	user32.dll 	CallWindowProcW

The previous build had a Flash plugin crash (with '262)
Version: 13 Branch → 9 Branch
Currently #32 top crash in 13.0.1 in the last week with 1854 crashes. #37 in 14.0b8 data and #35 in Aurora.
It's much more with combined signatures: #12 in 13.0.1, #15 in 14.0b8, #17 in 15.0a2.
This crash )or something related) is really hurting me (bp-2901783b-0b32-4476-b33f-7bfec2120628, bp-08b88139-893a-41ec-bd7c-265b12120629 among others).  Firefox 13.0.1.

Reliably crashes on one win7 x64 machine whenever I visit workflowy.com (and am already logged in).  Disabling the Shockwave Flash plug-in seems to stop the crash happening.

On my other win7 x64 machine, visiting workflowy.com reliably hangs the browser (I have to kill the browser, then the next few restarts also hang) and I don't get to send a crash report.  Again, disabling Flash solves the issue.  I'm getting a lot of hangs with 13.0.1.  Bad enough that I'm considering switching browser...  So I think this issue might be having more impact than the crash stats would suggest.
Did this get worse with Flash 11.3?

JimM, I know that in the past we had some checks in place to make sure that this didn't happen, but it appears that those checks are limited to realplayer: http://hg.mozilla.org/releases/mozilla-release/annotate/f48d675ffa9f/dom/plugins/base/nsPluginNativeWindowWin.cpp#l237

Do you think that we could just assert/fail if win->GetWindowProc() == PluginWndProcInternal?
Summary: Crash Report [@ PluginWndProcInternal ] → Crash Report [@ PluginWndProcInternal ] stack overflow from infinite recursion
I've seen PluginWndProcInternal CallWindowProcCrashProtected PluginWndProc PluginWndProcInternal CallWindowProcCrashProtected 140 times on Windows 7 on all three branches in crash automation beginning around 6/20. The 7 urls are: 2 xforex.com and 3 workflowy.com and 1 victoriassecret.com.
PS re comment 16: that makes it probably related to the Flash 11.3.300.262 update.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #15)
> Did this get worse with Flash 11.3?
> 
> JimM, I know that in the past we had some checks in place to make sure that
> this didn't happen, but it appears that those checks are limited to
> realplayer:
> http://hg.mozilla.org/releases/mozilla-release/annotate/f48d675ffa9f/dom/
> plugins/base/nsPluginNativeWindowWin.cpp#l237
> 
> Do you think that we could just assert/fail if win->GetWindowProc() ==
> PluginWndProcInternal?

It's very odd that this happens in the main process. It means somebody is mucking with the parent widget window subclass and things are getting out of sync.

I'll try to reproduce.
Whiteboard: [flash-11.3]
Attached patch catch patchSplinter Review
We've run into this before in the plugin container and in those cases this problem was self healing. So I've put together a patch that shunts to DefWindowProc when this is detected.

I fired off try builds which will report back here when complete. If anyone can reproduce this consistently please take the try builds for a spin to see if this addresses the problem.

On my system I was not able to reproduce and I do not see any activity in the subclassing here using the latest flash.
I've got a reproducible test case and am reducing it now.
Attached file testcase
Still some jquery cruft but as far as I'm going to take it for now.

1. load testcase
2. exit browser
3. stack overflow

If you run it enough times, you'll see the WER for Flash.
Keywords: testcase
(In reply to Bob Clary [:bc:] from comment #21)
> Created attachment 638254 [details]
> testcase
> 
> Still some jquery cruft but as far as I'm going to take it for now.
> 
> 1. load testcase
> 2. exit browser
> 3. stack overflow
> 
> If you run it enough times, you'll see the WER for Flash.

alternate STR:

1) open test case in a tab
2) open a new tab
3) close the test case tab

bc could you plz test with the fix I posted? Try builds are here - 

https://tbpl.mozilla.org/?noignore=0&tree=Try&rev=30ae38da9830
Assignee: joshmoz → jmathies
Attachment #637929 - Flags: review?(joshmoz)
(In reply to Jim Mathies [:jimm] from comment #22)
> (In reply to Bob Clary [:bc:] from comment #21)

> bc could you plz test with the fix I posted? Try builds are here - 

tbpl won't complete loading for me. It just gets stuck at 9% before failing.
(In reply to Bob Clary [:bc:] from comment #23)
> (In reply to Jim Mathies [:jimm] from comment #22)
> > (In reply to Bob Clary [:bc:] from comment #21)
> 
> > bc could you plz test with the fix I posted? Try builds are here - 
> 
> tbpl won't complete loading for me. It just gets stuck at 9% before failing.

here are the opt builds:

https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jmathies@mozilla.com-30ae38da9830/
Both opt and debug builds appear to work fine for me on the testcase.
(In reply to Bob Clary [:bc:] from comment #25)
> Both opt and debug builds appear to work fine for me on the testcase.

Thanks for testing! I wasn't able to reproduce with this patch either.
Removing qawanted for the moment, as STR and a testcase has already been found by bc.
Keywords: qawanted
Comment on attachment 637929 [details] [diff] [review]
catch patch

r=me, I think we should get this landed and backport onto all branches ASAP
Attachment #637929 - Flags: review+
Comment on attachment 637929 [details] [diff] [review]
catch patch

[Triage Comment]
Benjamin let us know that this was basically the equivalent of a null check. Let's land on Aurora 15 and Beta 14.
Attachment #637929 - Flags: approval-mozilla-beta+
Attachment #637929 - Flags: approval-mozilla-aurora+
https://hg.mozilla.org/mozilla-central/rev/46a9d74546bc
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla16
No crash loading the link in comment 9 and the STR in comment 22.
Verified fixed on FF 14b11: Mozilla/5.0 (Windows NT 6.1; rv:14.0) Gecko/20100101 Firefox/14.0
Verified fixed on FF 15b1 on Win 7.
Status: RESOLVED → VERIFIED
With combined signatures, it's #2 top browser crasher in 10.0.6 ESR.
Please nominate for ESR uplift, see https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more details if needed.
Sent email as well, this needs to be nominated and landed to ESR in the next couple of days.
Comment on attachment 637929 [details] [diff] [review]
catch patch

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
crash fix
User impact if declined: 
crashiness
Fix Landed on Version:
Landed 7/2 on central, aurora, and beta
Risk to taking this patch (and alternatives if risky): 
minimal - it protects us from silliness in the flash plugin.
Attachment #637929 - Flags: approval-mozilla-esr10?
Attachment #637929 - Flags: approval-mozilla-esr10? → approval-mozilla-esr10+
Are there esr checkin-needed watchers out there? I don't have this cloned locally but can if need be.
If you put checkin-needed on it, odds are good I'll find it ;-)
Keywords: checkin-needed
https://hg.mozilla.org/releases/mozilla-esr10/rev/b696066d7c63

Note that the second hunk (with the NS_TRY_SAFE_CALL_RETURN) was a little different than m-c/aurora/beta. Please confirm that the straight overwriting with the new logic was the right way to go.
(In reply to Ryan VanderMeulen from comment #41)
> https://hg.mozilla.org/releases/mozilla-esr10/rev/b696066d7c63
> 
> Note that the second hunk (with the NS_TRY_SAFE_CALL_RETURN) was a little
> different than m-c/aurora/beta. Please confirm that the straight overwriting
> with the new logic was the right way to go.

That's fine, we've removed a number of NS_TRY_SAFE_CALL_RETURN uses in plugin code as the code they produce is permanently disabled.

http://mxr.mozilla.org/mozilla-central/source/dom/plugins/base/nsPluginSafety.h#32
http://mxr.mozilla.org/mozilla-central/source/dom/plugins/base/nsPluginHost.cpp#212
Able to see the issue on 2012-08-16-03-45-00-mozilla-esr10 using the STR in comment 22 on Win 7 x86.
Verified fixed on 2012-08-18-03-45-00-mozilla-esr10.
Crash Signature: [@ PluginWndProcInternal] [@ RtlActivateActivationContextUnsafeFast | UserCallWinProcCheckWow ] [@ NtUserEnumDisplayMonitors] [@ InternalFindAtom] [@ _SEH_prolog4] [@ CallWindowProcCrashProtected] [@ PluginWndProc] → [@ PluginWndProcInternal] [@ RtlActivateActivationContextUnsafeFast | UserCallWinProcCheckWow ] [@ NtUserEnumDisplayMonitors] [@ InternalFindAtom] [@ _SEH_prolog4] [@ CallWindowProcCrashProtected] [@ PluginWndProc] [@ GlobalFindAtomW] [@ NtUserSet…
A similar signature (RtlActivateActivationContextUnsafeFast | UserCallWinProcCheckWow) has been spiking recently, with really similar crashes (stack overflows with infinite recursion).

See for example https://crash-stats.mozilla.com/report/index/9cc8c613-d20e-4db9-a9c8-ea7152160517.
Jim, do you think we should file a new bug here for the crash signature showing up (in 46.0.1) in comment 44? This may be worth investigating. What do you think?
Flags: needinfo?(jmathies)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #45)
> Jim, do you think we should file a new bug here for the crash signature
> showing up (in 46.0.1) in comment 44? This may be worth investigating. What
> do you think?

Based on the stack from that crash report, we should file a new bug, yes.
Flags: needinfo?(jmathies)
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: