Last Comment Bug 723133 - Crash Report [@ PluginWndProcInternal ] stack overflow from infinite recursion
: Crash Report [@ PluginWndProcInternal ] stack overflow from infinite recursion
Status: VERIFIED FIXED
[flash-11.3]
: crash, regression, reproducible, testcase, topcrash
Product: Core
Classification: Components
Component: Plug-ins (show other bugs)
: 9 Branch
: All Windows 7
: -- critical with 4 votes (vote)
: mozilla16
Assigned To: Jim Mathies [:jimm]
:
Mentors:
https://www.xforex.com/trade/goToLogi...
Depends on:
Blocks: 532972 90268
  Show dependency treegraph
 
Reported: 2012-02-01 08:37 PST by bogas04
Modified: 2016-05-17 09:18 PDT (History)
19 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
+
+
verified
verified
15+
verified


Attachments
catch patch (2.70 KB, patch)
2012-06-29 09:25 PDT, Jim Mathies [:jimm]
benjamin: review+
akeybl: approval‑mozilla‑aurora+
akeybl: approval‑mozilla‑beta+
lukasblakk+bugs: approval‑mozilla‑esr10+
Details | Diff | Splinter Review
testcase (5.56 KB, text/html)
2012-07-01 20:15 PDT, Bob Clary [:bc:]
no flags Details

Description bogas04 2012-02-01 08:37:41 PST
User Agent: Mozilla/5.0 (Windows NT 6.2; rv:13.0a1) Gecko/20120201 Firefox/13.0a1
Build ID: 20120201031146

Steps to reproduce:

Opened firefox and searched for a video on youtube 


Actual results:

Firefox gave 2 "Not Responding" hangs and then crashed 


Expected results:

not the above...
Comment 1 bogas04 2012-02-01 08:40:56 PST
https://crash-stats.mozilla.com/report/index/bp-34007b90-4622-497f-9a55-8999f2120201

https://crash-stats.mozilla.com/report/index/564ee344-899f-4618-9e94-91e742120130
"Reloading a web page that contains a java applet and a flash control."

I just cleaned my profile this morning (prior to the crash) so i don't think its bad profile problem... 

Idk if i am right , but I suspect Bug 90268 to be the cause (no technical basis but seems related)
Comment 2 bogas04 2012-02-01 08:43:02 PST
https://crash-stats.mozilla.com/report/index/bp-34007b90-4622-497f-9a55-8999f2120201

https://crash-stats.mozilla.com/report/index/564ee344-899f-4618-9e94-91e742120130
"Reloading a web page that contains a java applet and a flash control."

I just cleaned my profile this morning (prior to the crash) so i don't think its bad profile problem... 

Idk if i am right , but I suspect Bug 90268 to be the cause (no technical basis but seems related)
Comment 3 Scoobidiver (away) 2012-02-02 07:50:01 PST
(In reply to bogas04 from comment #2)
> Idk if i am right , but I suspect Bug 90268 to be the cause (no technical
> basis but seems related)
It was existing before bug 90268 landed but there's a spike in crashes from its landing.

It happens on 64-bit Windows 7 and 32-bit Windows 8 (not officially supported).
It's #7 top crasher in 13.0a1.

The stack is basic:
Frame 	Module 	Signature 	Source
0 	xul.dll 	PluginWndProcInternal 	dom/plugins/base/nsPluginNativeWindowWin.cpp:227

More reports at:
https://crash-stats.mozilla.com/report/list?signature=PluginWndProcInternal
Comment 4 Scoobidiver (away) 2012-02-07 01:45:41 PST
Here is the stack on 32-bit Windows:
Frame 	Module 	Signature [Expand] 	Source
0 	xul.dll 	PluginWndProcInternal 	dom/plugins/base/nsPluginNativeWindowWin.cpp:226
1 	xul.dll 	PluginWndProc 	dom/plugins/base/nsPluginNativeWindowWin.cpp:383
2 	xul.dll 	nsWyciwygChannel::AsyncOpen 	netwerk/protocol/wyciwyg/nsWyciwygChannel.cpp:418
3 	user32.dll 	UserCallWinProcCheckWow 	
4 	user32.dll 	CallWindowProcAorW 	
5 	user32.dll 	CallWindowProcW 	
6 	xul.dll 	PluginWndProcInternal 	dom/plugins/base/nsPluginNativeWindowWin.cpp:354
7 	xul.dll 	CallWindowProcCrashProtected 	xpcom/base/nsCrashOnException.cpp:65
8 	xul.dll 	PluginWndProc 	dom/plugins/base/nsPluginNativeWindowWin.cpp:383
9 	xul.dll 	nsWyciwygChannel::AsyncOpen 	netwerk/protocol/wyciwyg/nsWyciwygChannel.cpp:418
10 	user32.dll 	UserCallWinProcCheckWow 	
11 	user32.dll 	CallWindowProcAorW 	
12 	user32.dll 	CallWindowProcW 	
and so on.
Comment 5 Scoobidiver (away) 2012-02-09 05:18:58 PST
All crash reports I checked have Adblock Plus.
Comment 6 Scoobidiver (away) 2012-06-20 06:35:20 PDT
It's #37 top browser crasher in 13.0.1 and 14.0b7, #51 in 15.0a2 and #58 in 16.0a1.

According to comments, it's related to Flash.
Comment 7 Alex Keybl [:akeybl] 2012-06-22 14:03:21 PDT
Let's grab some URLs, and QA ABP/Win7.

Also including Wladimir so that he's aware of the issue.
Comment 8 Alex Keybl [:akeybl] 2012-06-22 14:03:45 PDT
(In reply to bogas04 from comment #2)
> https://crash-stats.mozilla.com/report/index/bp-34007b90-4622-497f-9a55-
> 8999f2120201
> 
> https://crash-stats.mozilla.com/report/index/564ee344-899f-4618-9e94-
> 91e742120130
> "Reloading a web page that contains a java applet and a flash control."
> 
> I just cleaned my profile this morning (prior to the crash) so i don't think
> its bad profile problem... 
> 
> Idk if i am right , but I suspect Bug 90268 to be the cause (no technical
> basis but seems related)

Bogas - are you still able to reproduce this issue? If so, we may have some follow up questions.
Comment 9 Bob Clary [:bc:] 2012-06-22 16:11:39 PDT
https://www.xforex.com/trade/goToLoginPage.action

Beta/14

bp-edbaa716-133a-440f-b28a-60a5f2120622
  Crash Report [@ _SEH_prolog4 ] 
bp-ed089457-93b0-4e35-8c40-ce2752120622
  Crash Report [@ RtlActivateActivationContextUnsafeFast | UserCallWinProcCheckWow ] 

Aurora/15
bp-25bfeacc-b30a-4e44-b8cb-79d162120622
  Crash Report [@ RtlActivateActivationContextUnsafeFast | UserCallWinProcCheckWow ] 

Nightly/16
bp-19068b3b-22b4-48ef-807b-db04b2120622
  Crash Report [@ CallWindowProcCrashProtected ] 

Adblock not necessary.
Comment 10 Alex Keybl [:akeybl] 2012-06-24 12:40:24 PDT
(In reply to Bob Clary [:bc:] from comment #9)
> https://www.xforex.com/trade/goToLoginPage.action
> 
> Beta/14
> 
> bp-edbaa716-133a-440f-b28a-60a5f2120622
>   Crash Report [@ _SEH_prolog4 ] 
> bp-ed089457-93b0-4e35-8c40-ce2752120622
>   Crash Report [@ RtlActivateActivationContextUnsafeFast |
> UserCallWinProcCheckWow ] 
> 
> Aurora/15
> bp-25bfeacc-b30a-4e44-b8cb-79d162120622
>   Crash Report [@ RtlActivateActivationContextUnsafeFast |
> UserCallWinProcCheckWow ] 
> 
> Nightly/16
> bp-19068b3b-22b4-48ef-807b-db04b2120622
>   Crash Report [@ CallWindowProcCrashProtected ] 
> 
> Adblock not necessary.

Thanks Bob! Given that, adding back qawanted and adding regressionwindow-wanted so that we can find the culprit and consider backing it out.

Already sending to Josh for investigation in the meantime, since bug 90268 is suspected as the culprit.
Comment 11 Bob Clary [:bc:] 2012-06-24 19:36:54 PDT
Found regression between 20110914030906-20110915030845
Pushlog: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=cfd1e3533f0f&tochange=f2a2adaaacba
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2011/09/2011-09-14-03-09-06-mozilla-central/firefox-9.0a1.en-US.win32.zip
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2011/09/2011-09-15-03-08-45-mozilla-central/firefox-9.0a1.en-US.win32.zip

I don't think  bug 90268 is the regressor

For the Firefox 9 20110915030845 build

bp-3ea65c63-f624-403c-be11-7a9f82120625
0 	user32.dll 	_SEH_prolog4 	
1 	user32.dll 	CallWindowProcAorW 	
2 	user32.dll 	CallWindowProcW 	
3 	xul.dll 	xul.dll@0x797ed1 	
4 	xul.dll 	xul.dll@0x44b047 	
5 	xul.dll 	xul.dll@0xa5c92a 	
6 	user32.dll 	UserCallWinProcCheckWow 	
7 	user32.dll 	CallWindowProcAorW 	
8 	user32.dll 	CallWindowProcW

The previous build had a Flash plugin crash (with '262)
Comment 12 Marcia Knous [:marcia - use ni] 2012-06-25 05:12:03 PDT
Currently #32 top crash in 13.0.1 in the last week with 1854 crashes. #37 in 14.0b8 data and #35 in Aurora.
Comment 13 Scoobidiver (away) 2012-06-25 05:31:23 PDT
It's much more with combined signatures: #12 in 13.0.1, #15 in 14.0b8, #17 in 15.0a2.
Comment 14 Steffan Corley 2012-06-29 02:14:32 PDT
This crash )or something related) is really hurting me (bp-2901783b-0b32-4476-b33f-7bfec2120628, bp-08b88139-893a-41ec-bd7c-265b12120629 among others).  Firefox 13.0.1.

Reliably crashes on one win7 x64 machine whenever I visit workflowy.com (and am already logged in).  Disabling the Shockwave Flash plug-in seems to stop the crash happening.

On my other win7 x64 machine, visiting workflowy.com reliably hangs the browser (I have to kill the browser, then the next few restarts also hang) and I don't get to send a crash report.  Again, disabling Flash solves the issue.  I'm getting a lot of hangs with 13.0.1.  Bad enough that I'm considering switching browser...  So I think this issue might be having more impact than the crash stats would suggest.
Comment 15 Benjamin Smedberg [:bsmedberg] 2012-06-29 06:40:57 PDT
Did this get worse with Flash 11.3?

JimM, I know that in the past we had some checks in place to make sure that this didn't happen, but it appears that those checks are limited to realplayer: http://hg.mozilla.org/releases/mozilla-release/annotate/f48d675ffa9f/dom/plugins/base/nsPluginNativeWindowWin.cpp#l237

Do you think that we could just assert/fail if win->GetWindowProc() == PluginWndProcInternal?
Comment 16 Bob Clary [:bc:] 2012-06-29 07:03:32 PDT
I've seen PluginWndProcInternal CallWindowProcCrashProtected PluginWndProc PluginWndProcInternal CallWindowProcCrashProtected 140 times on Windows 7 on all three branches in crash automation beginning around 6/20. The 7 urls are: 2 xforex.com and 3 workflowy.com and 1 victoriassecret.com.
Comment 17 Bob Clary [:bc:] 2012-06-29 07:06:37 PDT
PS re comment 16: that makes it probably related to the Flash 11.3.300.262 update.
Comment 18 Jim Mathies [:jimm] 2012-06-29 07:26:55 PDT
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #15)
> Did this get worse with Flash 11.3?
> 
> JimM, I know that in the past we had some checks in place to make sure that
> this didn't happen, but it appears that those checks are limited to
> realplayer:
> http://hg.mozilla.org/releases/mozilla-release/annotate/f48d675ffa9f/dom/
> plugins/base/nsPluginNativeWindowWin.cpp#l237
> 
> Do you think that we could just assert/fail if win->GetWindowProc() ==
> PluginWndProcInternal?

It's very odd that this happens in the main process. It means somebody is mucking with the parent widget window subclass and things are getting out of sync.

I'll try to reproduce.
Comment 19 Jim Mathies [:jimm] 2012-06-29 09:25:30 PDT
Created attachment 637929 [details] [diff] [review]
catch patch

We've run into this before in the plugin container and in those cases this problem was self healing. So I've put together a patch that shunts to DefWindowProc when this is detected.

I fired off try builds which will report back here when complete. If anyone can reproduce this consistently please take the try builds for a spin to see if this addresses the problem.

On my system I was not able to reproduce and I do not see any activity in the subclassing here using the latest flash.
Comment 20 Bob Clary [:bc:] 2012-06-29 10:22:46 PDT
I've got a reproducible test case and am reducing it now.
Comment 21 Bob Clary [:bc:] 2012-07-01 20:15:06 PDT
Created attachment 638254 [details]
testcase

Still some jquery cruft but as far as I'm going to take it for now.

1. load testcase
2. exit browser
3. stack overflow

If you run it enough times, you'll see the WER for Flash.
Comment 22 Jim Mathies [:jimm] 2012-07-02 03:40:30 PDT
(In reply to Bob Clary [:bc:] from comment #21)
> Created attachment 638254 [details]
> testcase
> 
> Still some jquery cruft but as far as I'm going to take it for now.
> 
> 1. load testcase
> 2. exit browser
> 3. stack overflow
> 
> If you run it enough times, you'll see the WER for Flash.

alternate STR:

1) open test case in a tab
2) open a new tab
3) close the test case tab

bc could you plz test with the fix I posted? Try builds are here - 

https://tbpl.mozilla.org/?noignore=0&tree=Try&rev=30ae38da9830
Comment 23 Bob Clary [:bc:] 2012-07-02 05:20:22 PDT
(In reply to Jim Mathies [:jimm] from comment #22)
> (In reply to Bob Clary [:bc:] from comment #21)

> bc could you plz test with the fix I posted? Try builds are here - 

tbpl won't complete loading for me. It just gets stuck at 9% before failing.
Comment 24 Jim Mathies [:jimm] 2012-07-02 05:25:45 PDT
(In reply to Bob Clary [:bc:] from comment #23)
> (In reply to Jim Mathies [:jimm] from comment #22)
> > (In reply to Bob Clary [:bc:] from comment #21)
> 
> > bc could you plz test with the fix I posted? Try builds are here - 
> 
> tbpl won't complete loading for me. It just gets stuck at 9% before failing.

here are the opt builds:

https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jmathies@mozilla.com-30ae38da9830/
Comment 25 Bob Clary [:bc:] 2012-07-02 05:46:19 PDT
Both opt and debug builds appear to work fine for me on the testcase.
Comment 26 Jim Mathies [:jimm] 2012-07-02 06:13:23 PDT
(In reply to Bob Clary [:bc:] from comment #25)
> Both opt and debug builds appear to work fine for me on the testcase.

Thanks for testing! I wasn't able to reproduce with this patch either.
Comment 27 Virgil Dicu [:virgil] [QA] 2012-07-02 06:41:44 PDT
Removing qawanted for the moment, as STR and a testcase has already been found by bc.
Comment 28 Benjamin Smedberg [:bsmedberg] 2012-07-02 09:51:07 PDT
Comment on attachment 637929 [details] [diff] [review]
catch patch

r=me, I think we should get this landed and backport onto all branches ASAP
Comment 30 Alex Keybl [:akeybl] 2012-07-02 13:34:29 PDT
Comment on attachment 637929 [details] [diff] [review]
catch patch

[Triage Comment]
Benjamin let us know that this was basically the equivalent of a null check. Let's land on Aurora 15 and Beta 14.
Comment 31 Ryan VanderMeulen [:RyanVM] 2012-07-02 15:30:24 PDT
https://hg.mozilla.org/mozilla-central/rev/46a9d74546bc
Comment 33 Paul Silaghi, QA [:pauly] 2012-07-05 07:14:56 PDT
No crash loading the link in comment 9 and the STR in comment 22.
Verified fixed on FF 14b11: Mozilla/5.0 (Windows NT 6.1; rv:14.0) Gecko/20100101 Firefox/14.0
Comment 34 Paul Silaghi, QA [:pauly] 2012-07-19 06:34:40 PDT
Verified fixed on FF 15b1 on Win 7.
Comment 35 Scoobidiver (away) 2012-07-20 09:34:29 PDT
With combined signatures, it's #2 top browser crasher in 10.0.6 ESR.
Comment 36 Lukas Blakk [:lsblakk] use ?needinfo 2012-08-02 16:10:21 PDT
Please nominate for ESR uplift, see https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more details if needed.
Comment 37 Lukas Blakk [:lsblakk] use ?needinfo 2012-08-16 16:31:18 PDT
Sent email as well, this needs to be nominated and landed to ESR in the next couple of days.
Comment 38 Jim Mathies [:jimm] 2012-08-16 16:59:08 PDT
Comment on attachment 637929 [details] [diff] [review]
catch patch

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
crash fix
User impact if declined: 
crashiness
Fix Landed on Version:
Landed 7/2 on central, aurora, and beta
Risk to taking this patch (and alternatives if risky): 
minimal - it protects us from silliness in the flash plugin.
Comment 39 Jim Mathies [:jimm] 2012-08-17 04:17:40 PDT
Are there esr checkin-needed watchers out there? I don't have this cloned locally but can if need be.
Comment 40 Ryan VanderMeulen [:RyanVM] 2012-08-17 06:49:49 PDT
If you put checkin-needed on it, odds are good I'll find it ;-)
Comment 41 Ryan VanderMeulen [:RyanVM] 2012-08-17 17:28:59 PDT
https://hg.mozilla.org/releases/mozilla-esr10/rev/b696066d7c63

Note that the second hunk (with the NS_TRY_SAFE_CALL_RETURN) was a little different than m-c/aurora/beta. Please confirm that the straight overwriting with the new logic was the right way to go.
Comment 42 Jim Mathies [:jimm] 2012-08-18 07:24:46 PDT
(In reply to Ryan VanderMeulen from comment #41)
> https://hg.mozilla.org/releases/mozilla-esr10/rev/b696066d7c63
> 
> Note that the second hunk (with the NS_TRY_SAFE_CALL_RETURN) was a little
> different than m-c/aurora/beta. Please confirm that the straight overwriting
> with the new logic was the right way to go.

That's fine, we've removed a number of NS_TRY_SAFE_CALL_RETURN uses in plugin code as the code they produce is permanently disabled.

http://mxr.mozilla.org/mozilla-central/source/dom/plugins/base/nsPluginSafety.h#32
http://mxr.mozilla.org/mozilla-central/source/dom/plugins/base/nsPluginHost.cpp#212
Comment 43 Paul Silaghi, QA [:pauly] 2012-08-20 07:16:38 PDT
Able to see the issue on 2012-08-16-03-45-00-mozilla-esr10 using the STR in comment 22 on Win 7 x86.
Verified fixed on 2012-08-18-03-45-00-mozilla-esr10.
Comment 44 Marco Castelluccio [:marco] 2016-05-17 05:29:29 PDT
A similar signature (RtlActivateActivationContextUnsafeFast | UserCallWinProcCheckWow) has been spiking recently, with really similar crashes (stack overflows with infinite recursion).

See for example https://crash-stats.mozilla.com/report/index/9cc8c613-d20e-4db9-a9c8-ea7152160517.
Comment 45 Liz Henry (:lizzard) (needinfo? me) 2016-05-17 08:54:52 PDT
Jim, do you think we should file a new bug here for the crash signature showing up (in 46.0.1) in comment 44? This may be worth investigating. What do you think?
Comment 46 Aaron Klotz [:aklotz] 2016-05-17 09:12:03 PDT
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #45)
> Jim, do you think we should file a new bug here for the crash signature
> showing up (in 46.0.1) in comment 44? This may be worth investigating. What
> do you think?

Based on the stack from that crash report, we should file a new bug, yes.
Comment 47 Liz Henry (:lizzard) (needinfo? me) 2016-05-17 09:18:57 PDT
Thanks Aaron, I filed bug 1273555.

Note You need to log in before you can comment on or make changes to this bug.