[OOPP] Crash when visiting quakelive.com using FF 3.6.3plugin1 [@ xul.dll@0xae2b28]

RESOLVED FIXED

Status

()

Core
Plug-ins
--
critical
RESOLVED FIXED
8 years ago
7 years ago

People

(Reporter: Philip, Assigned: Benjamin Smedberg)

Tracking

({crash, verified1.9.2})

Other Branch
x86_64
Windows 7
crash, verified1.9.2
Points:
---

Firefox Tracking Flags

(blocking2.0 alpha4+, blocking1.9.2 .4+, status1.9.2 .4-fixed)

Details

(crash signature, URL)

Attachments

(2 attachments)

(Reporter)

Description

8 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.1; de; rv:1.9.2.3pre) Gecko/20100405 Firefox/3.6.3plugin1
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.1; de; rv:1.9.2.3pre) Gecko/20100405 Firefox/3.6.3plugin1

when Firefox is loading the Website, he crased totaly

Reproducible: Always

Steps to Reproduce:
1. start firefox 3.6.3plugin1
2. open www.quakelive.com
3. see the crash  :/
(Reporter)

Comment 2

8 years ago
bp-d65ed0e7-d371-4bfa-991e-fba5a2100409 (safe mode and without addons/plugins)

bp-e850ae3a-0a45-4a58-94b1-397932100409 (normal)
0  	xul.dll  	xul.dll@0xae2b28  	
1 	xul.dll 	CreateNPAPIPlugin 	modules/plugin/base/src/nsPluginHost.cpp:3947
2 	xul.dll 	nsPluginHost::GetPlugin 	modules/plugin/base/src/nsPluginHost.cpp:4032
Component: General → Plug-ins
Keywords: crash
Product: Firefox → Core
QA Contact: general → plugins
Version: unspecified → 1.9.2 Branch
(Reporter)

Comment 4

8 years ago
sorry, what does that mean?
(Reporter)

Comment 5

8 years ago
and this crash is only in 3.6.3plugin1... not in 3.6.3!
Duplicate of this bug: 558901
Status: UNCONFIRMED → NEW
Ever confirmed: true

Updated

8 years ago
Summary: browser crashed when i opening the URL → browser crashed when i opening the URL [@ xul.dll@0xae2b28]
Version: 1.9.2 Branch → Other Branch
Summary: browser crashed when i opening the URL [@ xul.dll@0xae2b28] → Crash when visiting quakelive.com using FF 3.6.3plugin1 [@ xul.dll@0xae2b28]
#3 crasher for 3.6.3plugin1. Most if not all of the reports have comments mentioning quakelive.
Summary: Crash when visiting quakelive.com using FF 3.6.3plugin1 [@ xul.dll@0xae2b28] → [OOPP] Crash when visiting quakelive.com using FF 3.6.3plugin1 [@ xul.dll@0xae2b28]
(Assignee)

Comment 8

8 years ago
This is worrisome because it's apparently some change in the plugin host. Trying to reproduce.
blocking1.9.2: --- → ?
blocking2.0: --- → alpha4+

Comment 9

8 years ago
philip:
UUID	d65ed0e7-d371-4bfa-991e-fba5a2100409
npquakezero.dll  	0.1.0.277  	6C5BA596189A4DFDB530BE62B823B9041  	npquakezero.pdb

safe mode doesn't disable plugins.

Comment 10

8 years ago
I'm going to hold off on setting this as blocking until more information is known. Leaving the blocking request though so we can look at it in triage
(Assignee)

Comment 11

8 years ago
I can reproduce this 100%. Something is being corrupted in the call to NP_Initialize. Still looking.

Comment 12

8 years ago
I can reproduce this 100% on Windows7 as well with the Quake Live plugin on Lorentz.
For additional info, when I view it on OS X with the Lorentz build, no crash.

Comment 13

8 years ago
I'm going to mark this as blocking while it is investigated more...seems pretty nasty at first blush.

Updated

8 years ago
blocking1.9.2: ? → .4+
status1.9.2: --- → wanted
(Assignee)

Comment 14

8 years ago
This is nasty:

* We're calling NP_Initialize from quakelive using the "correct" WINAPI calling convention, which is callee-pops. This is required by Flash and other plugins, see bug 525605
* The NP_Initialize for quakelive is written using cdecl, which is caller-pops. This leaves ESP off by 4 bytes. This is definitely absolutely a bug in quakelive.

However, this happens to work in previous versions of Firefox, probably because in prior versions of our code the compiler used a frame pointer. The assembly probably then reset ESP before the final return using EBP... trying to verify against a 3.6.3 build.

The workaround here, if we want to try it, is to either force MSVC to de-optimize PluginPRLibrary::NP_Initialize so that it always uses a frame pointer, or to use custom assembly to thunk so that either calling convention continues to work. The latter solution is probably best. In either case, I'm definitely not going to be able to test/post a patch tonight.
Assignee: nobody → benjamin

Comment 15

8 years ago
(In reply to comment #14)
> in prior versions of our code the compiler used a frame pointer

What changed this?

Comment 16

8 years ago
Thanks for the analysis bsmedberg. I'm going to leave this as blocking...it seems to me this bug is worth blowing through code freeze. If others disagree tomorrow, we'll remove the blocking flag and start the beta build w/o a fix.

Comment 17

8 years ago
http://www.quakelive.com/forum/showthread.php?t=43147 is the upstream "bug report"
(Assignee)

Comment 18

8 years ago
The reason we have a frame pointer in nsNPAPIPlugin::CreatePlugin in 3.6.3 is that modules/plugin/base/src is compiled with ENABLE_CXX_EXCEPTIONS, which requires a frame pointer. The function prolog is

// Creates the nsNPAPIPlugin object. One nsNPAPIPlugin object exists per plugin (not instance).
nsresult
nsNPAPIPlugin::CreatePlugin(const char* aFilePath, PRLibrary* aLibrary,
                            nsIPlugin** aResult)
{
6B23C1B2  push        8    
6B23C1B4  mov         eax,offset __ehhandler$?CreatePlugin@nsNPAPIPlugin@@SAIPBDPAUPRLibrary@@PAPAVnsIPlugin@@@Z (6B01C1F1h) 
6B23C1B9  call        _EH_prolog3 (6AFDC312h)
(Assignee)

Comment 19

8 years ago
Created attachment 438759 [details] [diff] [review]
Force the function to use a frame pointer, so that stdcall/cdecl both work, rev. 1
Attachment #438759 - Flags: review?(joshmoz)
Attachment #438759 - Flags: review?(jmuizelaar)
Comment on attachment 438759 [details] [diff] [review]
Force the function to use a frame pointer, so that stdcall/cdecl both work, rev. 1

>+// Some plugins on Windows, notably Quake Live, implement NP_Initialize using
>+// cdecl instead of the documented stdcall. In order to work around this,
>+// we force the caller to use a frame pointer.
>+#if defined(XP_WIN) && defined(_M_IX86)
>+#define ALLOCA_HACK void* foo = _alloca(gCompilerHack);
>+static int gCompilerHack;
>+#else
>+#define ALLOCA_HACK
>+#endif

Maybe we should call this CALLING_CONVENTION_HACK
instead of ALLOCA_HACK 'cause that's makes it more obvious that it's related to dealing with the calling convention instead of allocation. 

The gCompilerHack is a run-time zero to avoid optimization right? It could also use a better name or a comment. I'm also a little worried that the optimizer could still throw the alloca away. Can we add a test plugin that uses the wrong calling convention?
Attachment #438759 - Flags: review?(jmuizelaar) → review+
(Assignee)

Comment 21

8 years ago
Adding a new test plugin is a bit complicated for this. I think we should rely on manual/litmus testing for this for the time being until they can fix it for real. I can certainly rename the variables: I've verified that the compiler does not in fact optimize away the alloca.

Comment 22

8 years ago
Comment on attachment 438759 [details] [diff] [review]
Force the function to use a frame pointer, so that stdcall/cdecl both work, rev. 1

We should limit use of this hack to functions declared with OSCALL - those are WINAPI explicitly. The functions in the NPAPI function table are default calling convention (NPP_New is default, for example).

#if defined(__OS2__)
NPError OSCALL NP_GetPluginData(NPPluginData * pPluginData);
#endif
NPError OSCALL NP_GetEntryPoints(NPPluginFuncs* pFuncs);
NPError OSCALL NP_Initialize(NPNetscapeFuncs* bFuncs);
NPError OSCALL NP_Shutdown();

There are only 4 functions declared OSCALL I think.
(Assignee)

Comment 23

8 years ago
Created attachment 438780 [details] [diff] [review]
Followup
Attachment #438780 - Flags: review?(joshmoz)

Updated

8 years ago
Attachment #438759 - Flags: review?(joshmoz) → review-

Updated

8 years ago
Attachment #438780 - Flags: review?(joshmoz) → review+

Updated

8 years ago
Attachment #438780 - Flags: approval1.9.2.4+

Updated

8 years ago
Attachment #438759 - Flags: approval1.9.2.4+

Comment 24

8 years ago
a=LegNeato for 1.9.2.4. Please land as soon as possible
What was wrong with #pragma optimize( "y", off ) ?
(Assignee)

Comment 26

8 years ago
In other bugs I've tried that, but it appears to work erratically or not at all in PGO builds.
We have indeed noticed a number of issues with our NPAPI implementation and
will be rolling out fixes with our next QUAKELIVE plugin release. 

Thanks for the detailed report.
(Assignee)

Comment 28

8 years ago
http://hg.mozilla.org/mozilla-central/rev/a005f1749e5e
http://hg.mozilla.org/mozilla-central/rev/b6b799f5a0c6

http://hg.mozilla.org/releases/mozilla-1.9.2/rev/0121ed69d08e
Status: NEW → RESOLVED
Last Resolved: 8 years ago
status1.9.2: wanted → .4-fixed
Resolution: --- → FIXED
Comment on attachment 438759 [details] [diff] [review]
Force the function to use a frame pointer, so that stdcall/cdecl both work, rev. 1

>+ * The Initial Developer of the Original Code is
>+ *   Josh Aas <josh@mozilla.com>

This is incorrect. Mozilla employees should use "Mozilla Foundation" as the initial developer in their code. See http://weblogs.mozillazine.org/gerv/archives/2010/02/mpl_initial_developer_for_mozilla_employ.html for more information.
This cleanly repros on my Win7 VM pre-fix. I'll get the official 3.6.4 build
later today or tomorrow to verify the fix.
No crash for me using Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.4) Gecko/20100413 Firefox/3.6.4 (.NET CLR 3.5.30729). Will check using Windows 7 as well if Al does not get a chance.
Using the same machine and the 3.6.4 candidate build, I no longer crash. Verified for 1.9.2.

 Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.4) Gecko/20100413 Firefox/3.6.4
Keywords: verified1.9.2
No longer depends on: 559703
Crash Signature: [@ xul.dll@0xae2b28]
You need to log in before you can comment on or make changes to this bug.