Closed Bug 57885 Opened 24 years ago Closed 24 years ago

crash in nsPluginHostImpl::GetPluginFactory with dual legacy/XPCOM plugin

Categories

(Core Graveyard :: Plug-ins, defect, P1)

PowerPC
Mac System 9.x
defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: nhart, Assigned: bnesse)

References

()

Details

(Keywords: crash, Whiteboard: [branch accept][rtm-])

Attachments

(5 files)

With pre-release version of Mac RealPlayer plugin (implementing legacy and XPCOM 
plugin APIs) the browser crashes when it tries to obtain the plugin's 
NSGetFactory entrypoint.  The plugin properly registers itself, but the browser 
does not seem to get a valid mLibrary member for the plugin.  It crashes in 
PR_FindSymbol because the library is null.

extra info:
-----------
It seems that the plugin is properly registered (at least, it looks that way to 
me) but that mozilla's internal bookkeeping mechanisms for the plugins do not 
correctly set up the structure that contains the plugin's info.  It contains the 
mime type, plugin name, desc and filename, but the mLibrary member is null.  
PR_FindSymbol is unable to find the NSGetFactory entrypoint because the library 
is null, and then the browser crashes.

note:
-----
I will try to provide a sample plugin that assists in reproing this behavior as 
soon as possible.
Confirming to put on radar; Nick knows of what he writes. Marking rtm and
realplayer and crash. P1 as this is a crasher affecting high-profile partner
(Real, #1 plug-in on the web) and will inevitably affect high-profile sites that
use Real content.

Brian, Steve -- Any chance you could take this on while Andrei grapples with
Acrobat form submission? 
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: crash, realplayer, rtm
Priority: P3 → P1
I cannot reproduce this by the following steps. DId I miss something
1. Use 4.x download 
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=17934 , save it into 
the plugins directorya and name it "sample"
2. launch Mozilla
I don't see any crash. 
I also do the following
1. visit http://pages.prognet.com/onesies/nhart/test.html 
I don't see any crash
Do I need to install something first, if so where can I find them and how should 
I install them ?
Hum... Here is the problematic code that Nicholas descirbed-
2829 NS_IMETHODIMP nsPluginHostImpl::GetPluginFactory(const char *aMimeType, 
nsIPlugin** aPlugin)
2830 {
2831   nsresult rv = NS_ERROR_FAILURE;
2832   *aPlugin = NULL;
2833 
2834   if(!aMimeType)
2835     return NS_ERROR_ILLEGAL_VALUE;
2836 
2837   // If plugins haven't been scanned yet, do so now
2838   LoadPlugins();
2839 
2840   nsPluginTag* pluginTag;
2841   if((rv = FindPluginEnabledForType(aMimeType, pluginTag)) == NS_OK)
2842   {
2843 
2844 #ifdef XP_WIN // actually load a dll on Windows
2845 
2846 #ifdef NS_DEBUG
2847   printf("For %s found plugin %s\n", aMimeType, pluginTag->mFileName);
2848 #endif
2849 
2850     nsFileSpec file(pluginTag->mFileName);
2851 
2852     nsPluginFile pluginFile(file);
2853     PRLibrary* pluginLibrary = NULL;
2854 
2855     if (pluginFile.LoadPlugin(pluginLibrary) != NS_OK || pluginLibrary == 
NULL)
2856       return NS_ERROR_FAILURE;
2857 
2858     pluginTag->mLibrary = pluginLibrary;
2859 
2860 #endif

Notice that line 2858 is inside a 2844 #ifdef XP_WIN // actually load a dll on 
Windows 
ifdef. In other word, pluginTag->mLibrary maybe null in the other platforms.

Maybe we should add two lines in 2861 so we won't crash mac / Linux
if(nsnull == pluginTag->mLibrary)
   return NS_ERROR_FAILURE;

Since I cannot reproduce the crash, I don't feel comfortable to make such 
suggestion.  We should reproduce the crash first.

add amusli@netscape.com,beard@netscape.com,av@netscape.com,waterson@netscape.com 
to the cc list
can anyone else reproduce this?
The plugin is an XPCOM *and* legacy plugin and should go in the *components* 
directory-- NOT plugins.

Note: this is also blocking RealNetworks from further development since we 
can't get Mozilla to load our plugin as an XPCOM component without immediately 
crashing the browser.
added my program manager to the CC list
Escalating severity to blocker as this is blocking RealNetworks, a high-profile
partner, on work on RealPlayer for Mac; RealPlayer is the #1 most widely-used
plug-in.
Severity: critical → blocker
PDT would very much like to see a patch and risk assessment for this bug ASAP.
I haven't looked into this bug per se, but I did discover something yesterday on 
my own litte bug hunt. XPCOM plug-ins are cached into the Component Registry 
along with all the other components. If you remove a XPCOM plugin from the 
components directory, it still shows up in the list of mime types that can be 
handled by plug-ins. The net result of this is that the program crashes while 
trying to create a legacy plugin after failing to open the (non existent) XPCOM 
plugin. In this situation, GetPluginFactory is the calling function.
Brian's comments are interesting, and it may be somehow related to this bug, 
but the repro steps are different.  From my own debugging of the browser (which 
is limited since I really don't know much about it) it seems that our XPCOM 
component successfully registers itself (using 
nsIComponentManager::RegisterComponent *and* 
nsIPluginManager::RegisterPlugin.)  Then, without removing it from the 
components directory the browser crashes when it tries to create an instance of 
the plugin factory.
I forgot to mention one thing (or at least point it out). The crash occurs when 
you load a page which contains an embed for the removed plugin. The reason I 
mentioned this was in case your test case might have more than one plugin 
instance in it and the crash is occuring on the other embed.
In the provided test case there is only one EMBED, and the crash happens as the 
browser tries to load the component that contains the plugin for the EMBED's 
mime type.
Nick, could you put up a new attachment that is a .sit archive. The plugin you 
attached doesn't have a resource fork, thus I can't even get it to load. I hacked 
in a 'cfrg' resource from another plugin, which has me past registration, but I 
don't know how valid the test is this way.
It seems that the reason for the crash is because the plugin is attempting to 
register itself using the (as I understand it) depricated nsGetFactory method 
instead of using the nsGetModule method (which is what I understand to be the new 
way of doing it.) For the plug-in to work correctly using the nsGetFactory method 
it needs to be in the "Plug-ins" directory (the MRJ Plugin works this way.)

Nick, I have a Mac version of the "Simple" project if you want to see it. It's 
not very clean, but it "does the right thing".

Marking bug INVALID as per ekrock's request.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → INVALID
Attached file my npentry.cpp
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
I have attached a copy of our npentry.cpp that shows how we register our 
plugin.  This is almost identical to the sample plugin *and* it works fine on 
Windows.  I don't understand your comments about nsGetFactory or nsGetModule.  
We register our plugin with the component manager and plugin manager.  We do 
implement the NSGetFactory() entrypoint (just like the sample plugin) and I 
have heard nothing about replacing it with an NSGetModule() function.  Finally, 
the browser crashes when it tries to find a symbol in a shared library (our 
plugin) that it has not yet loaded.  So I also don't understand how our 
entrypoint relates to NSGetFactory/NSGetModule... it seems to me that there is 
a problem in the Mac implementation of the plugin host.

If this bug really is resolved and invalid then I'm going to need a little more 
information or guidance on how to fix it.
Reassigning to myself so I can find it easier :)
Assignee: av → bnesse
Status: REOPENED → NEW
The crash part of this seems to be fixable with a null pointer check.  Could
someone _please_ attach a patch like that to this bug and get reviews?  It's
still possible to get that type of patch into an RTM respin (if there is another.)

Adding [rtm need info] to encourage someone to answer previous question...
Whiteboard: [rtm need info]
There is some argument in this bug that this might be solvable on the plug in
side.  Is that the case?  Are there any workarounds for the user?
It was supposed to be solvable on the plug-in side, but there seem to be other 
issues which are making this a less than desirable solution. I believe I have a 
potential fix, but I am currently waiting for my pull to finish building before I 
can verify it.
Status: NEW → ASSIGNED
Attached patch Proposed patchSplinter Review
The proposed patch insures that the Library has been loaded on all platforms 
before attempting to use it. The previous implementation only loaded on Windows 
because it wasn't loaded previously in ScanPluginsDirectory().
I looked at the patch and it seems to be doing right thing: getting mLibrary 
when it is needed but not previously set. Let me test it first before giving r=.
To fix the windows crash, could you just add the null check rather than changing 
the IFDEFs?
It doesn't crash on Windows. If we add just null check we prevent it from 
crashing on Mac, but plugin wil not get loaded.

I tried the chenge on Windows and Linux, and it is working. The debug output 
string should probably be outside this new if() statement. r=av.
sr=waterson
Whiteboard: [rtm need info] → [rtm+] r=av sr=waterson
Fix checked in to trunk. All platforms now load the library if it wasn't loaded 
previously in ScanPluginDirectory().
Status: ASSIGNED → RESOLVED
Closed: 24 years ago24 years ago
Resolution: --- → FIXED
Whiteboard: [rtm+] r=av sr=waterson
Keywords: vtrunk
Whiteboard: [rtm-]
Hate to beat a dead bug, but what about the other XP_WIN #ifdef in here:

http://lxr.mozilla.org/seamonkey/source/modules/plugin/nglsrc/nsPluginHostImpl.c
pp#2989
I think this is allright, as on Windows we don't need to load the dll in order 
to read the info off the plugin. Thus we save a significant amount of 
browser start up time. We should probably make Mac go this way too if reading 
the resource fork does not require loading the library. I am not sure. Brian?
Reading the resource fork works ok without calling LoadPlugin first. But the 
plugins subsequently fail to register correctly. I won't go into great detail, 
but in a nutshell... On the Mac the LoadPlugin() call in ScanPluginsDirectory() 
loads plugins in the "Plug-ins" directory and the LoadPlugin() call in 
GetPluginFactory() loads plugins in the "Components" directory. The code needs to 
remain intact.
I've just found out that why Mozilla0.7 is crashing on Solaris. It crashed at
the printf on line 2882 ofnsPluginHostImpl.cpp, because the second printing
item, pluginTag->mFileName, was NULL.

2881     #ifdef NS_DEBUG
2882       printf("For %s found plugin %s\n", aMimeType, pluginTag->mFileName);
2883     #endif

The printf used to be covered by a preceding "#ifdef XP_WIN", therefore was not
called in Solaris. But the statement was removed in this bug fix.
One possible solution would be to modify the printf statement to something like:

printf("For %s found plugin %s\n",                       
pluginTag->mFileName?pluginTag->mFileName:"NONE");

I made the change, and the browser came up OK. 
Should I write a new bug or re-open this bug? I am leaning towards writing a new
bug, since I believe the fix did resolve the targeted serious problem on Mac.

So... plugins on Solaris don't have filenames? There are 2 other references to 
pluginTag->mFileName inside that case statement. They both look to be potentially 
troublesome if this field is NULL. I believe further investigation is in order 
here.
After some investigation, it appears that the root of the problem can be traced 
to nsPluginFile::GetPluginInfo() in nsPluginsDirUnix.cpp. This function is not 
placing the file name into the nsPluginInfo stucture like it does on the Mac, 
Windows, and OS2 plaforms. The BeOS platform is failing to do this as well.

To answer the original question, I believe this should be opened up as a new bug 
(probably 2 bugs ;).
The UNIX version of this problem may already be addressed by bug 51376 jgaunt is 
looking into this possibility.
51376 seems to be a different problem.


It appears that pluginTag->mFileName can be NULL on Solaris, and it seems have
been working fine till now. 
The problem here is the side effect of a sound fix. The printf used to be call
only for Windows. If it is now changed to open it to all platforms, need to make
sure the change still works for all other platforms. If testing for all
platforms is too much, and the change is really needed in Mac (in addition
to Windows), then only open the code to Windows and Mac (or whatever the needed
ones), instead of opening the change to all platforms.
the problem stems from not getting the filename in nsPluginsDirUnix.cpp. For
some reason on Windows and Mac ( and BeOS I think too ), we get the filename.
This isn't done in the Unix file. It can be easily added by a call:
info.fFileName = PL_strdup(this->GetCString());
in the GetPluginInfo() method. I have tested this on HP with Mozilla and it
works ( the filename gets set starting at root - /builds/... ).

But I don't see where this field is needed. ??? It doesn't seem to change any of
the loading on HP of the default plugin ( don't have any others to try it on ),
except to set the filename.  We do not enter the block of code that was changed
in the fix for this bug ( in nsPluginHostImpl.cpp ), where the #ifdef ( XP_WIN )
was removed, so apparently the plugin is loading elsewhere successfully.
mFileName in nsPluginTag can be NULL, and in some cases it should be NULL. See
the following code in the same file:
...
// currently 'unwanted' plugins are Java, and all other plugins except
// RealPlayer, Acrobat, Flash
static PRBool isUnwantedPlugin(nsPluginTag * tag)
{
  if(tag->mFileName == nsnull)
    return PR_TRUE;
...

If we want to print something that can be NULL, we better cover the NULL case.

One more problem, currently in the trunk, the #ifdef seems being moved to the
right, instead of in the first colume:
2878   nsPluginTag* pluginTag;
2879   if((rv = FindPluginEnabledForType(aMimeType, pluginTag)) == NS_OK)
2880   {
2881     #ifdef NS_DEBUG
2882       printf("For %s found plugin %s\n", aMimeType, pluginTag->mFileName);
2883     #endif
2884 

#ifdef and #endif there should be (and had been) in the first colume.

in the isUnwantedPlugin() case, the filename should be NULL so that the plugin 
does not exist at all so far as mozilla is concerned.  I don't think that 
particular example has any relavance to the argument.
The relavance is that mFileName can be NULL. If that is true, then we need to
due with NULL in the printf.
My point was that when isUnwantedPlugin sets the name to null, nothing gets 
done with it, it's as if the pluginTag itself is nsnull.
To clarify Sean's point... IsUnwantedPlugin gets called from 
ScanPluginsDirectory. If the plugin is deemed to be "Unwanted" it is not added to 
the list of valid nsPluginTag items stored in "mPlugins".

The function call FindPluginEnabledForType (called from GetPluginFactory right 
before the printf) walks the "mPlugins" objects and therefore should only contain 
vaild nsPluginTag objects.

As far as I can tell, passing NULL to the nsFileSpec constructor two lines below 
the printf will guarantee a NS_ERROR_FAILURE condition if the library hasn't 
already been loaded. If we do get past this point there are two remaining 
options. First, if the plugin library has a NSGetFactory symbol, it will get 
Initialized. If on the other hand, it is a 4.x legacy plugin, it will either fail 
or crash in the "ns4xPlugin::CreatePlugin" call which also requires the filename 
parameter.

All things considered, I can't think of an instance where a valid plugin would 
not have a file name associated with it.
To get back to John's question. The plugins are getting loaded correctly by 
ScanPluginsDirectory() [called from LoadPlugins()]. This code does a directory 
iteration on the plugin directory, opens the plugin file, and then creates the 
nsPluginTag based on this information.

If the plugin library were ever to be unloaded (can this happen?) the only way to 
reload it (without scanning the entire plugin directory again) would be via the 
name stored in the mFileName field.
It appears that this issue was added to Bugzilla back in December as bug 62489. I 
will move the pertinent information to that bug.
Depends on: 62489
mFileName here has been null ,in this case for Java plugin, at least on Unix,
and it seems been working OK (till this change).

Below is the code before and after the fix. Why "#ifdef XP_WIN" was removed? If
the fix is for Mac only, then would it be better just to add this code to MAC,
instead of adding it to all platforms? By the way, "#ifdef NS_DEBUG" was shifted
to the right, should it be in the first colume?

Before the fix (1.84):
======================
...
    nsPluginTag* pluginTag;
    if((rv = FindPluginEnabledForType(aMimeType, pluginTag)) == NS_OK)
    {
#ifdef XP_WIN // actually load a dll on Windows

#ifdef NS_DEBUG
  printf("For %s found plugin %s\n", aMimeType, pluginTag->mFileName);
#endif

    nsFileSpec file(pluginTag->mFileName);

    nsPluginFile pluginFile(file);
    PRLibrary* pluginLibrary = NULL;

    if (pluginFile.LoadPlugin(pluginLibrary) != NS_OK || pluginLibrary == NULL)
      return NS_ERROR_FAILURE;

    pluginTag->mLibrary = pluginLibrary;

#endif
...


After the fix (1.85):
=====================
...
   nsPluginTag* pluginTag;
   if((rv = FindPluginEnabledForType(aMimeType, pluginTag)) == NS_OK)
   {
     #ifdef NS_DEBUG
       printf("For %s found plugin %s\n", aMimeType, pluginTag->mFileName);
     #endif
 
     if (nsnull == pluginTag->mLibrary)    // if we haven't done this yet
     {
       nsFileSpec file(pluginTag->mFileName);
 
       nsPluginFile pluginFile(file);
       PRLibrary* pluginLibrary = NULL;
 
       if (pluginFile.LoadPlugin(pluginLibrary) != NS_OK || pluginLibrary ==
NULL)
         return NS_ERROR_FAILURE;
 
       pluginTag->mLibrary = pluginLibrary;
     }
...
No longer depends on: 62489
Just because we have not run into ( or noticed ) major trouble with mFileName
being null to this point, doesn't mean it's right to not populate the field.
Developers should expect that the PluginInfo is fully populated no matter what
platform you are on. On all other platforms we populate this field. But I
digress since this bug is about a mac fix, not about Unix plugin FileName
problems, please post to bug 64289
Depends on: 62489
No longer depends on: 62489
Questions:
1) Why does Unix platform need mFilename here? Just because Windows might need
it, doesn't mean all other platform also needs it?
2) The side effects of the fix of this bug (adding the printf to Solaris) that
broke Mozilla0.7 on Solaris, and that is why all these comments are here. I
think we should be careful about side effects. The not needed (so far I have'nt
seen why it is needed on Unix) nFileName is null is not the cause of this crash,
the added printf statement is. Should we be more careful about ripple effects?
3) What about the shifted "#ifdef NS_DEBUG" statements by this fix? Should that
need be corrected? This is the third time I asked, but have not seen any
response. 

To wrap up, do we really want to expose this section of code to all platforms,
instead of keeping it to the ones that need it?
Answers:
1) As I have noted here and in bug 64289 there are numerous reasons why. Type 
"about:plugins" in the url bar. You will see a description of whatever plugins 
you have installed. On any other platform you will see the name of the plugin, 
the mimetypes it supports, and the path to its installation location. On the Unix 
builds you get the name and the mimetypes, but the installation path is 
conspicuously absent. This is because mFileName is NULL. This is a bug, not a 
feature.

Secondly, if the plugin library were to ever be unloaded, the information in 
mFileName would be the only way to reload the library without rescanning and 
reloading ALL of the plugins.

And finally, if any of the other code in this function which is dependent on 
mFileName was ever executed, it will have unpredictable and/or undesirable 
results.

2) While it may be technically true that adding the printf is what caused the 
crash, this is in no way proof that mFileName is not needed on Unix. See answer 
#1. Fixing debug code so it doesn't crash is not a solution. Fixing the 
underlying problem so that NO code crashes is the solution.

3) If by "shifted" you are refering to why is it now above the (nsnull == 
pluginTag->mLibrary) check. It is because after reviewing the original patch 
proposal, the module owner suggested moving it. See the attachments "Proposed 
Patch" and "Patch with #ifdef moved".

If you are refering to the fact that the #ifdef is no longer left justified in 
the file... Yes, I moved it in from the left margin, but this has no effect on 
the compiled code. It can certainly be moved back to the left margin if desired.
Adding branch accept to status whiteboard.
.Angela...
Whiteboard: [rtm-] → [branch accept][rtm-]
Regarding indented #ifdefs and other C pre-processor directives: before ANSI,
many Unix compilers (all?) did not support leading white space -- instead they
required that all directives start with # in column 1, even if there were white
space after the # and before the directive keyword.

I don't believe Mozilla supports any platforms with pre-ANSI-C compilers, or
with otherwise retrograde-buggy compilers.  But it would be good to run this
question by the mozilla.builds newsgroup, and to watch tinderbox for both
SeaMonkey and SeaMonkey-Ports after checkin.

/be
We should definetely get this into the branch, since Real people seem to expect 
their plugin to work in the next release. 
Patch checked into the branch, with the whitespace removed just in case.
Tested on mac trunk and branch builds(20010123). "about:plugins" lists the xpcom 
realplayer plugin (inside the components folder) and there is no crash loading 
the test url. However, no plugin controls appear and no sound is heard, hope 
that should not be a problem here since the plugin is not yet fully 
developed.Marking VERIFIED.
Status: RESOLVED → VERIFIED
Keywords: vtrunk
It seems to work now (at least, the browser isn't crashing as soon as it tries 
to create an instance of the plugin-- the browser is still crashing later 
during the life cycle of our plugin, which is a separate issue.)
Status: VERIFIED → CLOSED
reopening bug.(VERIFIED-FIXED) is the state bugs should be left in. 
Status: CLOSED → REOPENED
Resolution: FIXED → ---
marking RESOLVED/FIXED
Status: REOPENED → RESOLVED
Closed: 24 years ago24 years ago
Resolution: --- → FIXED
Verified/ Fixed (sorry for the spam)
Status: RESOLVED → VERIFIED
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: