Bug 874475 - Firefox 22, 23 Error loading Extension's js-ctypes dylib in Mac OS X
Status: RESOLVED INVALID
Product: Firefox
Classification: Client Software
Component: Untriaged
Version: 20 Branch
Platform: x86 Mac OS X
Importance: -- normal
Assigned To: Nobody; OK to take it and work on it
Reported: 2013-05-21 08:39 PDT by Jerry Krinock
Modified: 2013-05-29 13:10 PDT

Attachments
My Firefox Extension (106.78 KB, application/octet-stream)
2013-05-21 08:39 PDT, Jerry Krinock

Description Jerry Krinock 2013-05-21 08:39:45 PDT
Created attachment 752218
My Firefox Extension

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/536.30.1 (KHTML, like Gecko) Version/6.0.5 Safari/536.30.1

Steps to reproduce:

Launched Firefox 22.0 in a profile containing my Firefox extension, which in turn contains a dynamic library that the extension loads via js-ctypes.  This extension+dylib loads with no problem if I launch Firefox 21 in this profile.


Actual results:

Error console says: 

Error: couldn’t open library:
/Users/jk/Library/Application Support/Firefox/Profiles/liejtghrj.Me/extensions/firefoxextension@sheepsystems.com/components/SSYFirefoxCTypes.dylib

The extension loads but, of course, the IPC services which my dylib provides don't work. 

If I quit Firefox 22.0  and launch Firefox 21 in this profile, it works fine.  Aurora 23.0a2 also fails.  An earlier version of my extension also fails with Firefox >= 22.0.  The problem is definitely due to a change in Firefox 22.0. 

I’ve read through the Firefox 22 Release Notes and Firefox 22 for Developers but didn’t find any news on js-ctypes or dynamic libraries.  I don’t see anything new in the ctypes documentation either:
https://developer.mozilla.org/en-US/docs/Mozilla/js-ctypes/js-ctypes_reference/ctypes 
This doc says it was last updated a year ago, mid-May 2012. 

Here is the code in my “overlay” .js file that is failing.  The first three lines just get the path to my dylib…

Components.utils.import("resource://gre/modules/ctypes.jsm");
var profileDir = Components.classes["@mozilla.org/file/directory_service;1"].getService(Components.interfaces.nsIProperties).get("ProfD", Components.interfaces.nsIFile).path;
var cTypesDylibPath = profileDir + "/extensions/firefoxextension@sheepsystems.com/components/SSYFirefoxCTypes.dylib";
var ssyFirefoxCTypes = ctypes.open(cTypesDylibPath);

The first three lines are evidently OK, because the path printed in the "couldn't open library" error in the Error Console is correct.

So it looks to me like a serious new bug in ctypes.open(). 


Expected results:

Should have been no error.
Comment 1 Josh Matthews [:jdm] (on vacation until Dec 5) 2013-05-21 09:12:55 PDT
Could you narrow down the regression range with the help of https://github.com/mozilla/mozregression?
Comment 2 Jerry Krinock 2013-05-21 14:55:15 PDT
I'm working on it, Josh.  Apparently mozregression "helpfully" creates a temporary profile and runs in the temporary profile, which, of course, does not have my extension in it.  I'll be back soon.
Comment 3 Jerry Krinock 2013-05-21 15:58:32 PDT
OK, I found the --profile option in mozregression.  Very nice little tool.

The answer is:

Last good nightly: 2013-03-17 Sunday
First bad nightly: 2013-03-18 Monday
Comment 4 Josh Matthews [:jdm] (on vacation until Dec 5) 2013-05-21 22:39:38 PDT
Thanks Jerry! Those dates give us a range like this: http://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2013-03-17&enddate=2013-03-18
Comment 5 Josh Matthews [:jdm] (on vacation until Dec 5) 2013-05-21 22:54:10 PDT
Narrowed it to http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=0b052daa913c&tochange=b03bb3ce8cee based on the data from ftp.mozilla.org. I still don't see anything in the list that makes me suspicious; Jerry, would you mind testing these two nightlies by hand to confirm? ftp://ftp.mozilla.org/pub/firefox/nightly/2013/03/2013-03-17-03-09-23-mozilla-central/ should work fine, while ftp://ftp.mozilla.org/pub/firefox/nightly/2013/03/2013-03-18-03-09-47-mozilla-central/ should show the error.
Comment 6 Benjamin Smedberg [:bsmedberg] 2013-05-22 09:06:01 PDT
Does your .dylib link against any Mozilla libraries/symbols?

This seems like an obvious candidate: Bug 648407 - Fold NSPR, NSS and SQLite libraries all together on B2G, Android, OSX and Windows.

People using ctypes normally would not be linking their library against Mozilla libraries, but perhaps you were trying to use something from NSPR/NSS/sqlite?
Comment 7 Jerry Krinock 2013-05-22 11:41:12 PDT
(In reply to Josh Matthews [:jdm] from comment #5)
> would you mind testing these two nightlies by hand to confirm?

Confirmed.

The nightly 2013-03-17-03-09-23 is OK.
The nightly 2013-03-18-03-09-47 fails.

To double-check my test results, besides looking for that nastygram in Firefox's Error Console, I also check to see whether or not my port is open in Mac OS X, because that's the primary function of my dylib.

sudo launchctl bstree | grep sheepsystems

With the good nightly, that command finds my port.  With the bad nightly, it does not.
Comment 8 Jerry Krinock 2013-05-22 12:01:05 PDT
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #6)
> Does your .dylib link against any Mozilla libraries/symbols?

Yes.

> People using ctypes normally would not be linking their library against
> Mozilla libraries, but perhaps you were trying to use something from
> NSPR/NSS/sqlite?

I link against these…

libmozalloc.dylib
libnspr4.dylib
libxpcom.dylib
libxpcomglue_s.a

It looks like I may be linking against versions of these that I have had hanging around since Firefox 7.  Also, until a couple of years ago, the project target which builds this dylib was building a Binary XPCOM Component.  That is, I converted it from an old Binary XPCOM Component.
Comment 9 Jerry Krinock 2013-05-22 14:05:29 PDT
I've downloaded and examined the xulrunner frameworks for Mac found here…

http://ftp.mozilla.org/pub/mozilla.org/xulrunner/releases/

Libraries have definitely been disappearing…

                       Linked by  --- In xulrunner --
                       my .dylib  7.0   21.0   22.0b2
                       ---------  ---   ----   ------
libmozalloc.dylib         Y        Y      Y      Y    
libnspr4.dylib            Y        Y      Y      N
libxpcom.dylib            Y        Y      Y      N
libxpcomglue_s.a          Y        Y      N      N
-----------------------------------------------------
Can my dylib load?                 Y      Y      N

Yes, so this is pretty strong evidence that the removal of libnspr4 and libxpcom caused the problem.  Should I be linking to different libraries instead?
Comment 10 Josh Matthews [:jdm] (on vacation until Dec 5) 2013-05-22 14:43:52 PDT
Yes, they've all been merged into libxul now.
Comment 11 Jerry Krinock 2013-05-22 14:53:00 PDT
Thank you, Josh.  Is there a document describing what developers should do for Firefox 22?  The first boneheaded solution that comes to mind is to ship two dylibs in my extension: one linked for Firefox <= 21, the other linked for Firefox >= 22.  In my overlay .js file, try to load one, and if that fails try to load the other.  Am I doing something out of the ordinary?
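
For illustration, the try-one-then-the-other idea could be sketched roughly as follows. This is only a sketch under assumptions: openFirstLoadable is a hypothetical helper name, and the opener is passed in as a parameter (in the overlay it would wrap ctypes.open) so that the fallback logic itself does not depend on anything Firefox-specific.

```javascript
// Sketch only: try each candidate library path in order and return the
// first one that opens. `openLibrary` is a stand-in for a function that
// wraps ctypes.open; it must throw when a library cannot be loaded.
function openFirstLoadable(openLibrary, candidatePaths) {
  var errors = [];
  for (var i = 0; i < candidatePaths.length; i++) {
    var path = candidatePaths[i];
    try {
      // Return both the handle and the path, so the caller knows
      // which build of the library was actually loaded.
      return { path: path, library: openLibrary(path) };
    } catch (e) {
      // Remember why this candidate failed and move on to the next one.
      errors.push(path + ": " + e.message);
    }
  }
  // Every candidate failed: report all of the collected failures at once.
  throw new Error("couldn't open any library:\n" + errors.join("\n"));
}
```

In the overlay this might be called as openFirstLoadable(function (p) { return ctypes.open(p); }, [pathLinkedForFx22, pathLinkedForFx21]), where both path variables are hypothetical names for the two shipped dylibs.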
Comment 12 Jerry Krinock 2013-05-22 17:58:17 PDT
Short version: If instead of linking those four old Gecko dylibs into my dylib, I link in "XUL" from XUL.framework in xulrunner 22.0, it almost works.

Initially, I get two linking errors, citing these two undefined symbols…

nsACString::BeginReading() const
nsQueryInterface::operator()(nsID const&, void**) const

And then if I eliminate the following three lines of code from my source…

bytes = nsacString.BeginReading() ;
…
rv = NS_GetServiceManager(getter_AddRefs(servMan));
…
rv = servMan->GetServiceByContractID(
			SHEEP_SYS_JS_XPCOM_CONTRACT_ID, 
			NS_GET_IID(ISheepSysJSXPCOM),
			getter_AddRefs(ourJavascriptComponent)
			) ;

then my dylib builds, and when this modified extension loads into Firefox 22, it is able to load my dylib.  My port shows up as expected.  Of course, it doesn't quite work because those three lines of code are missing.

I've looked at the documentation for nsACString and nsQueryInterface and don't see any indications of deprecation.

In reading Bug 648407, near the end, Mike Hommey says "Extension developers … using these libraries … through ctypes … need to adapt …"

I think he's talking about me?

How do I "adapt"?  Find a replacement for those three lines of code?  Link in another mozilla library which defines nsACString and nsQueryInterface?  More fundamentally, if I am doing something extraordinary, can I get back on the beaten path?
Comment 13 Benjamin Smedberg [:bsmedberg] 2013-05-23 09:18:22 PDT
The point of js-ctypes is that it's useful for system binaries or your own binaries that don't use XPCOM. I can't quite figure out why you'd use ctypes and then continue to use XPCOM: that defeats the point, since XPCOM is not binary-stable. Mike's comment was about extensions using NSPR or NSS, which are binary-stable APIs (although it's still not really recommended to use ctypes against them).
Comment 14 Jerry Krinock 2013-05-23 10:01:53 PDT
Thank you, Ben.

(In reply to Benjamin Smedberg  [:bsmedberg] from comment #13)
> I can't quite figure out why you'd use ctypes and then continue to use XPCOM: 
> that defeats the point, since XPCOM is not binary-stable.

Actually, there is a good point, which is that when my dylib was a Binary XPCOM, starting with Firefox 5 or 6 or something, Firefox would refuse to load my binary unless it was compiled with the latest xulrunner, which meant, if you recall, that extensions were going to have to be updated every 6 weeks.  I was told that, to solve this problem, I should convert it to a dylib using js-ctypes, and that did solve the problem in the sense that I've since survived 15 Firefox versions without needing to update my extension.  What I read now is that you're telling me I didn't go far enough; that I also need to go through my source code and eliminate calls to XPCOM.

> Mike's comment was about extensions using NSPR or NSS, which are binary-stable APIs (although it's still not really recommended to use ctypes against them).

OK, I understand.  It looks like maybe I am using NSPR (libnspr4.dylib?).

* * *

So, if I understand you correctly, the conclusion is that this bug is not really a bug, it's that I'm Doing It Wrong.  I need to see if I can avoid using XPCOM API in my dylib, and if I need help with that I'll ask on the Developer Forum.

One "bug" that could possibly be fixed is the error returned to JavaScript when a dynamic library cannot be loaded.  I think that Mac OS X tells what failed when a library won't load, so that instead of seeing in the Error Console that "Extension xxx couldn't load its dynamic library at path yyy", the user should instead see that "Extension xxx couldn't load its dynamic library at path yyy because the library libnspr4 could not be loaded."  As a bonus, "This version of Firefox does not support libnspr4."
Comment 15 Benjamin Smedberg [:bsmedberg] 2013-05-23 11:48:56 PDT
> What I read now is that you're telling me I didn't go far enough; that I
> also need to go through my source code and eliminate calls to XPCOM.

Yes, mostly. The reason that we required people to recompile XPCOM binaries for each release is that we don't guarantee binary compatibility between releases any more, and so it's exceedingly likely that code using XPCOM will crash unless it is recompiled. It's rather amazing that you've survived this long; I'm guessing that you're only using your own interfaces and the component/service manager and not much else?

> So, if I understand you correctly, the conclusion is that this bug is not
> really a bug, it's that I'm Doing It Wrong.  I need to see if I can avoid
> using XPCOM API in my dylib, and if I need help with that I'll ask on the
> Developer Forum.

That is what I strongly suggest, yes. It might be possible to fix the remaining errors (string and the component manager) by changing a few more compile flags, but long-term it's going to be painful and unstable.

> One "bug" that could possibly be fixed is the error returned to JavaScript
> when a dynamic library cannot be loaded.  I think that Mac OS X tells what
> failed when a library won't load, so that instead of seeing in the Error
> Console that "Extension xxx couldn't load its dynamic library at path yyy",
> the user should instead see that "Extension xxx couldn't load its dynamic
> library at path yyy because the library libnspr4 could not be loaded."  As a
> bonus, "This version of Firefox does not support libnspr4."

I'd like this too, where possible. We should at least understand what the current JSCtypes code throws when it cannot load a library and improve the error messages. Please file this as a separate bug in Core::js-ctypes.
Comment 16 Jerry Krinock 2013-05-29 13:10:24 PDT
Thanks again, Benjamin.  The requested follow-up bug has been filed as Bug 877349.
