Closed Bug 1034979 Opened 10 years ago Closed 10 years ago

WeaveCrypto unable to initialize on Mac with Brew's system NSS installed

Categories

(Core :: Security: PSM, defect)

33 Branch
x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla36

People

(Reporter: tgkokk, Assigned: hectorz)

References

Details

Attachments

(2 files, 1 obsolete file)

Attached file Initial log
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:33.0) Gecko/20100101 Firefox/33.0 (Beta/Release)
Build ID: 20140705030203

Steps to reproduce:

I set up sync (logged in with my account).


Actual results:

When trying to sync, I get an "unknown error" message, while in the initial sync log (before any sync happens) I get a message that a check has failed so Weave has been disabled.


Expected results:

Sync should work fine.
Component: Untriaged → Sync
Some crypto libraries where probably missing from my OS X. When I reinstalled it, the problem was fixed. Sorry for any inconvenience.
Status: UNCONFIRMED → RESOLVED
Closed: 10 years ago
Resolution: --- → INVALID
Actually, no, unfortunately, the problem isn't fixed. I sincerely apologize for any email updates.
Status: RESOLVED → UNCONFIRMED
Resolution: INVALID → ---
rnewman, any idea what's up here?
Flags: needinfo?(rnewman)
Are you running your own build, or a Nightly?

This error message implies that the Firefox crypto libs either are broken or aren't present. The usual reason for that is a user is on Linux and their system NSS is out of date, but on Mac the only obvious culprits would be a bad custom build, or some kind of dodgy kernel module.
Flags: needinfo?(rnewman) → needinfo?(t.kokkoris)
(In reply to Richard Newman [:rnewman] from comment #4)
> Are you running your own build, or a Nightly?
> 
> This error message implies that the Firefox crypto libs either are broken or
> aren't present. The usual reason for that is a user is on Linux and their
> system NSS is out of date, but on Mac the only obvious culprits would be a
> bad custom build, or some kind of dodgy kernel module.

I'm running Nightly (straight from Mozilla's servers), but the error is still there on stable.

I also got into Safe Mode (i.e. no kernel extensions and no application autostarting), started a new Firefox profile (stable), logged in but the problem persists. I checked on another user, just to be sure.

As I mentioned earlier, I did yesterday an OS X reinstall (the one where you don't lose your data), so I don't think I'm missing a crypto library (also, the most extreme thing I've ever done is delete xcrun (reinstalled again after that) and some of my /usr/local or my ~/Library (can't remember), nothing on /).
Flags: needinfo?(t.kokkoris)
Theodore: could you open a Browser Console and evaluate

  Weave.Svc.Crypto.nss.PK11_GenerateRandom(Weave.Svc.Crypto._getRandomByteBuffer(32), 32)

for me? If all is well, it should return 0.
Flags: needinfo?(t.kokkoris)
Summary: Weave gets disabled when setting up sync → WeaveCrypto unable to initialize on Mac 10.9
(In reply to Richard Newman [:rnewman] from comment #6)
> Theodore: could you open a Browser Console and evaluate
> 
>  
> Weave.Svc.Crypto.nss.PK11_GenerateRandom(Weave.Svc.Crypto.
> _getRandomByteBuffer(32), 32)
> 
> for me? If all is well, it should return 0.

It returns -1.
Flags: needinfo?(t.kokkoris)
-1 in this context is probably SECFailure returned from PK11_GetBestSlot, but understanding why would involve more context than I have available.

Brian, any insight?

In summary: the js-ctypes call into PK11_GenerateRandom is returning -1 on Theodore's Mac (10.9), regardless of Firefox version, and in Mac safe mode with a clean profile.

I can't reproduce on 10.8.
Flags: needinfo?(brian)
(In reply to Richard Newman [:rnewman] from comment #8)
> -1 in this context is probably SECFailure returned from PK11_GetBestSlot,
> but understanding why would involve more context than I have available.
> 
> Brian, any insight?
> 
> In summary: the js-ctypes call into PK11_GenerateRandom is returning -1 on
> Theodore's Mac (10.9), regardless of Firefox version, and in Mac safe mode
> with a clean profile.
> 
> I can't reproduce on 10.8.

When a function returns SECFailure it calls (is supposed to call) PR_SetError() with an error code describing the problem. After the call to PK11_GenerateRandom() that returns SECFailure, what is the value of PR_GetError()? What is the value of PR_ErrorToName(PR_GetRrror())?

My guess is that either the user has smartcards (or smartcard PKCS#11 modules) that are interfering, and/or the NSS isn't able to initialize with a read/write database, so it puts itself into read-only mode. A log generated by
NSPR_LOG_MODULES="pipnss:5" NSPR_LOG_FILE=mylog.txt would be useful.
Flags: needinfo?(brian)
Thanks, Brian.

OK, Theodore: let's start with the error code.

Try this:

Cu.import("resource://gre/modules/ctypes.jsm");
let path = ctypes.libraryName("nss3");
let nsslib = ctypes.open(path);
let PR_GetError = nsslib.declare("PR_GetError", ctypes.default_abi, ctypes.int);
let PR_ErrorToName = nsslib.declare("PR_ErrorToName", ctypes.default_abi, ctypes.unsigned_char.ptr, ctypes.int);

let nss = Weave.Svc.Crypto.nss;
let result = nss.PK11_GenerateRandom(Weave.Svc.Crypto._getRandomByteBuffer(32), 32)
if (result < 0) {
    console.log("Error: " + PR_ErrorToName(PR_GetError()).readString());
}

What does it print?
Flags: needinfo?(t.kokkoris)
(In reply to Theodore Kokkoris [:tgkokk] from comment #7)
> (In reply to Richard Newman [:rnewman] from comment #6)
> > Theodore: could you open a Browser Console and evaluate
> > 
> >  
> > Weave.Svc.Crypto.nss.PK11_GenerateRandom(Weave.Svc.Crypto.
> > _getRandomByteBuffer(32), 32)
> > 
> > for me? If all is well, it should return 0.
> 
> It returns -1.

I see this too on my Mac OS X 10.9.3 installation with the latest Nightly build, where the Weave log complains:

1404760876394	Sync.Service	INFO	Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:33.0) Gecko/20100101 Firefox/33.0
1404760876401	Sync.Service	DEBUG	Crypto check failed: [Exception... "PK11_GenrateRandom failed"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: resource://gre/modules/services-crypto/WeaveCrypto.js :: WeaveCrypto.prototype.generateRandomBytes :: line 550"  data: no]
1404760876402	Sync.Service	INFO	Could not load the Weave crypto component. Disabling Weave, since it will not work correctly.

According to my Android Nightly installation, my Mac Nightly last synced on June 1, which might have been around the time I upgraded from 10.9.2 to 10.9.3.


(In reply to Brian Smith (:briansmith, was :bsmith; NEEDINFO? for response) from comment #9)
> A log generated by
> NSPR_LOG_MODULES="pipnss:5" NSPR_LOG_FILE=mylog.txt would be useful.

I've restarted Nightly with these environment variables set, but calling Weave.Svc.Crypto.nss.PK11_GenerateRandom from the Browser Console doesn't write anything to the log.


(In reply to Richard Newman [:rnewman] from comment #10)
> Cu.import("resource://gre/modules/ctypes.jsm");
> …
> What does it print?

For me it prints "Error: cannot read contents of null pointer".
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(t.kokkoris)
(In reply to Myk Melez [:myk] [@mykmelez] from comment #11)
> I've restarted Nightly with these environment variables set, but calling
> Weave.Svc.Crypto.nss.PK11_GenerateRandom from the Browser Console doesn't
> write anything to the log.

Erm, I spoke too soon, as Nightly apparently did write this to the log, probably on shutdown, although, strangely, it was preceded by a bunch of NULL characters (which caused the file to be interpreted as binary):

2074120976[1003282d0]: receiving network teardown topic
2074120976[1003282d0]: in PSM code, receiving change-teardown
2074120976[1003282d0]: receiving profile change topic
2074120976[1003282d0]: nsNSSComponent::ShutdownNSS
2074120976[1003282d0]: evaporating psm resources
2074120976[1003282d0]: NSS SHUTDOWN FAILURE
2074120976[1003282d0]: nsNSSComponent: XPCom shutdown observed
2074120976[1003282d0]: nsNSSComponent::dtor
2074120976[1003282d0]: nsNSSComponent::ShutdownNSS
2074120976[1003282d0]: nsNSSComponent::dtor finished
(In reply to Myk Melez [:myk] [@mykmelez] from comment #11)

> For me it prints "Error: cannot read contents of null pointer".

That implies that there's no PR error being set. What about if you change

  PR_ErrorToName(PR_GetError()).readString()

to

  PR_GetError()

?
Glad to see I'm not the only one.

I ran that code snippet (with the above change), it prints: "Error: -B127".
(In reply to Theodore Kokkoris [:tgkokk] from comment #14)
> Glad to see I'm not the only one.
> 
> I ran that code snippet (with the above change), it prints: "Error: -B127".

Assuming you mean "-8127", not "-B127", Brian's comment 9 seems to be a typically accurate prediction:

  SEC_ERROR_NO_TOKEN 

	-8127 	The security card or token does not exist, needs to be initialized, or has been removed.


Do you have a smartcard reader or similar solution that might be interfering?


I'm flinging this bug over to PSM, because there's little evidence that it's something to do with Sync; Brian, please bounce it if there's a better component.
Component: Sync → Security: PSM
Product: Firefox → Core
(In reply to Richard Newman [:rnewman] from comment #15)
>   SEC_ERROR_NO_TOKEN 
> 
> 	-8127 	The security card or token does not exist, needs to be initialized,
> or has been removed.

I see this too: "Error: -8127".


> Do you have a smartcard reader or similar solution that might be interfering?

No, I don't a smart card reader nor anything like it (like a USB HSM).
Yeah, I mean "-8127".

No, I don't have a smartcard reader, but:

* Macbooks have a Thunderbolt port that might be interfering. For reference, I have a Macbook Air Mid-2013 (6,2).
* I have used a USB dongle for Vodafone Mobile Broadband in the past (which has a SIM card reader), but even after I uninstalled the application the error is the same (after all, it wouldn't make sense, since it works with a kernel extension).
* I do have a wireless mouse but even after removing it and booting in safe mode the problem still persists.

Could someone else with OS X >= 10.9.3 (I'm on 10.9.4, although I think this is only for Macbook Air) confirm this issue?
(In reply to Richard Newman [:rnewman] from comment #15)
> (In reply to Theodore Kokkoris [:tgkokk] from comment #14)
> > Glad to see I'm not the only one.
> > 
> > I ran that code snippet (with the above change), it prints: "Error: -B127".
> 
> Assuming you mean "-8127", not "-B127", Brian's comment 9 seems to be a
> typically accurate prediction:
> 
>   SEC_ERROR_NO_TOKEN 
> 
> 	-8127 	The security card or token does not exist, needs to be initialized,
> or has been removed.

This may mean that WeaveCrypto is using NSS without initializing PSM. Before using any NSS function, you must ensure that do_GetService("something-something-psm;1") has been called and that that call succeeds.

It may also mean that the user has a master password and you are trying to use NSS before the master password has been given by the user. <Hands waving, as I've already forgotten how the master password prompting and whatnot works.>
FWIW, I can't reproduce on my MBP (15", 2013), at least to the extent that:

let nss = Weave.Svc.Crypto.nss;
nss.PK11_GenerateRandom(Weave.Svc.Crypto._getRandomByteBuffer(32), 32)

returns 0.

Note that on my machine, at least, there is no libnss3.dylib, so the ctypes bits don't work...
(I'm on 10.9.4, as far as mac OS version numbers might be concerned...)
(In reply to Brian Smith (:briansmith, was :bsmith; NEEDINFO? for response) from comment #18)
> This may mean that WeaveCrypto is using NSS without initializing PSM. Before
> using any NSS function, you must ensure that
> do_GetService("something-something-psm;1") has been called and that that
> call succeeds.

The JavaScript equivalent of do_GetService("something-something-psm;1") succeeds when I evaluate it in Browser Console, but PK11_GenerateRandom(…) still fails afterward:

  > Components.classes["@mozilla.org/psm;1"].getService()
  < XPCWrappedNative_NoHelper { QueryInterface: QueryInterface() }
  > Components.classes["@mozilla.org/psm;1"].getService() && Weave.Svc.Crypto.nss.PK11_GenerateRandom(Weave.Svc.Crypto._getRandomByteBuffer(32), 32)
  < -1


> It may also mean that the user has a master password and you are trying to
> use NSS before the master password has been given by the user. <Hands
> waving, as I've already forgotten how the master password prompting and
> whatnot works.>

I don't have a master password, among other reasons because the new sync implementation doesn't sync passwords if you have a master password <https://support.mozilla.org/en-US/questions/998080>.
(In reply to Brian Smith (:briansmith, was :bsmith; NEEDINFO? for response) from comment #18)

> This may mean that WeaveCrypto is using NSS without initializing PSM. Before
> using any NSS function, you must ensure that
> do_GetService("something-something-psm;1") has been called and that that
> call succeeds.

We have this code block immediately before the ctypes stuff is done:

    initNSS : function() {
        // We use NSS for the crypto ops, which needs to be initialized before
        // use. By convention, PSM is required to be the module that
        // initializes NSS. So, make sure PSM is initialized in order to
        // implicitly initialize NSS.
        Cc["@mozilla.org/psm;1"].getService(Ci.nsISupports);


Is that enough?

 
> It may also mean that the user has a master password and you are trying to
> use NSS before the master password has been given by the user. <Hands
> waving, as I've already forgotten how the master password prompting and
> whatnot works.>

I'm pretty sure it's not that -- we have a separate, much later, code path that makes sure MP is unlocked by attempting to retrieve a credential.
Yes, it looks like you are doing things correctly. I am not sure what the issue is, then.
As for the smart card reader issue, most Macbooks (including mine) have an SDXC card reader on the right hand side. I don't know whether that's is related to this bug.
(In reply to Theodore Kokkoris [:tgkokk] from comment #24)
> As for the smart card reader issue, most Macbooks (including mine) have an
> SDXC card reader on the right hand side. I don't know whether that's is
> related to this bug.

It shouldn't be -- that's been standard issue for several years.

Me:     MBP Mid 2012, 10.8.5, works
Gijs:   MBP Mid 2013, 10.9.4, works
Myk:    MBP ???,      10.9.3, fails
tgkokk: MBA Mid 2013, 10.9.4, fails

Is that an accurate table, folks?
(In reply to Richard Newman [:rnewman] from comment #25)

> Me:     MBP Mid 2012, 10.8.5, works
> Gijs:   MBP Mid 2013, 10.9.4, works
> Myk:    MBP ???,      10.9.3, fails
> tgkokk: MBA Mid 2013, 10.9.4, fails
> 
> Is that an accurate table, folks?

Mine, like yours, is an MBP Mid 2012, making the table:

Richard: MBP Mid 2012, 10.8.5, works
Gijs:    MBP Mid 2013, 10.9.4, works
Myk:     MBP Mid 2012, 10.9.3, fails
tgkokk:  MBA Mid 2013, 10.9.4, fails

So it looks like the problem is unrelated to the model (pro vs. air), model version (Mid 2012 vs. Mid 2013), and OS version (10.8.5 vs. 10.9.3 vs. 10.9.4).
Just to note, in Bug 1046450 this error showed up in FF32 for a user on the beta channel, rather than nightly.
Adding a tracking note here after IRL discussion: Chris, can you please try the Browser-Console snippets above and see if this triggers anything on your Nightly that was having issues in Bug 1046450 Comment 7?
Flags: needinfo?(ckarlof)
(Your traceback errors out in nsKeyObjectFactory::KeyFromString, which does call some related nss.PK11_* functions, so it may be the same underlying bug)
I chatted with David Keeler about our options here, and he didn't see anything obvious.  Ideally we will find a machine this reproduces on, and single-step through the process to find out what goes wrong.  So I'm hopeful Chris's machine will reproduce, but otherwise, we may need to find someone who can make some time with Myk to work through it.
|brew uninstall nss| and restart Fx seems to fix the problem for me.
(In reply to Hector Zhao [:hectorz] from comment #33)
> |brew uninstall nss| and restart Fx seems to fix the problem for me.

Oh, that's very interesting... Myk, can you check if you have an external NSS library available using brew or macports or such?
Flags: needinfo?(myk)
(In reply to Hector Zhao [:hectorz] from comment #33)
> |brew uninstall nss| and restart Fx seems to fix the problem for me.

Thanks a lot, this fixes the problem for me too.
`brew uninstall nss` fixes the problem for me as well.
(while I'm happy that this problem is being resolved, I think we should probably still figure out why the brew install of NSS is interfering with the Firefox copy of NSS...)
Flags: needinfo?(ckarlof)
(In reply to :Gijs Kruitbosch (intermittently here 14-15 August; then away until 19th) from comment #37)
> (while I'm happy that this problem is being resolved, I think we should
> probably still figure out why the brew install of NSS is interfering with
> the Firefox copy of NSS...)

in |WeaveCrypto.initNSS|:

// Open the NSS library.
let path = ctypes.libraryName("nss3");

// XXX really want to be able to pass specific dlopen flags here.
var nsslib;
try {
    this.log("Trying NSS library without path");
    nsslib = ctypes.open(path);
} catch(e) {
    // In case opening the library without a full path fails,
    // try again with a full path.
    ......
}

In my test, calling |ctypes.open(ctypes.libraryName("nss3"))| instead of the full path will get the libnss3.dylib installed by brew on OSX, but the libnss3.so in unzipped nightly directory on Linux. brew and Linux distributes nss version 3.16.2 Basic ECC, while Nightly includes 3.17 Basic ECC Beta, the version can be seen by running:

let NSS_GetVersion = nss.declare("NSS_GetVersion", ctypes.default_abi, ctypes.char.ptr);
console.log(NSS_GetVersion().readString());

So might be a js-ctypes bug?
(In reply to :Gijs Kruitbosch (intermittently here 14-15 August; then away until 19th) from comment #34)
> Oh, that's very interesting... Myk, can you check if you have an external
> NSS library available using brew or macports or such?

Yes, I had one installed, and uninstalling it works around the problem.


(In reply to :Gijs Kruitbosch (intermittently here 14-15 August; then away until 19th) from comment #37)
> (while I'm happy that this problem is being resolved, I think we should
> probably still figure out why the brew install of NSS is interfering with
> the Firefox copy of NSS...)

Indeed.  I can't remember why I had it installed, but presumably there's a reason, as there is for others, and having it installed shouldn't break sync.
Flags: needinfo?(myk)
Hi, I don't have nss installed via brew at all and I suffer from the same problem, on MBP 10.9.5.

I tried deleting my profile and safe mode, but the issue persists.

Running the snippet above returns "ReferenceError: Weave is not defined".
Attached patch Patch (obsolete) — Splinter Review
IIUC, "Trying NSS library without path" was introduced in Bug 583209 for Fx built with "--with-system-nss" (mostly by Linux distributions). Not really sure about Android, but this patch at least works for me on OS X.

Try: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=1ce0f6d73a43
Attachment #8504078 - Flags: feedback?(rnewman)
Comment on attachment 8504078 [details] [diff] [review]
Patch

Review of attachment 8504078 [details] [diff] [review]:
-----------------------------------------------------------------

I'm happy with this logic and with the patch.

There's no need to worry about Android; it doesn't use WeaveCrypto.

My only concerns are:

* What does this do to IceWeasel et al? Bug 583209 is full of alarming commentary.
* Is this going to silently break some distribution channel we haven't thought of?

Flagging dolske and Brian for a second and third opinion. Please redirect as appropriate.
Attachment #8504078 - Flags: superreview?(dolske)
Attachment #8504078 - Flags: review+
Attachment #8504078 - Flags: feedback?(rnewman)
Attachment #8504078 - Flags: feedback?(brian)
(In reply to Theodore Kokkoris [:tgkokk] from comment #24)
> As for the smart card reader issue, most Macbooks (including mine) have an
> SDXC card reader on the right hand side. I don't know whether that's is
> related to this bug.

That's just for "SD Card" flash storage, which is a completely different technology from smartcards. Unrelated to NSS/PSM and the bugs here.
Comment on attachment 8504078 [details] [diff] [review]
Patch

Review of attachment 8504078 [details] [diff] [review]:
-----------------------------------------------------------------

I think this change is okish, but I'm a little confused as to what it's fixing.

Sounds like the bug here is that on systems with brew-installed NSS, we were preferentially picking up the brew-NSS -- which then fails to work because it wasn't initialized. So I'm not really sure how this patch is helping. Shouldn't it still pick up the brew-NSS and fail the same way? Or fail a little more explosively by failing to initialize |nsslib| at all? (I'm not sure if we were picking up the wrong NSS in the try or the catch, which is the difference between the two.)

FWIW, the "XXX" above was about wanting to use the dlopen flag that lets us restrict the library to the one already mapped in the process, which helps avoid the whole "open the wrong library and break when trying to use it." That would be the ideal fix, but probably out of scope here.
We were picking up the system NSS, which (a) wasn't initialized, and (b) isn't the one we expect to be using. E.g., we hit this once before in 2010/2011 when the system NSS didn't have the right stuff for us to do J-PAKE.

This patch essentially asserts that we never want to use the system NSS when running a Firefox that was built to use its own NSS, and we always want to use the system NSS when we were built to use the system one.

We used a try-catch for that rather than preprocessing for -- as far as I can tell -- no good reason (do you remember why?).

The ordering of the clauses will always be wrong for someone, and I think it's only worked for this long because most users don't have a system NSS!
(In reply to Richard Newman [:rnewman] from comment #46)

> This patch essentially asserts that we never want to use the system NSS when
> running a Firefox that was built to use its own NSS, and we always want to
> use the system NSS when we were built to use the system one.

Right, that part I'm fine with. I just didn't understand what path the current code was taking and if this fixes something or is just a cleanup. Unless there's a case when the try was succeeding but we wanted to catch to work?

Wait, is MOZ_NATIVE_NSS what gets enabled with --with-system-nss? It is. Ugh, NATIVE is ambiguous. Why does our configure goop turn --with-system-foo into MOZ_NATIVE_FOO? Sigh. Ok.
Attachment #8504078 - Flags: superreview?(dolske) → superreview+
Summary: WeaveCrypto unable to initialize on Mac 10.9 → WeaveCrypto unable to initialize on Mac with Brew's system NSS installed
I'd just like to reiterate that this is happening on my system where I have NOT brew nss installed at all. The title seemed more accurate before.
Comment on attachment 8504078 [details] [diff] [review]
Patch

Review of attachment 8504078 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/crypto/modules/WeaveCrypto.js
@@ +131,5 @@
>          let path = ctypes.libraryName("nss3");
>  
>          // XXX really want to be able to pass specific dlopen flags here.
>          var nsslib;
> +#ifdef MOZ_NATIVE_NSS

I suggest that you add a comment explaining what is going on here, like "Search LD_LIBRARY_PATH for NSS when using system NSS."

HOWEVER, don't we generally discourage these kinds of path-less shared library loads? It isn't clear to me that this is safe and always going to load the correct version of the NSS libraries (the same ones already loaded into the Gecko process).

IIRC, NSPR has a function that returns a handle to the shared library that would be returned during normal dynamic linking for a given symbol such as "NSS_Init" or "NSS_Initialize", called PR_FindSymbolAndLibrary. I am not sure exactly how it works, but it seems like the best solution is to have jsctypes use *that* function, and handle this automatically for you:

/*
** Finds a symbol in one of the currently loaded libraries. Given the
** name of a procedure, return the address of the function that
** implements the procedure, and return the library that contains that
** symbol, or NULL if no such function can be found. This does not find
** symbols in the main program (the ".exe"); use PR_AddStaticLibrary to
** register symbols in the main program.  
**
** This increments the reference count of the library.
*/
NSPR_API(void*) PR_FindSymbolAndLibrary(const char *name,
						      PRLibrary* *lib);

/*
** Similar to PR_FindSymbolAndLibrary, except that the return value is
** a pointer to a function, and not a pointer to void. Casting between a
** data pointer and a function pointer is not portable according to the C
** standard. Any function pointer can be cast to any other function pointer.
**
** This increments the reference count of the library.
*/
NSPR_API(PRFuncPtr) PR_FindFunctionSymbolAndLibrary(const char *name,
						      PRLibrary* *lib);

@@ +135,5 @@
> +#ifdef MOZ_NATIVE_NSS
> +        this.log("Trying NSS library without path");
> +        nsslib = ctypes.open(path);
> +#else
> +        let file = Services.dirsvc.get("GreBinD", Ci.nsILocalFile);

Again, this seems error-prone.

How do you know we really loaded NSS/NSPR out of GreBinD? That is the normal place they live, but it seems like a stretch to assume that things are always normal. Assuming the NSPR mechanisms work, they seem superior.
Attachment #8504078 - Flags: feedback?(brian)
(In reply to Vittorio Giovara from comment #41)
> Hi, I don't have nss installed via brew at all and I suffer from the same
> problem, on MBP 10.9.5.
> 
> I tried deleting my profile and safe mode, but the issue persists.
> 
> Running the snippet above returns "ReferenceError: Weave is not defined".

Where did you run this snippet? It sounds like you used the web console instead of the browser console (or a web-privileged scratchpad instead of a browser-privileged scratchpad).
Flags: needinfo?(vitto.giova)
(In reply to :Gijs Kruitbosch from comment #50)
> (In reply to Vittorio Giovara from comment #41)
> > Hi, I don't have nss installed via brew at all and I suffer from the same
> > problem, on MBP 10.9.5.
> > 
> > I tried deleting my profile and safe mode, but the issue persists.
> > 
> > Running the snippet above returns "ReferenceError: Weave is not defined".
> 
> Where did you run this snippet? It sounds like you used the web console
> instead of the browser console (or a web-privileged scratchpad instead of a
> browser-privileged scratchpad).

Umh, I probably ran it in the WebConsole, as I did not have input enabled in the BrowserConsole. I read the docs, enabled it and re-run the snippet and it correctly returns 0. Now I am not sure what the right bug is for me though :/
Flags: needinfo?(vitto.giova)
(In reply to Vittorio Giovara from comment #51)
> Umh, I probably ran it in the WebConsole, as I did not have input enabled in
> the BrowserConsole. I read the docs, enabled it and re-run the snippet and
> it correctly returns 0. Now I am not sure what the right bug is for me
> though :/

Please file a new bug. :-)
(In reply to Justin Dolske [:Dolske] from comment #45)
>
> ... more explosively by failing to initialize |nsslib| at all? (I'm not
> sure if we were picking up the wrong NSS in the try or the catch, which is
> the difference between the two.)
In the try, ctypes.open/PR_LoadLibraryWithFlags.
Only the search order on Windows for path-less loads is documented at MDN.
> 
> FWIW, the "XXX" above was about wanting to use the dlopen flag that lets us
> restrict the library to the one already mapped in the process, which helps
> avoid the whole "open the wrong library and break when trying to use it."
> That would be the ideal fix, but probably out of scope here.
Bug 593484 ?

(In reply to Justin Dolske [:Dolske] from comment #47)
> Unless there's a case when the try was succeeding but we wanted to catch to
> work?
See above.

(In reply to Brian Smith (:briansmith, :bsmith, use NEEDINFO?) from comment #49)
> 
> ::: services/crypto/modules/WeaveCrypto.js
> @@ +131,5 @@
> >          let path = ctypes.libraryName("nss3");
> >  
> >          // XXX really want to be able to pass specific dlopen flags here.
> >          var nsslib;
> > +#ifdef MOZ_NATIVE_NSS
> 
> I suggest that you add a comment explaining what is going on here, like
> "Search LD_LIBRARY_PATH for NSS when using system NSS."
> 
> HOWEVER, don't we generally discourage these kinds of path-less shared
> library loads? It isn't clear to me that this is safe and always going to
> load the correct version of the NSS libraries (the same ones already loaded
> into the Gecko process).
Is path-less loads discouraged in general or just for those libraries packaged with Fx?
> 
> IIRC, NSPR has a function that returns a handle to the shared library that
> would be returned during normal dynamic linking for a given symbol such as
> "NSS_Init" or "NSS_Initialize", called PR_FindSymbolAndLibrary. I am not
> sure exactly how it works, but it seems like the best solution is to have
> jsctypes use *that* function, and handle this automatically for you:
I think using js-ctypes to access NSPR may have similar problems, and the optional folding of NSPR into NSS requires more manipulation. Maybe nsINSSVersion or OS.Constants.Path could be enhanced to expose these paths?
> 
> @@ +135,5 @@
> > +#ifdef MOZ_NATIVE_NSS
> > +        this.log("Trying NSS library without path");
> > +        nsslib = ctypes.open(path);
> > +#else
> > +        let file = Services.dirsvc.get("GreBinD", Ci.nsILocalFile);
> 
> Again, this seems error-prone.
> 
> How do you know we really loaded NSS/NSPR out of GreBinD? That is the normal
> place they live, but it seems like a stretch to assume that things are
> always normal. Assuming the NSPR mechanisms work, they seem superior.
No, I don't really understand how Fx loads NSPR/NSS.
(In reply to Hector Zhao [:hectorz] from comment #53)
> > FWIW, the "XXX" above was about wanting to use the dlopen flag that lets us
> > restrict the library to the one already mapped in the process, which helps
> > avoid the whole "open the wrong library and break when trying to use it."
> > That would be the ideal fix, but probably out of scope here.
> Bug 593484 ?

My suggestion in my previous comment is basically to fix bug 593484.

> Is path-less loads discouraged in general or just for those libraries
> packaged with Fx?

I would talk to dveditz's team about what they consider safe and unsafe in this respect, on which platforms.

> I think using js-ctypes to access NSPR may have similar problems, and the
> optional folding of NSPR into NSS requires more manipulation. Maybe
> nsINSSVersion or OS.Constants.Path could be enhanced to expose these paths?

Right, it doesn't make sense to use the same questionable method to find NSPR so that you can use the NSPR functions to correctly load stuff from NSS. Again, my suggestion was/is to fix bug 593484 and then use that mechanism.
(In reply to Brian Smith (:briansmith, :bsmith, use NEEDINFO?) from comment #54)
> (In reply to Hector Zhao [:hectorz] from comment #53)
> > > FWIW, the "XXX" above was about wanting to use the dlopen flag that lets us
> > > restrict the library to the one already mapped in the process, which helps
> > > avoid the whole "open the wrong library and break when trying to use it."
> > > That would be the ideal fix, but probably out of scope here.
> > Bug 593484 ?
> 
> My suggestion in my previous comment is basically to fix bug 593484.

We should, but I think the patch here is an incremental improvement from what we have today.
Patch updated with a comment "Search platform-dependent library paths for system NSS", as LD_LIBRARY_PATH is Linux-only.

Carrying-over rnewman's r+ and dolske's sr+.
Attachment #8504078 - Attachment is obsolete: true
Attachment #8505442 - Flags: superreview+
Attachment #8505442 - Flags: review+
Request checkin per comment 55.
Keywords: checkin-needed
Please provide a Try link when requesting checkin.
Assignee: nobody → bzhao
Keywords: checkin-needed
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #58)
> Please provide a Try link when requesting checkin.

In comment 42: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=1ce0f6d73a43

Is it enough or should I run another one with additional tests?
(In reply to Hector Zhao [:hectorz] from comment #59)
> (In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #58)
> > Please provide a Try link when requesting checkin.
> 
> In comment 42:
> https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=1ce0f6d73a43
> 
> Is it enough or should I run another one with additional tests?
Flags: needinfo?(sheriffs)
(In reply to Hector Zhao [:hectorz] from comment #60)
> (In reply to Hector Zhao [:hectorz] from comment #59)
> > (In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #58)
> > > Please provide a Try link when requesting checkin.
> > 
> > In comment 42:
> > https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=1ce0f6d73a43
> > 
> > Is it enough or should I run another one with additional tests?

oh this should be ok, thanks!
Flags: needinfo?(sheriffs)
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/6c5707f6f64b
Status: NEW → RESOLVED
Closed: 10 years ago10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: