PR_GetLibraryFilePathname is not working correctly on AIX

RESOLVED FIXED in 4.6

Status

NSPR
NSPR
--
major
RESOLVED FIXED
13 years ago
13 years ago

People

(Reporter: Philip K. Warren, Assigned: Wan-Teh Chang)

Tracking

({fixed1.7.6})

other
Other
AIX
fixed1.7.6

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments)

(Reporter)

Description

13 years ago
The NSPR function PR_GetLibraryFilePathname is not currently working properly on
AIX, and is causing the FIPS mode to be unable to start. The problem is that the
loadquery call is not returning all of the shared libraries loaded by Mozilla,
so the libsoftokn3.so library is not detected as being loaded by
PR_GetLibraryFilePathname and the libsoftokn3.chk file is not being found.

I have a patch which fixes this issue. I will post it shortly.
(Reporter)

Comment 1

13 years ago
Created attachment 173312 [details] [diff] [review]
Patch v1

This uses the L_IGNOREUNLOAD flag to loadquery, which is documented in the
ldr.h header file as follows: "Also get any per-process modules marked as
unloaded but are still in use".

I also changed the test to look at the span of the data segment of the loaded
library instead of the filename which should more accurately find the correct
library.

After applying this patch, I am able to Enable/Disable FIPS mode with no
exceptions, and the libfilename NSPR test continues to pass on AIX.
Attachment #173312 - Flags: review?(wtchang)
(Assignee)

Comment 2

13 years ago
Comment on attachment 173312 [details] [diff] [review]
Patch v1

Philip, I'm glad that you tracked this down.  I'm
sorry that I forgot to respond to your email last
week.

>-        if (loadquery(L_GETINFO, info, info_length) != -1) {
>+        if (loadquery(L_GETINFO | L_IGNOREUNLOAD, info, info_length) != -1) {

Just curious: what does it mean to be marked as
unloaded but still in use?

>+        unsigned long start = (unsigned long)infop->ldinfo_dataorg;
>+        unsigned long end = start + infop->ldinfo_datasize;
>+        if (start <= (unsigned long)addr && end > (unsigned long)addr) {

Since the function pointer 'addr' points to code,
not data, it seems that it should be in the text
segment, not the data segment.
(Reporter)

Comment 3

13 years ago
(In reply to comment #2)
> >-        if (loadquery(L_GETINFO, info, info_length) != -1) {
> >+        if (loadquery(L_GETINFO | L_IGNOREUNLOAD, info, info_length) != -1) {
> Just curious: what does it mean to be marked as
> unloaded but still in use?

I was curious about this one as well. I spoke with the linker/loader AIX
developers about this. Apparently every time a library is dlclose'd, there is a
flag set by the loader on the shared library. This happens even if it has been
dlopen'd twice, but dlclose'd once. It appears as hidden without specifying the
L_IGNOREUNLOAD flag to loadquery. This behavior really doesn't make much sense
to me, but this is the correct way to fix this problem according to the main AIX
developers.

> >+        unsigned long start = (unsigned long)infop->ldinfo_dataorg;
> >+        unsigned long end = start + infop->ldinfo_datasize;
> >+        if (start <= (unsigned long)addr && end > (unsigned long)addr) {
> 
> Since the function pointer 'addr' points to code,
> not data, it seems that it should be in the text
> segment, not the data segment.

In my testing, it always fell within the data segment, but if you prefer you can
keep the old code which used strstr if you feel that is a safer change. As an
alternative, we could always check if it fell within the data or text range.
(Reporter)

Comment 4

13 years ago
(In reply to comment #2)
> Since the function pointer 'addr' points to code,
> not data, it seems that it should be in the text
> segment, not the data segment.

Got an official answer to this question from one of the linker/loader experts at
IBM:

"Function pointers are data structures that contain 2 words: the text address,
and the TOC value for the enclosing module. Your test is correct."
(Assignee)

Comment 5

13 years ago
Comment on attachment 173312 [details] [diff] [review]
Patch v1

r=wtc.

If I remember correctly, when I wrote that code,
I tried to implement a test on 'addr' falling in
the text range and couldn't get it to work.  I
never thought to try the data range instead.

Could you ask the expert whether it is safe to
cast a function pointer to unsigned long on AIX.
I think it is.

I'm still curious as to who loads libsoftokn3.so
and unloads it, but that should not block the
checkin of your patch.	Thank you for fixing this
bug.
Attachment #173312 - Flags: review?(wtchang) → review+
(Reporter)

Comment 6

13 years ago
(In reply to comment #5)
> Could you ask the expert whether it is safe to
> cast a function pointer to unsigned long on AIX.
> I think it is.

Yes I have confirmed that this is legal. Should I check this patch in or will you?
(Assignee)

Comment 7

13 years ago
I checked in the patch on the NSPR trunk (NSPR 4.6)
and NSPRPUB_PRE_4_2_CLIENT_BRANCH (Mozilla 1.8 Beta).
Status: NEW → RESOLVED
Last Resolved: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → 4.6
(Reporter)

Comment 8

13 years ago
Comment on attachment 173312 [details] [diff] [review]
Patch v1

This is an important fix for AIX which allows the FIPS mode to work. It is low
risk and should not affect any other platforms.
Attachment #173312 - Flags: approval1.7.6?

Comment 9

13 years ago
Comment on attachment 173312 [details] [diff] [review]
Patch v1

a=mkaply for 1.7.6
Attachment #173312 - Flags: approval1.7.6? → approval1.7.6+
(Reporter)

Comment 10

13 years ago
Checked in on the Mozilla 1.7 branch:

Checking in prlink.c;
/cvsroot/mozilla/nsprpub/pr/src/linking/prlink.c,v  <--  prlink.c
new revision: 3.51.2.21.2.1; previous revision: 3.51.2.21
done
Keywords: fixed1.7.6

Comment 11

13 years ago
This patch caused a regression in our nightly builds of NSS and NSPR tip on AIX
. The FIPS tests started failing after the check in.

Jason, please follow up from here .

Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Reporter)

Comment 12

13 years ago
(In reply to comment #11)
> This patch caused a regression in our nightly builds of NSS and NSPR tip on AIX
> . The FIPS tests started failing after the check in.

Can you be more specific on how it is failing? What level of AIX are you
reproducing this problem on? 32-bit or 64-bit build? Debug or optimized?

I ran the libfilename NSPR test before checking this in, and did not see any
failures. It also fixed the operation of enabling FIPS mode in the browser,
where it was not working before.

I will check out the latest version of NSS from CVS and run all of the tests again.
(Reporter)

Comment 13

13 years ago
I just checked out the latest NSPR, NSS from the Mozilla trunk and ran the
testsuite (all.sh from security/nss/tests) for both optimized and debug builds.
Looking at the results.html output, I see that both pass all of the tests,
including the FIPS ones. I will wait to hear from someone from Sun to get
specifics on the system where the FIPS tests are failing so I can try to
reproduce the problem on my end.

Comment 14

13 years ago
Philip,

It happened on a 64-bit AIX build, using the tip of both NSS and NSPR (no branch
tag).

From output.log:
cert.sh: Creating FIPS 140-1 DSA Certificates ==============
cert.sh: Initializing FIPS PUB 140-1 Test Certificate's Cert DB
--------------------------
certutil -N -d . -f ../tests.fipspw.483394
cert.sh: Enable FIPS mode on database -----------------------
modutil -dbdir . -fips true

WARNING: Performing this operation while the browser is running could cause
corruption of your security databases. If the browser is currently running,
you should exit browser before continuing this operation. Type
'q <enter>' to abort, or <enter> to continue:
Using database directory ....
An I/O error occurred during security authorization.
ERROR: Unable to switch FIPS modes.
cert.sh ERROR: Enable FIPS mode on database for FIPS PUB 140-1 Test Certificate
failed 11
cert.sh: Generate Certificate for FIPS PUB 140-1 Test Certificate
--------------------------
certutil -s "CN=FIPS PUB 140-1 Test Certificate, E=fips@bogus.com, O=BOGUS NSS,
OU=FIPS PUB 140-1, L=Mountain View, ST=California, C=US" -S -n
FIPS_PUB_140-1_Test_Certificate -x -t Cu,Cu,Cu -d . -f ../tests.fipspw.483394 -k
dsa -v 600 -m 500 -z ../tests_noise.483394
cert.sh SUCCESS: FIPS passed
cert.sh cert.sh: finished cert.sh
certutil -d ../fips -K -f ../tests.fipspw.483394
<0>
fips.sh: Attempt to list FIPS module keys with incorrect password
certutil -d ../fips -K -f
/share/builds/mccrel3/security/securitytip/builds/20050205.1/wozzeck_Solaris8/mozilla/tests_results/security/hbombaix.3/tests.fipsbadpw.483394
Incorrect password/PIN entered.
certutil: no keys found
certutil -K returned 255
(Reporter)

Comment 15

13 years ago
Julien - thanks for the information. I will investigate this fully tomorrow.
(Reporter)

Comment 16

13 years ago
Created attachment 173772 [details] [diff] [review]
Patch for 64-bit

On 64-bit, calling loadquery with the L_IGNOREUNLOAD flag causes an invalid
argument error. This is one way of fixing the problem, where we trap that
condition and remove the L_IGNOREUNLOAD flag from all subsequent calls to
loadquery. This leaves open the possibility of L_IGNOREUNLOAD support being
properly implemented in the future for 64-bit.

The alternative would just be the following:

#ifdef __64BIT__
  loadquery(L_GETINFO, ...)
#else
  loadquery(L_GETINFO | L_IGNOREUNLOAD, ...)
#endif
Attachment #173772 - Flags: review?(wtchang)
(Assignee)

Comment 17

13 years ago
Comment on attachment 173772 [details] [diff] [review]
Patch for 64-bit

r=wtc.

>+        /* Calling loadquery when compiled for 64-bit with the L_IGNOREUNLOAD
>+         * flag can cause an invalid argument error. Detect this error the
>+         * first time that loadquery is called, and try calling it again 
>+         * without this flag set. */

I'd like to note in the comment the version of 64-bit AIX
that we tested this patch on.
Attachment #173772 - Flags: review?(wtchang) → review+
(Assignee)

Comment 18

13 years ago
Philip, does this mean a 64-bit *Mozilla* won't be able
to enable FIPS mode?
(Reporter)

Comment 19

13 years ago
(In reply to comment #18)
> Philip, does this mean a 64-bit *Mozilla* won't be able
> to enable FIPS mode?

I will do a full 64-bit build of Mozilla to test this.
(Reporter)

Comment 20

13 years ago
I have tested a 64-bit build, and can confirm that FIPS mode will not currently
work on AIX as it is missing support for the L_IGNOREUNLOAD flag. I have
contacted the developers, and this will be fixed in AIX.

I will update this bug when I have more information on the fix and when it will
be available.
(Assignee)

Comment 21

13 years ago
I checked in the patch for 64-bit on the NSPR trunk
(NSPR 4.6) and NSPRPUB_PRE_4_2_CLIENT_BRANCH (Mozilla
1.8 Beta).
Status: REOPENED → RESOLVED
Last Resolved: 13 years ago13 years ago
Resolution: --- → FIXED
(Assignee)

Comment 22

13 years ago
Perhaps we should look harder to find out why
Mozilla loads and then unloads libsoftokn3.so.
Apparently NSS's FIPS tests don't do that;
otherwise they would fail on 64-bit AIX.
(Reporter)

Comment 23

13 years ago
Comment on attachment 173772 [details] [diff] [review]
Patch for 64-bit

This is an additional fix for 64-bit AIX that we would like to have on the
branch.
Attachment #173772 - Flags: approval1.7.6?

Comment 24

13 years ago
Comment on attachment 173772 [details] [diff] [review]
Patch for 64-bit

a=mkaply
Attachment #173772 - Flags: approval1.7.6? → approval1.7.6+
(Reporter)

Comment 25

13 years ago
64-bit workaround checked in on 1.7 branch:

Checking in prlink.c;
/cvsroot/mozilla/nsprpub/pr/src/linking/prlink.c,v  <--  prlink.c
new revision: 3.51.2.21.2.2; previous revision: 3.51.2.21.2.1
done
You need to log in before you can comment on or make changes to this bug.