Closed Bug 201807 Opened 21 years ago Closed 21 years ago

DBM: Security databases get corrupt in gcc OS/2 build

Categories

(NSS :: Libraries, defect, P1)

x86
OS/2
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: julien.pierre, Assigned: wtc)

Attachments

(1 file)

I installed the gcc build of Mozilla 1.4alpha for OS/2 that was made available
by Mike Kaply at
ftp://ftp.mozilla.org/pub/mozilla/releases/mozilla1.4a/mozilla-os2-gcc-1.4a.zip .

It corrupted my cert8.db and key3.db to the point that even if I went back to
the VACPP build, security would no longer initialize.

I believe there is a problem with DBM built under gcc for OS/2 that causes this
database corruption.

Fortunately I was able to restore all my certs & key from a P12 file. Other
people may not be so fortunate.

In the meantime, I recommend that this build be pulled because this is a very
serious problem.
Priority: -- → P1
Target Milestone: --- → 3.8.1
This is a test build and we specifically tell people not to share profiles
between VACPP and GCC.

I'm not going to pull the build.

We would sure appreciate help figuring out what is going on though.
define "corrupt"

Do you have utilities that dump the DB files?

Is it possible it is something as simple as incorrect packing of structures?
Michael,

I know you told people not to share profiles with the VACPP build. However, you
also said that it would be OK to share if one deleted secmod.db . The reason for
that being that the compilers use different calling conventions and therefore
the PKCS#11 dynamically loadable root cert module, nssckbi.dll, will be
mismatched and won't load or crash.

However, I deleted my secmod.db prior to loading the new build in order to have
a new secmod.db built with the correct root cert module.

The issue here is that my cert8.db and/or key3.db got corrupt.
I don't know exactly to what extent they got corrupt, so I can't define it in
detail.
However, the cert8.db and key3.db files are supposed to have a cross-platform
structure. I constantly move those files between win32, os/2, and solaris with no
problem. The fact that the OS/2 gcc build won't work is a serious problem. It
means that the command-line NSS tools built with gcc will also be incompatible
with security DBs made on other platforms.

IMO, if you aren't going to pull the gcc build because of this problem, you
should at least change the instructions saying to delete secmod.db and share the
VACPP profile, and add a bold warning that a full backup of the VACPP profile
should be done before deleting secmod.db and sharing with the gcc build, due to
this corruption.

Now, here is as much as I know about the corruption:
- I started with a working moz 1.3rtm build and cert8.db and key3.db full of
certs and keys

- I deleted secmod.db in my mozilla profile

- I installed the gcc build

- I added the vacpp profile to my gcc build

- security would not initialize with that profile. It did initialize with a new
profile

- I went back to the vacpp build. It also could not initialize security with the
profile in question. The reason is that the databases got corrupt, as evidenced
by the fact that deleting them fixed the problem.

There is no question in my mind the database problem exists in the gcc build. I
recall seeing it months ago when I first tried to build NSPR & NSS with gcc,
before IBM started their efforts. I didn't have time to look into it, though.

FYI, I saw other problems unrelated to NSS in the gcc build, but database
related : I could not read my local POP mail databases in the gcc build.
Fortunately these didn't get corrupt, and I was able to read them back when
reverting to the VACPP build.
OK, so the question is if you just use the GCC build and don't touch VACPP at
all, does security work?

This issue probably is not gcc "corrupting" the databases - it's gcc writing
them in a different format.
Mike,

I haven't performed very extensive tests. With a new profile, the gcc build was
able to connect to SSL sites, but I was prompted for every certificate. See bug
201808 . This means the PKI infrastructure isn't in place, and that's a rather
serious problem too, which is why I marked it P1 also.
The GCC build is not complete if it can't share cert8/key3.db's. NSS carefully
constructs the data it stores in these databases so that they are cross
platform. This works on *ALL* other operating systems and compilers. You can
copy your Mac cert8/key3 .db's to your Solaris or Windows boxes without a problem.
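
As a rough illustration of what "cross platform" means for stored data, the
sketch below writes a 32-bit value one byte at a time in a fixed (big-endian)
order instead of dumping the in-memory integer. The helper names put_u32_be and
get_u32_be are hypothetical and are not taken from NSS or DBM; this only shows
the general technique, not how softoken actually encodes its records.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store v in out[0..3], most significant byte first.
 * The result is the same on any CPU and with any compiler. */
static void put_u32_be(unsigned char *out, uint32_t v)
{
    assert(out != NULL);
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Hypothetical helper: read back a value written by put_u32_be. */
static uint32_t get_u32_be(const unsigned char *in)
{
    assert(in != NULL);
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

Data encoded this way does not depend on host byte order, enum size, or
structure padding, which is why a compiler change alone should not alter it.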

If this isn't working, the two areas to look at are 1) problems in constructing
the underlying database format (that is, problems in libdbm), or 2) problems in
constructing the NSS-specific data structures within the Berkeley DB. These are
completely contained in nss/lib/softoken.

Given Julien's description, I would guess the problem lies in libdbm itself
(libdbm is in mozilla/dbm).

bob
dbm has a testcase (lots) that we just ran and it fails immediately on the GCC
build.

We are investigating.
Any help anyone could provide with what could go wrong with DBM would be much
appreciated.

We checked enum size and structure alignment (packing) and they were both OK.
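
For anyone wanting to repeat that kind of check, here is a minimal sketch of
how enum size and structure layout can be compared between two compilers. The
rec_state enum and rec_header struct are made up for illustration and are not
DBM's real types; the idea is to build the same file with each compiler and
compare the reported numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical enum and record, for illustration only (not from DBM). */
typedef enum { REC_FREE, REC_USED } rec_state;

typedef struct {
    uint8_t  tag;
    uint32_t length;
    uint16_t flags;
} rec_header;

/* Values worth comparing between a VACPP build and a gcc build. */
typedef struct {
    size_t enum_size;
    size_t struct_size;
    size_t length_offset;
    size_t flags_offset;
} layout_report;

static layout_report describe_layout(void)
{
    layout_report r;
    r.enum_size     = sizeof(rec_state);           /* some compilers shrink enums */
    r.struct_size   = sizeof(rec_header);          /* includes any padding */
    r.length_offset = offsetof(rec_header, length);
    r.flags_offset  = offsetof(rec_header, flags);
    return r;
}
```

If any of the four numbers differ between the two builds, structures written
raw to disk by one build will not be read correctly by the other.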
Mike,

I would start by looking at the build logs. There were tons of serious warnings
building DBM on my OS/2 system with gcc. The C language lets you get away with a
lot of bad things, unfortunately. Resolving these problems might help find the
source of the corruption.
Attached patch: patch
Found it.  We are only opening the database in binary mode for VACPP.  Need to
do the same for GCC.  With this change, the lots.exe test keeps on going (does
it ever end?).
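
The attached patch is not reproduced in this report, so the following is only
a sketch of the general idea, assuming a POSIX-style open(). On OS/2 and
Windows C runtimes a file opened without O_BINARY is typically in text mode,
where newline bytes are translated and 0x1A can be treated as end of file,
which mangles binary database pages. The names open_db_file and
roundtrip_is_exact are hypothetical, and O_BINARY is defined as 0 where the
platform has no text mode.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* O_BINARY is provided by OS/2 and Windows toolchains. POSIX systems have
 * no text mode, so treat it as a no-op there. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Hypothetical wrapper (not DBM's real code): always request binary mode. */
static int open_db_file(const char *path, int flags, mode_t mode)
{
    assert(path != NULL);
    return open(path, flags | O_BINARY, mode);
}

/* Write bytes that text mode would alter, read them back, and report
 * whether they survived unchanged. Returns 1 on an exact match. */
static int roundtrip_is_exact(const char *path)
{
    static const unsigned char sample[] = { 0x00, 0x0D, 0x0A, 0x1A, 0x0A, 0xFF };
    unsigned char back[sizeof sample];
    ssize_t n;
    int fd, ok;

    fd = open_db_file(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
    if (fd < 0) return 0;
    n = write(fd, sample, sizeof sample);
    close(fd);
    if (n != (ssize_t)sizeof sample) return 0;

    fd = open_db_file(path, O_RDONLY, 0);
    if (fd < 0) return 0;
    n = read(fd, back, sizeof back);
    close(fd);

    ok = (n == (ssize_t)sizeof sample) &&
         memcmp(sample, back, sizeof sample) == 0;
    unlink(path);   /* remove the scratch file */
    return ok;
}
```

Running the same round trip with the O_BINARY flag removed on an affected
platform would be one way to confirm that text-mode translation is the cause.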
Great ! Glad you found it. I'm going to update my tree with this patch and see
if I can get past the other problems.
Comment on attachment 120627 [details] [diff] [review]
patch

r=wtc. Javier, could you review the other OS2 ifdefs
(especially the XP_OS2_VACPP ifdefs) and see if they
are okay?
Attachment #120627 - Flags: review+
Blocks: 202042
I looked over the other OS2 defines and nothing stood out.  I will play around
with all of those when we start removing VACPP stuff from the tree.
I would like VACPP to keep working for NSPR & NSS for the time being.
If you find something that's required only for VACPP but not for gcc, please
leave the ifdef XP_OS2_VACPP . If it is required for both, it should be changed
to ifdef XP_OS2 .
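
The convention described above can be sketched as follows. XP_OS2 and
XP_OS2_VACPP are the macros named in this bug; the two functions are
hypothetical and only model the OS/2 case, not what DBM does on other
platforms.

```c
#include <assert.h>
#include <fcntl.h>

/* Treat O_BINARY as a no-op where the platform does not define it. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Hypothetical: something every OS/2 compiler needs (VACPP and gcc alike)
 * is guarded by XP_OS2. */
static int db_platform_open_flags(void)
{
#if defined(XP_OS2)
    return O_BINARY;
#else
    return 0;
#endif
}

/* Hypothetical: something only the VACPP compiler needs keeps the narrower
 * XP_OS2_VACPP guard so the gcc build is unaffected. */
static int db_needs_vacpp_workaround(void)
{
#if defined(XP_OS2_VACPP)
    return 1;
#else
    return 0;
#endif
}
```

Keeping the two guards separate lets the VACPP-only code be removed later
without touching behaviour that the gcc build also depends on.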
This patch improves things, but I'm not sure that this is the only thing that
needs to be fixed for the DB on OS/2 yet. I have still seen odd errors
initializing security even after applying it. Still trying to get a reproducible
test case though.

Fix checked in.

Leaving open per Julien's comments.

Although if lots runs, I'm convinced we're OK.
I don't know why, but I'm still getting occasional warnings about Mozilla not
being able to initialize the security components even after applying the patch.
I will examine the databases and try sharing them with other platforms.
There is no question that the patch was an improvement but it may not be the
full solution.
Somehow the cert8.db and secmod.db in my profile were reduced to a size of 0
bytes after I got the error message about the failure to initialize. I have no
idea what caused that to happen.

After I erased all 3 databases and copied key3.db and cert8.db from my Win32
installation, I was able to start mozilla / gcc and see my certs. So at least
there is the expected level of DB portability. Let's see how long the DBs
survive in the correct state now.
This bug has been fixed in Mozilla 1.4 Beta.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Summary: Security databases get corrupt in gcc OS/2 build → DBM: Security databases get corrupt in gcc OS/2 build