Closed Bug 201807 Opened 22 years ago Closed 21 years ago

DBM: Security databases get corrupt in gcc OS/2 build

Categories

(NSS :: Libraries, defect, P1)

x86
OS/2
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: julien.pierre, Assigned: wtc)

Attachments

(1 file)

I installed the gcc build of Mozilla 1.4alpha for OS/2 that was made available by Mike Kaply at ftp://ftp.mozilla.org/pub/mozilla/releases/mozilla1.4a/mozilla-os2-gcc-1.4a.zip . It corrupted my cert8.db and key3.db to the point that even if I went back to the VACPP build, security would no longer initialize. I believe there is a problem with DBM built under gcc for OS/2 that causes this database corruption. Fortunately I was able to restore all my certs & key from a P12 file. Other people may not be so fortunate. In the meantime, I recommend that this build be pulled because this is a very serious problem.
Priority: -- → P1
Target Milestone: --- → 3.8.1
This is a test build and we specifically tell people not to share profiles between VACPP and GCC. I'm not going to pull the build. We would sure appreciate help figuring out what is going on though.
Define "corrupt". Do you have utilities that dump the DB files? Is it possible it is something as simple as incorrect packing of structures?
Michael, I know you told people not to share profiles with the VACPP build. However, you also said that it would be OK to share if one deleted secmod.db. The reason is that the compilers use different calling conventions, so the PKCS#11 dynamically loadable root cert module, nssckbi.dll, will be mismatched and will either fail to load or crash. I did delete my secmod.db before loading the new build, so that a new secmod.db would be built with the correct root cert module.
The issue here is that my cert8.db and/or key3.db got corrupted. I don't know exactly to what extent, so I can't define it in detail. However, the cert8.db and key3.db files are supposed to have a cross-platform structure. I constantly move those files between win32, os/2, and solaris with no problem. The fact that the OS/2 gcc build won't work is a serious problem. It means that the command-line NSS tools built with gcc will also be incompatible with security DBs made on other platforms.
IMO, if you aren't going to pull the gcc build because of this problem, you should at least change the instructions saying to delete secmod.db and share the VACPP profile, and add a bold warning that a full backup of the VACPP profile should be done before deleting secmod.db and sharing with the gcc build, due to this corruption.
Now, here is as much as I know about the corruption:
- I started with a working moz 1.3rtm build and cert8.db and key3.db full of certs and keys
- I deleted secmod.db in my mozilla profile
- I installed the gcc build
- I added the vacpp profile to my gcc build
- security would not initialize with that profile. It did initialize with a new profile
- I went back to the vacpp build. It also could not initialize security with the profile in question.
The reason is that the databases got corrupted, as evidenced by the fact that deleting them fixed the problem. There is no question in my mind that the database problem exists in the gcc build.
I recall seeing it months ago when I first tried to build NSPR & NSS with gcc, before IBM started their efforts. I didn't have time to look into it, though. FYI, I saw other problems unrelated to NSS in the gcc build, but database related: I could not read my local POP mail databases in the gcc build. Fortunately these didn't get corrupted, and I was able to read them back when reverting to the VACPP build.
OK, so the question is if you just use the GCC build and don't touch VACPP at all, does security work? This issue probably is not gcc "corrupting" the databases - it's gcc writing them in a different format.
Mike, I haven't performed very extensive tests. With a new profile, the gcc build was able to connect to SSL sites, but I was prompted for every certificate. See bug 201808 . This means the PKI infrastructure isn't in place, and that's a rather serious problem too, which is why I marked it P1 also.
The GCC build is not complete if it can't share cert8/key3.db's. NSS carefully constructs the data it stores in these databases so that they are cross platform. This works on *ALL* other operating systems and compilers. You can copy your Mac cert8/key3 .db's to your solaris or windows boxes without a problem. If this isn't working, the two areas to look at are 1) problems in constructing the underlying database format (that is, problems in libdbm), or 2) problems in constructing the NSS specific data structures within the Berkeley DB. These are completely contained in nss/lib/softoken. Given Julien's description, I would guess the problem lies in libdbm itself (libdbm is in mozilla/dbm). bob
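As a rough illustration of what "cross platform" data construction can look like in general, the sketch below stores a 32-bit value one byte at a time in a fixed order instead of writing the in-memory integer directly. The helper names are hypothetical and this is not the actual NSS or libdbm encoding; it only shows why a file written this way reads back the same regardless of compiler or CPU byte order.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not NSS or dbm functions. */

/* Store a 32-bit value as four bytes, most significant byte first. */
void put_u32_be(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Rebuild the value from the same fixed byte order. */
uint32_t get_u32_be(const unsigned char *in)
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

Because neither function depends on struct layout or native endianness, a mismatch between two builds would have to come from somewhere else, such as how the file itself is opened or written.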
dbm has a testcase (lots) that we just ran and it fails immediately on the GCC build. We are investigating.
Any help anyone could provide with what could go wrong with DBM would be much appreciated. We checked enum size and structure alignment (packing) and they were both OK.
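One way to make a check like the enum size and structure packing comparison repeatable is to print the sizes and field offsets from each compiler and diff the output. The sketch below is only an assumption about how such a check could be done; `sample_kind` and `sample_hdr` are made-up types, not the real dbm structures.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical types standing in for whatever the library really uses. */
typedef enum { SAMPLE_A, SAMPLE_B, SAMPLE_C } sample_kind;

typedef struct {
    char           tag;
    unsigned int   pgno;
    unsigned short len;
} sample_hdr;

/* Format the sizes and offsets into buf; returns what snprintf returns.
 * Build this with each compiler and compare the resulting strings. */
int layout_report(char *buf, size_t buflen)
{
    return snprintf(buf, buflen,
        "sizeof(sample_kind)=%lu sizeof(sample_hdr)=%lu "
        "offsetof(pgno)=%lu offsetof(len)=%lu",
        (unsigned long)sizeof(sample_kind),
        (unsigned long)sizeof(sample_hdr),
        (unsigned long)offsetof(sample_hdr, pgno),
        (unsigned long)offsetof(sample_hdr, len));
}
```

If the two builds print identical strings, packing and enum size can be ruled out for those types, which matches the conclusion reported above.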
Mike, I would start by looking at the build logs. There were tons of serious warnings building DBM on my OS/2 system with gcc. The C language lets you get away with a lot of bad things, unfortunately. Resolving these problems might help find the source of the corruption.
Attached patch: patch
Found it. We are only opening the database in binary mode for VACPP. Need to do the same for GCC. With this change, the lots.exe test keeps on going (does it ever end?).
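For readers unfamiliar with the distinction: C runtimes on OS/2, DOS, and Windows typically open files in text mode unless told otherwise, and text mode may translate newline bytes and may treat a 0x1A byte as end of file, either of which would damage binary database pages. The sketch below is not the attached patch; `open_db_binary` and `roundtrip_is_exact` are hypothetical helpers that show the general idea of always requesting binary mode, with `O_BINARY` defined as a no-op on systems that lack it.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* O_BINARY is provided by OS/2 and Windows toolchains. POSIX systems make
 * no text/binary distinction, so treat it as a no-op there. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Hypothetical helper, not the dbm API: open a database file read/write,
 * creating it if needed, always in binary mode. */
int open_db_binary(const char *path)
{
    return open(path, O_RDWR | O_CREAT | O_BINARY, 0600);
}

/* Write len bytes to a fresh file, reopen it, and report whether the
 * bytes read back are identical. Returns 1 on an exact match, else 0. */
int roundtrip_is_exact(const char *path, const unsigned char *data, size_t len)
{
    unsigned char buf[256];
    ssize_t n;
    int fd;

    if (len > sizeof buf)
        return 0;

    unlink(path);                 /* start from an empty file */
    fd = open_db_binary(path);
    if (fd < 0)
        return 0;
    n = write(fd, data, len);
    close(fd);
    if (n != (ssize_t)len)
        return 0;

    fd = open_db_binary(path);
    if (fd < 0)
        return 0;
    n = read(fd, buf, sizeof buf);
    close(fd);

    return n == (ssize_t)len && memcmp(buf, data, len) == 0;
}
```

A page containing bytes such as 0x0D, 0x0A, and 0x1A is a useful input for `roundtrip_is_exact`, since those are the values a text-mode open is most likely to alter.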
Great! Glad you found it. I'm going to update my tree with this patch and see if I can get past the other problems.
Comment on attachment 120627 (patch): r=wtc. Javier, could you review the other OS2 ifdefs (especially the XP_OS2_VACPP ifdefs) and see if they are okay?
Attachment #120627 - Flags: review+
Blocks: 202042
I looked over the other OS2 defines and nothing stood out. I will play around with all of those when we start removing VACPP stuff from the tree.
I would like VACPP to keep working for NSPR & NSS for the time being. If you find something that's required only for VACPP but not for gcc, please leave the ifdef XP_OS2_VACPP . If it is required for both, it should be changed to ifdef XP_OS2 .
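The guard policy described here could look roughly like the following. This is a sketch under assumptions, not code from the tree: `db_open_flags` is a hypothetical function, and the binary-mode flag is used as the example of something needed by both OS/2 compilers.

```c
#include <fcntl.h>

/* No-op fallback for systems without a text/binary distinction. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Hypothetical helper: adjust open() flags for the current platform. */
int db_open_flags(int flags)
{
#if defined(XP_OS2)
    /* Required by every OS/2 compiler (VACPP and gcc): guard with XP_OS2. */
    flags |= O_BINARY;
#endif

#if defined(XP_OS2_VACPP)
    /* Anything that only VACPP needs stays under XP_OS2_VACPP so the
     * VACPP build keeps working and the gcc build is unaffected. */
#endif

    return flags;
}
```

On a build where neither macro is defined, the function returns its argument unchanged.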
This patch improves things, but I'm not yet sure that it is the only thing that needs to be fixed for the DB on OS/2. I have still seen odd errors initializing security even after applying it. I am still trying to get a reproducible test case, though.
Fix checked in. Leaving open per Julien's comments. Although if lots runs, I'm convinced we're OK.
I don't know why, but I'm still getting occasional warnings about Mozilla not being able to initialize the security components even after applying the patch. I will examine the databases and try sharing them with other platforms. There is no question that the patch was an improvement but it may not be the full solution.
Somehow the cert8.db and secmod.db in my profile were reduced to a size of 0 bytes after I got the error message about the failure to initialize. I have no idea what caused that to happen. After I erased all 3 databases and copied key3.db and cert8.db from my Win32 installation, I was able to start mozilla / gcc and see my certs. So at least there is the expected level of DB portability. Let's see how long the DBs survive in the correct state now.
This bug has been fixed in Mozilla 1.4 Beta.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Summary: Security databases get corrupt in gcc OS/2 build → DBM: Security databases get corrupt in gcc OS/2 build