Closed Bug 217538 Opened 21 years ago Closed 16 years ago

softoken databases cannot be shared between multiple processes

Categories

(NSS :: Libraries, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: julien.pierre, Assigned: rrelyea)

References

(Blocks 1 open bug)

Details

Attachments

(20 files, 1 obsolete file)

625.23 KB, patch
Details | Diff | Splinter Review
312.04 KB, patch
Details | Diff | Splinter Review
159.70 KB, patch
Details | Diff | Splinter Review
152.75 KB, patch
Details | Diff | Splinter Review
466.69 KB, patch
Details | Diff | Splinter Review
93.62 KB, patch
Details | Diff | Splinter Review
168.82 KB, patch
Details | Diff | Splinter Review
468.70 KB, patch
Details | Diff | Splinter Review
51.65 KB, patch
Details | Diff | Splinter Review
8.00 KB, patch
Details | Diff | Splinter Review
22.05 KB, patch
Details | Diff | Splinter Review
6.17 KB, text/plain
Details
13.07 KB, patch
Details | Diff | Splinter Review
502 bytes, patch
rrelyea
: review+
Details | Diff | Splinter Review
469 bytes, patch
glenbeasley
: review+
Details | Diff | Splinter Review
22.65 KB, application/octet-stream
Details
4.80 KB, patch
glenbeasley
: review+
Details | Diff | Splinter Review
15.03 KB, patch
KaiE
: review+
julien.pierre
: superreview+
Details | Diff | Splinter Review
2.88 KB, patch
KaiE
: review+
Details | Diff | Splinter Review
816 bytes, patch
rrelyea
: review+
Details | Diff | Splinter Review
The Berkeley DB database code we are using in NSS by default is only safe if one
process is writing to the cert and key databases at a time. If multiple
applications start up, and all initialize NSS read/write, the databases will
become corrupt over time.

Fixing this requires replacing Berkeley DB with a better database engine, which
would talk to a local database server.
    rpm uses Berkeley DB too and while only one rpm instance can write at a time
any number of other can read. With the latest rpm from Fedora Core 3 the second
copy requiring write access will also wait first one to finish before writing.
Do mozilla/firefox do this too?
The dbm code now being used in mozilla for the cert and key databases 
safely permits only the following combinations of multi-process access:

1. Multiple processes using a DB, ALL of them using it read-only,
   NONE of them using it read-write, or

2. ONE process using the DB read-write, and NO other processes using it
   read-write or read-only.  

Since mozilla/FF/TB always open it read-write, only one process can 
safely have it open at one time.
As a further note, because of this restriction mozilla/FF/and TH each have their
own copies of the NSS certificate/key/module databases. This, of course, is not
desirable.

bob
The Berkeley DB used by NSS (in the source directory mozilla/dbm)
is Berkeley DB 1.85 with our modifications.  This is most likely
older than the Berkeley DB used by rpm.
(In reply to comment #4)
> The Berkeley DB used by NSS (in the source directory mozilla/dbm)
> is Berkeley DB 1.85 with our modifications.  This is most likely
> older than the Berkeley DB used by rpm.

Historic, if I get these numbers right:
[root@fedora ~]# rpm -q db4
db4-4.2.52-6

http://www.sleepycat.com/: Berkeley DB 4.3 available now! Download

1.96 is the latest 1.x and is unavailable for download.

From http://www.sleepycat.com/update/2.2.0/if.2.2.html:
| In DB 2.2.0, the handles returned by the DB subsystems, e.g., the db_appinit()
| function's DB_ENV, and the db_open() function's DB, are free-threaded. If you
| specify the DB_THREAD flag to each subsystem, or globally using db_appinit(),
| you can then use the returned handle concurrently in any number of threads.
... but seems incompatible and is not available for download too...
Blocks: 230884
Blocks: 135061
QA Contact: bishakhabanerjee → jason.m.reid
Assignee: wtchang → nobody
QA Contact: jason.m.reid → libraries
I believe Bob is planning to work on this in the near future.
Bob, please readjust priority, severity, target as you see fit.
Assignee: nobody → rrelyea
Severity: normal → major
Priority: -- → P2
Patches for the initial landing of the shared database code are forthcoming.

These patches are large, and a number of things are still missing, which I'll document as separate bugs. The design is documented at http://wiki.mozilla.org/NSS_Shared_DB

bob.
Target Milestone: --- → 3.12
This is a full copy of the core patch to softoken.
These are the removed files from the core patch.
Changed files in softoken.
New files in sofotken.
This the full legacydb directory.
This is the diff between all the files that were originally in softoken and their new counterparts in legacydb.
These are editted patches to show the difference between code extracted from files in softoken and their new reflections and independent files in legacydb
This is the same as Collection 2a except the diffs were not editted.
These are just the new files for the legacydb.
This attachement explains the different changes in each of the patches to aid reviews.
Attachment #266693 - Attachment is patch: true
Attachment #266693 - Attachment mime type: application/octet-stream → text/plain
Attachment #266701 - Attachment mime type: application/octet-stream → text/plain
There is one more patch... to add sqlite3. The patch was too large for mozilla since it has all of sqlite3. I'll break the patch up and post it tomorrow.

bob
Regarding comment 20, and the monster patch for sqlite, 
I think we aren't really going to review all that code.  
Maybe it would suffice to list the new source files, and to 
have a patch that includes only the new Makefile and manifest.mn files
(assuming you have them).
thanks nelson, I was thinking of something similar, perhaps just headers from the sqlite files.

I pretty much thought the same thing about review it.

bob
This patch includes the build and makefile changes. It includes the headers of the two sqlite3 files. (sqlite3.c and sqlite3.h)
This is not by any means an exhaustive review. I looked mostly at the core patches to see how the legacy DB code works in the new regime. Here are a few questions:

1) sftk_getDBForObject() - maybe sftk_getDBForTokenObject to make the requirement explicit.

2) ftkdb_fixupTemplateIn() - make the ULONG->network byte order code a function;could be optimized on different plaforms?

3) sftkdbLoad_Legacy() - the call through function pointer setCryptFunction, shouldn't that be (*setCryptFunction)(sftkdb_encrypt_stub,sftkdb_decrypt_stub)?

Looks pretty good. It also looks like you've done an awful lot of work, Bob. Wouldn't it be more conservative to keep the legacy code built in and access the new DB through the shared library interface?
Thanks Neil,

I have no problem changing the name of 'sftk_getDBForObject()' (though Adding token is a bit redundant since session objects never have DB's.

2) May be a macro. You could do the same optimization, but then you would have to detect the endianness of your platform. I believe we work pretty hard not to know that (though I may have reintroduced that knowledge in mpi at one point).

3) your formulation is more clear about what is happenning. I'll correct my  code.

As far as loading the legacy db. I had a couple of reasons for that.
first, there is a lot of code in softoken whose sole purpose is to service the
legacy database. You can see that in the size reduction of softoken itself, and the actual size of lgdbm.
second, all dependencies on libdbm are now pushed to libdbm.
Finally, in the normal operation (the case where we are using the new shared db), you never need to load the legacy db. You could even delete it (platforms that have never stored keys in legacy db's would never need to build or ship lgdbm.

bob
A request:  Please change the name "lgdbm" to something else.  
It's unpronouncable, and who knows what it means?  Legacy DBM? 
If that's it, let's call it "LegacyDB".  
We don't need to cram names into 8.3 any more.  
done, execept I made it all lower case as we have no mixed cased library names in NSS (including things like secutil, etc).

bob
Nelson, actually we do, shared librariy names on OS/2 have an 8.3 limit in the kernel :( But legacydb fits 8.3 .
There's already a sqlite in mozilla/db (which is being updated via bug 341137) -- is there a way avoid having this code being duplicated in two spots? It's rather large.
Attached patch build on aixSplinter Review
Attachment #268310 - Flags: review?(rrelyea)
OS/2 build failure.

g++  -Zomf -Zdll -Zmap  -o OS22.45_gcc_DBG.OBJ/sqlite3.DLL OS22.45_gcc_DBG.OBJ/s
qlite.def -o OS22.45_gcc_DBG.OBJ/sqlite3.DLL OS22.45_gcc_DBG.OBJ/sqlite3.o

E:\DEV\NSS\tip2\mozilla\security\nss\lib\sqlite\OS22.45_gcc_DBG.OBJ\sqlite3.o(E:
\DEV\NSS\tip2\mozilla\security\nss\lib\sqlite\sqlite3.c) : error LNK2029: "_DosL
oadModule" : unresolved external
E:\DEV\NSS\tip2\mozilla\security\nss\lib\sqlite\OS22.45_gcc_DBG.OBJ\sqlite3.o(E:
\DEV\NSS\tip2\mozilla\security\nss\lib\sqlite\sqlite3.c) : error LNK2029: "_DosF
reeModule" : unresolved external
E:\DEV\NSS\tip2\mozilla\security\nss\lib\sqlite\OS22.45_gcc_DBG.OBJ\sqlite3.o(E:
\DEV\NSS\tip2\mozilla\security\nss\lib\sqlite\sqlite3.c) : error LNK2029: "_DosQ
ueryProcAddr" : unresolved external


There were 3 errors detected
make.exe[2]: Leaving directory `E:/DEV/NSS/tip2/mozilla/security/nss/lib/sqlite'

make.exe[1]: Leaving directory `E:/DEV/NSS/tip2/mozilla/security/nss/lib'

Seems like we need to link one of the OS/2 system libraries, probably os2836.lib .
Justin RE: extra copy of sqlite.

So NSS is a stand alone component. It needs to build without the rest of mozilla included. There are basically 3 configurations for NSS in this case:

1) built with mozilla - NSS will use the mozilla copy of sqlite3.
2) built for platforms with OS provided sqlite3 (most modern Linux systems, for instance) - NSS will use the system version.
3) All other cases: NSS will use it's own.

Currently sqlite is not built on the tip of mozilla, so NSS will use it's own until sqlite gets turned on in mozilla.

bob
Comment on attachment 268310 [details] [diff] [review]
build on aix

r= relyea

Exeacly how I envisioned it, Glen has verified that if fixes the AIX build.
Attachment #268310 - Flags: review?(rrelyea) → review+
Julian,

It sounds like OS/2 needs a patch similiar to the one glen attached. The file is mozilla/security/nss/lib/sqlite/config.mk

thanks for testing OS/2!
Attachment #268368 - Flags: review?(glen.beasley)
Attachment #268368 - Flags: review?(glen.beasley) → review+
> Currently sqlite is not built on the tip of mozilla
That's not correct. If I'm not mistaken, all apps build MOZ_STORAGE, and mozila/storage builds mozilla/db/sqlite3/src.
Just look at your profile when using Firefox trunk. I'm counting 6 .sqlite files in there.
but where is sqlite3.dll? or is the mozilla sqlite linked into some other component library?
Attached file Memory leaks - logs.
Bob,

This code brought some memory leaks. I'm attaching logs with details. One-line versions of stacks are in file foundleaks, full stacks in other files. Please check all stack containing functions with lg_ prefix.
Bob,

Re: comment 32,

I was building NSS standalone, doing gmake nss_build_all . So there was no mozilla copy. And OS/2 doesn't come with sqlite. So we were in the third case.

Re: comment 34,
Yes, I agree. When I have time tonight I will attach a similar patch. That may fix the build of sqlite3.dll . But I expect to run into further issues with legacydb3.dll . I think the OS/2 linker is able to product DLLs with filenames longer than 8.3, but the kernel is not able to load such DLLs. Some optimization went into Warp 3 that removed support for long DLL names (!) and it's never worked since. legacydb3 is 9 characters. So we'll likely need to find something shorter, at least for OS/2. It would be easier if we had a short cross-platform name so the browser products don't have to special-case OS/2 everywhere in their packaging rules.
Attached patch Rest of the Leaks (obsolete) — Splinter Review
These are the rest of the leaks. These are all one-time startup/shutdown, but should be cleaned up when we leave. This patch should do that.
Attachment #268441 - Flags: review?(glen.beasley)
Julien,

If you need to shorten it, you can do so in the following places:

1) mozilla/security/nss/lib/softoken/legacydb/config.mk - change the shared library name for OS/2.
2) mozilla/security/nss/lib/softoken/lgglue.c - change the name of the library to load.

Options for the name change: 
1) drop the version
2) legacy3
3) legdb3
4) lgdb3

Obviously, I would prefer the shortened name be OS/2 only.

Nelson, is there a similar issue for WINCE?

bob

Thanks, Bob.

Dropping the version would be OK. Or how about calling it nssdbm3 ?
Comment on attachment 268441 [details] [diff] [review]
Rest of the Leaks

the changes all look good.
Attachment #268441 - Flags: review?(glen.beasley) → review+
(In reply to comment #37)
> but where is sqlite3.dll? or is the mozilla sqlite linked into some other
> component library?
It's statically linked into Firefox.exe (or whatever appname) like almost everything else:
http://mxr.mozilla.org/seamonkey/source/db/sqlite3/src/Makefile.in#49
Comment on attachment 268441 [details] [diff] [review]
Rest of the Leaks

This patch is fine, but it uncovered an existing bug. I'll attach a new patch that fixes both the leaks and the bug soon.
Attachment #268441 - Attachment is obsolete: true
This is pretty much the same patch I obsoleted, except it has a fix to the problem the previous patch uncovered (actually it was 2 problems which colided).
Attachment #268512 - Flags: review?
Attachment #268512 - Flags: review? → review?(glen.beasley)
Attachment #268512 - Flags: review?(glen.beasley) → review+
It is unacceptable to me to impose 8.3 file names on ALL platforms so that
all platforms can have the same name, even on an old OS that is no longer
produced, and for which commercial software products are no longer being
produced.  If you want to have 8.3 names on OS/2, go ahead, but don't 
restrict any other platforms with that.  

Bob,  WinCE works with long file names too.
Coolk, then Julien, I'd say make the change OS/2 specific (rather than 8.3 specific). nssdbm3 sounds fine for an OS/2 name.

bob
Nelson, the 8.3 restriction in OS/2 is specific to DLL filenames, not any other files.

Is there anything wrong with using the name nssdbm3 for all platforms ? I personally prefer it to legacydb3 .
I agree.  nssdbm3 is an improvement over legacydb3.  so, go ahead.
Julien, can you attack a patch for OS/2. I'd like to get alpha nailed today.

Thanks,

bob
rename legacy database shared library to make OS/2 happy
Attachment #268909 - Flags: review?(kengert)
Comment on attachment 268909 [details] [diff] [review]
rename legacy database to nssdbm

r=kengert
Attachment #268909 - Flags: review?(kengert) → review+
Attachment #268923 - Flags: review?(kengert)
Comment on attachment 268923 [details] [diff] [review]
turn on shared db in FF 3.0

r=kengert
Attachment #268923 - Flags: review?(kengert) → review+
Um, we certainly don't need a second copy of sqlite in the tree.  NSS should be building from mozilla/db instead of importing its own copy.  As for it being statically linked we should look at that, but checking in a second copy and linking to two versions seems like a big no no.  This should have been resolved before this was checked in imho.
(9:17:42 AM) timeless: you guys realize we intentionally don't want ours to be in global space
(9:17:49 AM) timeless: we're afraid of running into a system version that's incompatible
(9:18:13 AM) timeless: my guess is a pointer table will be the solution
(9:18:30 AM) timeless: so either someone calls SetSQLLitePointers(...) from psm
(9:18:42 AM) timeless: or they don't, and if they don't, nss does loadlibrary(sqllite)
(9:19:20 AM) timeless: kaie: oh, for kicks, iirc we're compiling sqlite w/o threadsafety
(9:19:29 AM) timeless: i hope you guys don't expect it :)
(9:19:35 AM) timeless: (and i'd expect you do...)

FYI
stuart:
NSS needs to build standalone without Firefox, in fact that is quickly becoming the standard way of getting NSS is to build it by itself. The NSS build of sqlite3 has the ability to turn on or off, and use something else. On some platforms that have sqlite3 built in, we will use the system version. For Mozilla builds, I intend to use the mozilla version, however it's currently statically linked with the FF binary, so that's not going to happen in the alpha. There's a very old standing bug about this (bug 306907) which hasn't gotten much attention by anyone (including me). It's definately something that needs to be looked into for Beta.
Timeless points about threads-safety as well (we can build NSS for non-threadsafe sqlite, but we need to do so at compile time).

Emphasizing Bob's point: recall that NSS is used by MANY product, and FireFox
is just one of them.  Most of those products do not build NSS from source
themselves, and instead take the NSS team's standard binary builds of shared
libraries.  Some do build from sources. The NSS team needs to deliver a 
complete set of working NSS shared libs to any of those products that take
binaries.  There also needs to be a single tag with which people can pull
the sources from the mozilla repository and then build and get a working
set of NSS shared libs (incuding SQLite).  NSS doesn't need its own SQLite
sources IFF there is a set of SQLite sources in the tree that NSS can use
as-is to get the SQLite shared library it needs.  If the sources to mozilla's 
current sqlite do not produce a shared lib that may be used with NSS's other 
shared libs, that's a non-starter.  

Finally we DO want a "thread safe" build of SQLite for most products.  
So the sources from which SQLite is built must be able to produce such a
build.  

Since NSS is now capable of building a SQLite shared lib, maybe mozilla 
should start to use that shared lib.  (Just a thought)
We will not ship two sqlites in Firefox 3. We should ship thread-safe sqlite, it will be needed soon enough if the Google Gears stuff takes off and makes headway in the WHAT-WG. The solution I see is to use a shared, thread-safe sqlite library. It should be built from mozilla/db/sqlite3, and that source should be at a version that all parties agree upon for any given Mozilla product release.

/be
Bob,

Re: comment 51
I will attach an OS/2 patch tonight to add the linking of the system library. Sorry, I forgot about over the weekend and didn't check my Sun email when I was off.
Attachment #269057 - Flags: review?(rrelyea)
Comment on attachment 268909 [details] [diff] [review]
rename legacy database to nssdbm

I confirmed that the tree was broken on OS/2 with the "legacydb3.dll" name. The loader was getting an error 123 which is ERROR_INVALID_NAME. The error went undetected in the build even though shlibsign failed. But none of the tools would run after the build due to this error.

After I updated my tree to pick up this patch and rebuilt with nssdbm3.dll, everything was fine.
Attachment #268909 - Flags: superreview+
Attachment #269057 - Flags: review?(rrelyea) → review+
Comment on attachment 269057 [details] [diff] [review]
Proper includes for OS/2 (checked in)

Thanks for the review, Bob.

Checking in sqlite3.c;
/cvsroot/mozilla/security/nss/lib/sqlite/sqlite3.c,v  <--  sqlite3.c
new revision: 1.4; previous revision: 1.3
done
Attachment #269057 - Attachment description: Proper includes for OS/2 → Proper includes for OS/2 (checked in)
Shouldn't those changes be pushed up into sqlite as well?
Shawn,

Yes, they should. I wish we only had one source tree for it for sustainability. I hope that issue gets taken care of soon.

I was under the impression sqlite was already part of the browser build on OS/2, so I am not sure why I had to make those changes to build on my OS/2 machine. It could be that the version of the OS/2 toolkit I am using is different than the one the Mozilla browser OS/2 maintainers are using. Regardless, the change I made to sqlite cannot hurt anyone.
By chance is there a sqlite Ticket number that we can watch for getting them pushed upstream?
Shawn, Julien, I will try to remember to add
# define INCL_DOSMODULEMGR
upstream in SQLite over the weekend. Then it should at least be in 3.4.1. I had never tested the amalgamation version, maybe that's why it always worked for me without that extra #define...
(In reply to comment #68)
> I will try to remember to add # define INCL_DOSMODULEMGR upstream in
> SQLite over the weekend.

Done that, but see Bug 385518. Three more lines will be necessary in the NSS version, too, if compiling/linking with OS/2 high memory support.
Depends on: 389568
Depends on: 391291
Depends on: 391292
Depends on: 391294
Depends on: 391296
Depends on: 392521
Priority: P2 → P1
No longer blocks: 135061
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: