Directory Server ran out of file descriptors on HP-UX 11i. The problem went away after they recompiled all their components with this flag. Nobody ever narrowed down the problem to NSPR, but it was suspected that NSPR, DBM and NSS need this flag. Sun's version of NSPR 4.1.3 was built with this flag set.

From the HP-UX Release Notes:

"This feature is optional because it will not be often used and to minimize the impact on existing code. An application requiring more than 2048 file descriptors must be recompiled with the new symbol _USE_BIG_FDS defined. To do this, add the flag -D_USE_BIG_FDS to the compile command in the application's makefile. Or, the symbol can be defined at the beginning of every application source file (via #define _USE_BIG_FDS)...."

Index: config/HP-UX.mk
===================================================================
RCS file: /cvsroot/mozilla/nsprpub/config/HP-UX.mk,v
retrieving revision 188.8.131.52
diff -r184.108.40.206 HP-UX.mk
78a79,82
> ifeq ($(OS_RELEASE),B.11.11) #sonmi
> OS_CFLAGS += -D_USE_BIG_FDS
> endif
>

See also bug #184514. I have been unable to get more information about the problem.
I seem to have found a place in PR_Poll where it calls the FD_SET macro before checking that the fd is less than FD_SETSIZE. This would cause memory corruption because NSPR does not increase FD_SETSIZE to a larger value, which is another bug. Could you ask the NSPR customer if it was PR_Poll that was having the problem?
Created attachment 111512
Initial patch

-D_USE_BIG_FDS will increase the stack memory usage of PR_Poll by about 21KB. Since NSPR's default thread stack is 64KB, this increases the likelihood of a crash due to stack overflow. It may be a good idea to also increase the default thread stack size.

In addition to adding -D_USE_BIG_FDS, this patch fixes two bugs.

1. _pr_poll_with_select does not make sure that osfd is less than FD_SETSIZE before calling the FD_SET macro. I think this is the problem the NSPR customer ran into and this fix alone should solve their problem.
2. We failed to increase FD_SETSIZE to 16*1024 because _PR_POLL_WITH_SELECT was defined too late. It needs to be defined on the compile command line.
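The 21KB figure can be checked with a rough calculation. This is only a sketch under two assumptions that are not stated in this bug: that the select()-based poll path keeps three fd_set locals (read, write, except) on the stack, and that FD_SETSIZE grows from 2048 to 60000 when _USE_BIG_FDS is defined. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for one fd_set bitmap that holds nfds descriptors, rounded
 * up to whole bytes. Real headers may round up to a wider word size. */
static size_t fdset_bytes(size_t nfds)
{
    return (nfds + 7) / 8;
}

/* Extra stack used when FD_SETSIZE grows, ASSUMING the poll routine
 * declares three fd_set locals (read, write, except). */
static size_t extra_stack_bytes(size_t old_setsize, size_t new_setsize)
{
    assert(new_setsize >= old_setsize);
    return 3 * (fdset_bytes(new_setsize) - fdset_bytes(old_setsize));
}
```

Under those assumptions, extra_stack_bytes(2048, 60000) is 3 * (7500 - 256) = 21732 bytes, which is consistent with "about 21KB" and is roughly a third of a 64KB thread stack.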
Is there a way we can get the definitive answer w.r.t. dangers of using this define? Can we maybe use the Netscape Directory's relationship with HP to understand what the dangers are?
I wrote a small test and verified that stack memory is corrupted if we call PR_Poll on an osfd that is >= FD_SETSIZE, and my patch fixes the stack memory corruption.
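The range check described for _pr_poll_with_select can be sketched as follows. This is an illustration only, not the code from the attached patches; checked_fd_set is a hypothetical helper name, and how the real patch reports the error is not shown in this bug.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper for illustration; not the code from the patch.
 * Adds osfd to the set only if it fits in the fd_set bitmap.
 * Returns 0 on success, -1 if osfd is out of range. */
static int checked_fd_set(int osfd, fd_set *set)
{
    assert(set != NULL);
    if (osfd < 0 || osfd >= FD_SETSIZE) {
        /* FD_SET here would write outside the bitmap, which for a
         * stack-allocated fd_set corrupts neighboring stack memory. */
        return -1;
    }
    FD_SET(osfd, set);
    return 0;
}
```

A caller would treat a -1 return as a failed poll on that descriptor instead of passing the set on to select().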
Created attachment 111534
Description of _USE_BIG_FDS in HP-UX B.10.20 Release Notes

Found this with a Google search for _USE_BIG_FDS.
http://wwwinfo.cern.ch/pdp/as/file/hpux10/RN05961020.html

Document Id: RN05961020
Date Loaded: 08-23-96
Description: HP-UX 10.20 Release Notes, Major Changes for HP-UX 10.10, Part 2
Created attachment 111538
Small patch

The smallest patch to fix the memory corruption problem in PR_Poll.
Created attachment 111539
Full patch

I forgot to mention that the small patch does not define _USE_BIG_FDS. It merely fixes (what I believe is) the root cause, the memory corruption in PR_Poll.

This patch does everything -- it fixes the root cause, defines _USE_BIG_FDS, and increases the default stack size for NSPR threads because of the larger size of fd_set.
Attachment #111512 - Attachment is obsolete: true
I believe we at Sun need to re-evaluate if we want to ship the DBM 1.55 / NSPR 4.1.3 / NSS 3.3.3 / JSS 220.127.116.11 combination as it presently is, with the -D_USE_BIG_FDS set, but without Wan-Teh's patch. I am afraid we might have introduced a new bug, and will have to ship a complete patch at least for HPUX in a short time.
Created attachment 112098
Full patch for the NSPR tip (4.3)

I've checked in this patch on the NSPR tip.
Do you still need this fixed in NSPR 4.1.3?
I would like it to be checked into the branch, but it is too late for 4.1.3. 4.1.3 was released 1/9, it has the _USE_BIG_FDS set, but not the patch. The customer has been informed about the potential problems and has agreed to run specific tests. I would like to put it into the patch that we will be doing in a short time.
Do you mean you want the "full patch" (attachment 111539) checked into the NSPRPUB_RELEASE_4_1_BRANCH? Are you planning to do an NSPR 4.1.4 patch release?
I don't think we will call it NSPR 4.1.4; there will be a solarispatch to NSPR 4.1.3 / NSS 3.3.3 and JSS 18.104.22.168. All components need to be built separately, but they will be patched together to reduce work for Solaris users / administrators.

Since we need to rebuild anyway because of the NSS 3.3.3 memory leaks, it is hardly more work to also rebuild NSPR with it. I am concerned about the possible memory corruption, having set the USE_BIG_FDS but not the rest of the changes from attachment 111534.

I can't make the decision if that will be on the NSPRPUB_RELEASE_4_1_BRANCH (which would be my preference though), or if we should create a new branch off NSPR_4_1_3_RTM. I think it is up to Michael and Wan-Teh to decide that.
I am confused about what you plan to do. Sun released NSPR 4.1.3, and there will be a "solarispatch" to NSPR 4.1.3, which won't be called NSPR 4.1.4. What will it be called? NSPR 22.214.171.124? (We don't use four-level version numbers.)

Another issue: the cvs tag for NSPR 4.1.x should be NSPRPUB_RELEASE_4_1_x, not NSPRPUB_RELEASE_4_1_x_RTM or NSPR_4_1_x_RTM. If you are making Sun's own NSPR release, please use a cvs tag that clearly indicates it's a Sun-only release.

You are welcome to make official NSPR releases off the NSPRPUB_RELEASE_4_1_BRANCH as long as I've approved the contents.
> I am confused about what you plan to do.

I am too. This is why I think Michael should decide what to do, discuss it with you and then tell me.

> Sun released NSPR 4.1.3, and there will be a "solarispatch"
> to NSPR 4.1.3, which won't be called NSPR 4.1.4. What will
> it be called? NSPR 126.96.36.199? (We don't use four-level version
> numbers.)

I have no idea what it will be called. All I know is that it is a patch and not a release, which is technically a similar thing but politically different. My preferred name would be NSS 3.3.3 SP1 / NSPR 4.1.3 SP1 / JSS 188.8.131.52 SP1. When all components are built they will be bundled up with the buildpatch, and then Solaris users can install them.

> Another issue: the cvs tag for NSPR 4.1.x should be
> NSPRPUB_RELEASE_4_1_x, not NSPRPUB_RELEASE_4_1_x_RTM or
> NSPR_4_1_x_RTM.

Sorry about the NSPRPUB_RELEASE_4_1_3_RTM. I could rename it to NSPRPUB_RELEASE_4_1_3 if that helps? I made both tags, since everything in the Component Pack needs to have matching names and tags. Since the NSPR directory in the component pack is called NSPR_4_1_3_RTM, and not nspr20/v4.1.3 as the individual component is still called, I made an additional tag.

> If you are making Sun's own NSPR release,
> please use a cvs tag that clearly indicates it's a
> Sun-only release.
> You are welcome to make official NSPR
> releases off the NSPRPUB_RELEASE_4_1_BRANCH as long as I've
> approved the contents.

I thought you had approved the contents, and disapproved my 3 other checkins? Remember that I had to copy 3 different unapproved files into the build tree because we needed to get the release done?
The NSPR 4.1.3 release you did at Sun is not an official NSPR release because it contains changes not checked into the NSPRPUB_RELEASE_4_1_BRANCH. To avoid confusion, I will skip the 4.1.3 version number. The next official release off the NSPRPUB_RELEASE_4_1_BRANCH will be 4.1.4.

Regarding the _USE_BIG_FDS change, I asked for more information so that I could nail down the real bug. Because I couldn't get any additional information from the customer, I had to resort to code inspection and writing my own test case to reproduce the problem, which took more time.

Regarding the Solaris package changes, I am just not qualified to review them. In the future I will approve any Solaris package change you need to make.
I checked in the "full patch" (attachment 111539) on the NSPRPUB_RELEASE_4_1_BRANCH. Note that I set the NSPR version to "4.1.4 Beta" on the NSPRPUB_RELEASE_4_1_BRANCH because 4.1.3 has been released.

Marked the bug fixed. This bug has been fixed on the NSPR trunk (4.3) and NSPRPUB_RELEASE_4_1_BRANCH.
Status: NEW → RESOLVED
Last Resolved: 15 years ago
Resolution: --- → FIXED