Closed Bug 188439 Opened 17 years ago Closed 17 years ago

_USE_BIG_FDS flag needed on HPUX 11i

Categories

(NSPR :: NSPR, defect, P1)

4.1.3
HP
HP-UX
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: sonja.mirtitsch, Assigned: wtc)

Details

Attachments

(4 files, 1 obsolete file)

Directoryserver ran out of file descriptors on HPUX 11i. The problem went away
after they recompiled all their components with this flag.
Nobody ever narrowed down the problem to NSPR, but it was suspected that NSPR,
DBM and NSS need this flag.

Sun's version of NSPR 4.1.3 was built with this flag set.

From the HP-UX Release Notes: "This feature is optional because it will
not be often used and to minimize the impact on existing code.  An application
requiring more than 2048 file descriptors must be recompiled with the new symbol
_USE_BIG_FDS defined.  To do this, add the flag -D_USE_BIG_FDS to the compile
command in the application's makefile.  Or, the symbol can be defined at the
beginning of every application source file (via #define _USE_BIG_FDS)...."
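A minimal sketch of the source-level option the release notes describe, assuming the define has to appear before the first system header that declares fd_set. The helper names fd_set_capacity and fd_set_size_bytes are hypothetical and are not part of NSPR or HP-UX. On platforms other than HP-UX the macro is simply ignored, and <sys/select.h> is used here as the POSIX header; the exact header on a given HP-UX release is an assumption.

```c
/* The define must precede every header that declares fd_set;
 * otherwise the default FD_SETSIZE has already been baked in. */
#define _USE_BIG_FDS

#include <stddef.h>
#include <sys/types.h>
#include <sys/select.h>

/* Number of descriptors an fd_set can represent in this build. */
int fd_set_capacity(void)
{
    return FD_SETSIZE;
}

/* Size in bytes of one fd_set object in this build. */
size_t fd_set_size_bytes(void)
{
    return sizeof(fd_set);
}
```

Printing both values from a small program built with and without the flag is one way to confirm that the flag actually reached every compilation unit.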



Index: config/HP-UX.mk
===================================================================
RCS file: /cvsroot/mozilla/nsprpub/config/HP-UX.mk,v
retrieving revision 3.12.2.2
diff -r3.12.2.2 HP-UX.mk
78a79,82
> ifeq ($(OS_RELEASE),B.11.11) #sonmi
> OS_CFLAGS               += -D_USE_BIG_FDS
> endif
> 

see also bug #184514


I have been unable to get more information about the problem.
I seem to have found a place in PR_Poll where it
calls the FD_SET macro before checking that the
fd is less than FD_SETSIZE.  This would cause
memory corruption because NSPR does not increase
FD_SETSIZE to a larger value, which is another
bug.

Could you ask the NSPR customer if it was PR_Poll
that was having the problem?
Attached patch: Initial patch (obsolete)
-D_USE_BIG_FDS will increase the stack memory usage of
PR_Poll by about 21KB.  Since NSPR's default thread
stack is 64KB, this increases the likelihood of a crash
due to stack overflow.  It may be a good idea to also
increase the default thread stack size.
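A rough back-of-the-envelope check of the 21KB figure, written as a small C sketch. It rests on assumptions that are not stated in this bug: that the select-based PR_Poll keeps three fd_set objects (read, write, except) on its stack, that the default FD_SETSIZE is 2048, and that _USE_BIG_FDS raises it to 60000. The function names are hypothetical, and a real fd_set may be rounded up to a whole number of words.

```c
#include <stddef.h>

/* Bytes needed for a bitmap with one bit per descriptor. */
size_t fd_bitmap_bytes(size_t nfds)
{
    return (nfds + 7) / 8;
}

/* Extra stack consumed when a function holds `nsets` fd_set objects
 * and the descriptor limit grows from old_nfds to new_nfds. */
size_t extra_stack_bytes(size_t old_nfds, size_t new_nfds, size_t nsets)
{
    return nsets * (fd_bitmap_bytes(new_nfds) - fd_bitmap_bytes(old_nfds));
}
```

Under those assumptions, extra_stack_bytes(2048, 60000, 3) gives 21732 bytes, or about 21.2KB, which is roughly a third of a 64KB thread stack.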

In addition to adding -D_USE_BIG_FDS, this patch fixes
two bugs.

1. _pr_poll_with_select does not make sure that osfd
is less than FD_SETSIZE before calling the FD_SET macro.
I think this is the problem the NSPR customer ran into
and this fix alone should solve their problem.

2. We failed to increase FD_SETSIZE to 16*1024 because
_PR_POLL_WITH_SELECT was defined too late.  It needs to
be defined on the compile command line.
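The first fix can be illustrated with a small sketch. This is not the code from the attached patch; checked_fd_set is a hypothetical helper that shows the bounds check which has to happen before the FD_SET macro touches the bitmap.

```c
#include <sys/select.h>

/* Add osfd to the set only if it fits in the fd_set bitmap.
 * Returns 0 on success and -1 if osfd is out of range, in which
 * case the set is left untouched.  Calling FD_SET with an osfd
 * that is >= FD_SETSIZE would write past the end of the fd_set. */
int checked_fd_set(int osfd, fd_set *set)
{
    if (osfd < 0 || osfd >= FD_SETSIZE) {
        return -1;
    }
    FD_SET(osfd, set);
    return 0;
}
```

A caller would report an error for that descriptor when the helper returns -1 instead of passing it to select. The second fix is a build-ordering issue: a macro that controls FD_SETSIZE only takes effect if it is visible before the system headers are included, which is why defining it on the compile command line is the safe choice.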
Is there a way we can get the definitive answer w.r.t. dangers of using this 
define?  Can we maybe use the Netscape Directory's relationship with HP to 
understand what the dangers are?
I wrote a small test and verified that stack memory is
corrupted if we call PR_Poll on an osfd that is >=
FD_SETSIZE, and my patch fixes the stack memory
corruption.

Found this with a Google search for _USE_BIG_FDS.

http://wwwinfo.cern.ch/pdp/as/file/hpux10/RN05961020.html

Document Id: RN05961020
Date Loaded: 08-23-96

Description: HP-UX 10.20 Release Notes, Major Changes for HP-UX 10.10, Part 2
Attached patch: Small patch
The smallest patch to fix the memory corruption
problem in PR_Poll.
Attached patch: Full patch
I forgot to mention that the small patch does not
define _USE_BIG_FDS.  It merely fixes (what I
believe is) the root cause, the memory corruption
in PR_Poll.

This patch does everything -- it fixes the root
cause, defines _USE_BIG_FDS, and increases the
default stack size for NSPR threads because of
larger size of fd_set.
Attachment #111512 - Attachment is obsolete: true
I believe we at Sun need to re-evaluate whether we want to ship the DBM 1.55 /
NSPR 4.1.3 / NSS 3.3.3 / JSS 3.1.2.2 combination as it presently is, with
-D_USE_BIG_FDS set but without Wan-Teh's patch.
I am afraid we might have introduced a new bug and will have to ship a complete
patch, at least for HPUX, shortly.
Priority: -- → P1
Target Milestone: --- → 4.3
I've checked in this patch on the NSPR tip.
Do you still need this fixed in NSPR 4.1.3?
I would like it to be checked into the branch, but it is too late for 4.1.3.

4.1.3 was released 1/9. It has _USE_BIG_FDS set, but not the patch. The
customer has been informed about the potential problems and has agreed to run
specific tests.

I would like to put it into the patch that we will be doing shortly.
Do you mean you want the "full patch" (attachment 111539)
checked into the NSPRPUB_RELEASE_4_1_BRANCH?

Are you planning to do a NSPR 4.1.4 patch release?
I don't think we will call it NSPR 4.1.4. There will be a solarispatch to NSPR
4.1.3 / NSS 3.3.3 and JSS 3.1.2.2. All components need to be built separately,
but they will be patched together to reduce work for Solaris users / administrators.

Since we need to rebuild anyway because of the NSS 3.3.3 memory leaks, it is
hardly more work to also rebuild NSPR with it. I am concerned about the possible
memory corruption from having set _USE_BIG_FDS without the rest of the changes
from attachment 111534.

I can't decide whether that will be on the NSPRPUB_RELEASE_4_1_BRANCH
(which would be my preference) or whether we should create a new branch off
NSPR_4_1_3_RTM. I think it is up to Michael and Wan-Teh to decide that.


I am confused about what you plan to do.

Sun released NSPR 4.1.3, and there will be a "solarispatch"
to NSPR 4.1.3, which won't be called NSPR 4.1.4.  What will
it be called?  NSPR 4.1.3.1?  (We don't use four-level version
numbers.)

Another issue: the cvs tag for NSPR 4.1.x should be
NSPRPUB_RELEASE_4_1_x, not NSPRPUB_RELEASE_4_1_x_RTM or
NSPR_4_1_x_RTM.  If you are making Sun's own NSPR release,
please use a cvs tag that clearly indicates it's a
Sun-only release.  You are welcome to make official NSPR
releases off the NSPRPUB_RELEASE_4_1_BRANCH as long as I've
approved the contents.
> I am confused about what you plan to do.

I am too. This is why I think Michael should decide what to do, discuss it with
you and then tell me.

> Sun released NSPR 4.1.3, and there will be a "solarispatch"
> to NSPR 4.1.3, which won't be called NSPR 4.1.4.  What will
> it be called?  NSPR 4.1.3.1?  (We don't use four-level version
> numbers.)

I have no idea what it will be called. All I know is that it is a patch and not
a release, which is technically a similar thing but politically different. My
preferred name would be NSS 3.3.3 SP1 / NSPR 4.1.3 SP1 / JSS 3.1.2.2 SP1. When
all components are built they will be bundled up with the buildpatch, and then
Solaris users can install them.

> Another issue: the cvs tag for NSPR 4.1.x should be
> NSPRPUB_RELEASE_4_1_x, not NSPRPUB_RELEASE_4_1_x_RTM or
> NSPR_4_1_x_RTM.  

Sorry about the NSPRPUB_RELEASE_4_1_3_RTM tag. I could rename it to
NSPRPUB_RELEASE_4_1_3 if that helps.

I made both tags, since everything in the Component Pack needs to have matching
names and tags. The NSPR directory in the component pack is called
NSPR_4_1_3_RTM, unlike the individual component, which is still called
nspr20/v4.1.3, so I made an additional tag.

> If you are making Sun's own NSPR release,
> please use a cvs tag that clearly indicates it's a
> Sun-only release. 
 

> You are welcome to make official NSPR
> releases off the NSPRPUB_RELEASE_4_1_BRANCH as long as I've
> approved the contents.

I thought you had approved the contents and disapproved my 3 other checkins?
Remember that I had to copy 3 different unapproved files into the build tree
because we needed to get the release done?
The NSPR 4.1.3 release you did at Sun is not an
official NSPR release because it contains changes
not checked into the NSPRPUB_RELEASE_4_1_BRANCH.
To avoid confusion, I will skip the 4.1.3
version number.  The next official release off
the NSPRPUB_RELEASE_4_1_BRANCH will be 4.1.4.

Regarding the _USE_BIG_FDS change, I asked for
more information so that I could nail down the
real bug.  Because I couldn't get any additional
information from the customer, I had to resort to
code inspection and writing my own test case to
reproduce the problem, which took more time.

Regarding the Solaris package changes, I am just
not qualified to review them. In the future I
will approve any Solaris package change you need
to make.
I checked in the "full patch" (attachment 111539)
on the NSPRPUB_RELEASE_4_1_BRANCH.

Note that I set the NSPR version to "4.1.4 Beta"
on the NSPRPUB_RELEASE_4_1_BRANCH because 4.1.3
has been released.

Marked the bug fixed.  This bug has been fixed on
the NSPR trunk (4.3) and NSPRPUB_RELEASE_4_1_BRANCH.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED