Closed
Bug 685592
Opened 13 years ago
Closed 13 years ago
File descriptor leak after "service httpd reload"
Categories
(NSS :: Libraries, defect, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
3.13
People
(Reporter: elio.maldonado.batiz, Assigned: rrelyea)
Details
Attachments
(1 file, 2 obsolete files)
As originally reported aginst RHEL 5 Pavel Moravec 2011-06-15 03:55:12 EDT Description of problem: When mod_nss of version >=1.0.8 installed, service httpd restart causes FD leak - some descriptors for pipes are not properly closed. Version-Release number of selected component (if applicable): latest httpd + mod_nss-1.0.3-4.el5 - no issue latest httpd + mod_nss-1.0.3-8.el5.x86_64 - No issue. latest httpd + mod_nss-1.0.8-3.el5.x86_64 - Issue is seen latest httpd + mod_nss-1.0.8-4.el5_6.1.x86_64 - Issue is seen How reproducible: 100% Steps to Reproduce: 1. install mod_nss (of version at least 1.0.8-3) and httpd 2. service httpd start 3. Check FD count for httpd processes 4. service httpd reload 5. Check FD count for httpd processes Actual results: Number of FD increased by 198 (e.g. from 1686 to 1884). Any repetition of step 4. causes the same increase. Expected results: FD leaks doesn't happen. Additional info: Due to dependency of the FD leak on mod_nss version, the bug seems to be introduced by rebasing mod_nss to 1.0.8 (#588858). nss_engine_init.c seems be the core stuff, nss_init_Child, nss_init_ModuleKill and nss_init_ChildKill in particular.
Reporter | ||
Comment 1•13 years ago
|
||
Additional info he bug is in nss instead of mod_nss. service http reload causes SSL_ShutdownServerSessionIDCache and SSL_ConfigMPServerSIDCache being called. However SSL_ShutdownServerSessionIDCache and CloseCache called from it does not destroy mutexes, as everInherited variable is set to true there (and it can be set to PR_FALSE only in InitCache method, nowhere else). Proposed patch (also attached): # cat nss-3.12.8-fdleak.patch diff -up nss-3.12.8/mozilla/security/nss/lib/ssl/sslsnce.c.fdleak nss-3.12.8/mozilla/security/nss/lib/ssl/sslsnce.c --- nss-3.12.8/mozilla/security/nss/lib/ssl/sslsnce.c.fdleak 2011-06-29 12:45:17.000000000 -0400 +++ nss-3.12.8/mozilla/security/nss/lib/ssl/sslsnce.c 2011-06-29 12:46:05.000000000 -0400 @@ -1379,6 +1379,7 @@ SSL_ConfigServerSessionIDCache( int SECStatus SSL_ShutdownServerSessionIDCacheInstance(cacheDesc *cache) { + cache->sharedCache->everInherited=PR_FALSE; CloseCache(cache); return SECSuccess; } # This a naive patch attempt and will show a better one.
Reporter | ||
Comment 2•13 years ago
|
||
Bob Relyea 2011-08-22 19:35:38 EDT So the entire purpose of the 'everInherited' flag is to prevent wanton destruction of sslMutex locks by one process that closes out while other processes continue to run. The patch basically defeats that purpose (see comment at line 1030 of sslsnce.c). That being said, sslMutex_Destroy on Unix platforms does the following: 1) close our process's copies of the pipe used to do the locking. 2) clears out the Mutex state in our copy of the mutex. The down side here is what if Mutex is in shared memory -- which we see is the case in line 1253 of sslsnce.c. When we clear out the state, we clear out the state for all Mutexes. If a single process shuts down, then all the mutexes will go way, so that means this patch, as is, is unacceptable. We need a new function, sslMutex_Free() which frees this process's resources associated with the mutex, but leaves the rest of the mutext alone. This can be called when everInherited is set.
Reporter | ||
Comment 3•13 years ago
|
||
Created attachment 519371 [details]
Patch with does not clobber the mutex, but only frees process specific
resources.
This patch accomplishes the goals of the previous patches without breaking
certain instances where a server may shutdown only one of it's processes and
keep the other processes running.
---------
I'll attach Bob's patch next.
Reporter | ||
Comment 4•13 years ago
|
||
This Bob Relyea's patch for nss 3.12.10 code.
Reporter | ||
Updated•13 years ago
|
Attachment #559214 -
Flags: review?(nelson)
Reporter | ||
Comment 5•13 years ago
|
||
Comment on attachment 559214 [details] [diff] [review] Patch with does not clobber the mutex, but only frees process specific resources. Adding Wan-Teh in case Nelson is tied up at the moment.
Attachment #559214 -
Flags: superreview?(wtc)
Updated•13 years ago
|
Assignee: nobody → emaldona
Priority: -- → P2
Target Milestone: --- → 3.13
Comment 6•13 years ago
|
||
Comment on attachment 559214 [details] [diff] [review] Patch with does not clobber the mutex, but only frees process specific resources. Elio: thanks for the patch. I suggest some changes. 1. sslmutex.c >+j >+SECStatus >+sslMutex_Free(sslMutex *pMutex) Remove the "j" before the "SECStatus" line. It is unfortunate that sslMutex_Free has to duplicate a lot of code, but I discovered that it requires refactoring to share code with sslMutex_Destroy. It may be worth doing though. I believe the Windows version of sslMutex_Free should call sslMutex_2LevelDestroy because pMutex->u.sslLock (a PRLock) is a process-owned resource. 2. sslmutex.h > extern SECStatus sslMutex_Destroy(sslMutex *sem); > >+/* free process owned resources only */ >+extern SECStatus sslMutex_Free(sslMutex *sem); The comment should also say "without modifying shared memory". Nit: we should pick a better name because "Free" and "Destroy" are too similar. Perhaps "DestroyInherited" or "FreeInherited"? 3. sslsnce.c > ** be) in use by multiple processes. We do not wish to destroy > ** the mutexes while they are still in use. > */ This comment should be updated, and moved inside the for loop below, because it describes the cache->sharedCache->everInherited check. > for (; locks_initialized > 0; --locks_initialized, ++pLock ) { >- sslMutex_Destroy(&pLock->mutex); >+ if (PR_TRUE == cache->sharedCache->everInherited) { >+ /* free process owned resources ownly */ >+ sslMutex_Free(&pLock->mutex); >+ } else { >+ sslMutex_Destroy(&pLock->mutex); >+ } Remove "PR_TRUE == ". Fix the typo "ownly".
Attachment #559214 -
Flags: superreview?(wtc) → superreview-
Comment 7•13 years ago
|
||
Comment on attachment 559214 [details] [diff] [review] Patch with does not clobber the mutex, but only frees process specific resources. In sslmutex.c, there are two more implementations of sslMutex_Destroy: http://mxr.mozilla.org/security/ident?i=sslMutex_Destroy I think we should add two more implementations of sslMutex_Free to avoid undefined symbols on those platforms.
Comment 8•13 years ago
|
||
Elio, Bob: when a process terminates, all the file descriptors are closed automatically by the OS. So could you clarify what you meant by the FD leak? I guess a tool such as Purify or valgrind reports the FD leak, but the FDs are still closed by the OS, right?
Updated•13 years ago
|
Attachment #559214 -
Attachment is patch: true
Reporter | ||
Updated•13 years ago
|
Assignee: emaldona → rrelyea
Assignee | ||
Comment 9•13 years ago
|
||
The leak is coming when the main process shuts everything down then restarts. The process itself never shuts down, So the file descriptors hang around forever. In the actual application, this happens when we do an httpd restart, which happens whenever the customer changes the configuration of server. Eventually the sever runs out of file descriptors after a number of restarts. Unfortunately it's not a valgrind leak, which I would have actively ignored. bob
Assignee | ||
Comment 10•13 years ago
|
||
> I believe the Windows version of sslMutex_Free should > call sslMutex_2LevelDestroy because pMutex->u.sslLock > (a PRLock) is a process-owned resource. With pMutex->u.sslLock in shared memory, so are you sure? The init code does the following... PR_ASSERT(sem); if (sem) { /* we need to reset the sslLock in the children or the single_process init function below will assert */ sem->u.sslLock = NULL; } return single_process_sslMutex_Init(sem); This means there is only one Mutex (the last one created). If any process frees it, the rest of the processes are hosed if they try to use it. > It is unfortunate that sslMutex_Free has to duplicate > a lot of code, but I discovered that it requires > refactoring to share code with sslMutex_Destroy. > It may be worth doing though. right. I discovered the same thing, (my initial attempt was to have sslMutex_Destroy call sslMutex_Free. The refactor got pretty ugly pretty quickly, so I preferred the simpler patch (particularly since the duplicated code wasn't all the much code). > The comment should also say "without modifying shared memory". OK.. > Nit: we should pick a better name because "Free" and "Destroy" >are too similar. Perhaps "DestroyInherited" or "FreeInherited"? I think Destroy is fine. I'm open for suggestions for a better name for the 'Free' call. > Remove "PR_TRUE == ". I prefer the style with PR_TRUE == removed, I was just trying to follow the prevailing style in lib/ssl (though I now see it's a mix). > Fix the typo "ownly". ok > I think we should add two more implementations of sslMutex_Free to > avoid undefined symbols on those platforms. Ok
Assignee | ||
Comment 11•13 years ago
|
||
Incorporate wtc's review comments: 1) I've merged sslMutex_Free back into sslMutex_Destroy, and gave sslMutex_Destroy a single boolean to tell us whether or not to destroy the whole mutex, or just the process local pieces. This solved 2 of wtc's issues: duplicate code and naming. It also helped with the 2 othter sslMutex_Destroy functions... In one case we simply needed to add the parameter. 2) I've maintained not calling the level2 mutex destroy for the the reasons outlined in my comment 10. 3) The call to sslMutex_Destroy in sslsnce.c no longer needs to actually check toe state of the everInherited flag, is simply passes it to mutex_Destroy which treats is at the localProcess flag. 4) I've updated the comment.
Attachment #559214 -
Attachment is obsolete: true
Attachment #559214 -
Flags: review?(nelson)
Attachment #563795 -
Flags: review?(wtc)
Comment 12•13 years ago
|
||
Comment on attachment 563795 [details] [diff] [review] V2: Patch does not clober the mutex, only frees process specific resources Thanks for the patch. The patch has one bug. You can check in after fixing the bug. 1. lib/ssl/sslmutex.c > #ifdef WINNT >- /* on NT, get rid of the PRLock used for fibers within a process */ >- retvalue = sslMutex_2LevelDestroy(pMutex); >+ if (processLocal) { >+ /* Even though the NT mutex is created by each process, we threw away all but >+ * one mutex. This means we can't destroy the one remaining one. */ >+ /* On NT, get rid of the PRLock used for fibers within a process. */ >+ retvalue = sslMutex_2LevelDestroy(pMutex); >+ } > #endif BUG: please revert this change. The only thing sslMutex_2LevelDestroy does is a PR_DestroyLock call. Here is the relevant code: http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/security/nss/lib/ssl/sslmutex.h&rev=1.12&mark=74,79#66 http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/security/nss/lib/ssl/sslmutex.c&rev=1.25&mark=362,364#362 http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/security/nss/lib/ssl/sslmutex.c&rev=1.25&mark=56,64#56 NSPR locks are process local. Also, I believe pMutex does NOT point to shared memory for WINNT, as this code in sslsnce.c shows: http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/security/nss/lib/ssl/sslsnce.c&rev=1.56&mark=1627,1633-1637,1644,1647-1649#1627 >+ /* are semephores global resources. See SEM_DESTROY(3) man page */ Nit: change "are semephores" to "semephores are". 2. lib/ssl/sslsnce.c >+ sslMutex_Destroy(&pLock->mutex,cache->sharedCache->everInherited); Nit: Add a space after the comma (,).
Attachment #563795 -
Flags: review?(wtc) → review-
Assignee | ||
Comment 13•13 years ago
|
||
Checking in sslmutex.c; /cvsroot/mozilla/security/nss/lib/ssl/sslmutex.c,v <-- sslmutex.c new revision: 1.26; previous revision: 1.25 done Checking in sslmutex.h; /cvsroot/mozilla/security/nss/lib/ssl/sslmutex.h,v <-- sslmutex.h new revision: 1.13; previous revision: 1.12 done Checking in sslsnce.c; /cvsroot/mozilla/security/nss/lib/ssl/sslsnce.c,v <-- sslsnce.c new revision: 1.57; previous revision: 1.56 done Checked in will all wtc's suggested changes.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Comment 14•13 years ago
|
||
I fixed comment problems in Bob's checkin. Checking in sslmutex.c; /cvsroot/mozilla/security/nss/lib/ssl/sslmutex.c,v <-- sslmutex.c new revision: 1.27; previous revision: 1.26 done Checking in sslsnce.c; /cvsroot/mozilla/security/nss/lib/ssl/sslsnce.c,v <-- sslsnce.c new revision: 1.58; previous revision: 1.57 done This patch is a cumulative patch of what's checked in.
Attachment #563795 -
Attachment is obsolete: true
You need to log in
before you can comment on or make changes to this bug.
Description
•