Closed Bug 367376 Opened 19 years ago Closed 18 years ago

selfserv thread stacks allocated by pthread_create reportedly leaked on Linux

Categories

(NSS :: Tools, defect)

3.11.4
x86
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: slavomir.katuscak+mozilla, Assigned: julien.pierre)

Details

(Keywords: memory-leak)

Attachments

(1 file)

Possible leaks when launching threads in selfserv, detected by Valgrind on Linux.
/usr/bin/valgrind --tool=memcheck --leak-check=yes --show-reachable=yes selfserv -D -p 8443 -d /share/builds/mccrel3/security/securitytip/builds/20070117.1/biarritz_Solaris10_amd64/mozilla/tests_results/security/nssamdrhel3.6/server_memleak -n nssamdrhel3.red.iplanet.com -e nssamdrhel3.red.iplanet.com-ec -w nss -c ABCDEF:C001:C002:C003:C004:C005:C006:C007:C008:C009:C00A:C00B:C00C:C00D:C00E:C00F:C010:C011:C012:C013:C014cdefgijklmnvyz -t 5
==4020== 204 bytes in 3 blocks are possibly lost in loss record 5 of 7
==4020==    at 0x442AC82: calloc (vg_replace_malloc.c:279)
==4020==    by 0x410EDEB: _dl_allocate_tls (in /lib/ld-2.3.2.so)
==4020==    by 0x45C826D: allocate_stack (in /lib/tls/libpthread-0.60.so)
==4020==    by 0x45C7EF7: pthread_create@@GLIBC_2.1 (in /lib/tls/libpthread-0.60.so)
==4020==    by 0x45905DE: _PR_CreateThread (ptthread.c:455)
==4020==    by 0x45907CE: PR_CreateThread (ptthread.c:538)
==4020==    by 0x804CBC5: launch_threads (selfserv.c:570)
==4020==    by 0x804F933: main (selfserv.c:2012)
This could be a problem in selfserv, NSPR, or glibc, or Valgrind may simply be confused.
The same problem was detected in strsclnt.
/usr/bin/valgrind --tool=memcheck --leak-check=yes --show-reachable=yes strsclnt -q -p 8443 -d /share/builds/mccrel3/security/securitytip/builds/20070117.1/biarritz_Solaris10_amd64/mozilla/tests_results/security/nssamdrhel3.6/client_memleak -w nss -c 1000 -C A nssamdrhel3.red.iplanet.com
==9755== 204 bytes in 3 blocks are possibly lost in loss record 8 of 9
==9755==    at 0x442AC82: calloc (vg_replace_malloc.c:279)
==9755==    by 0x410EDEB: _dl_allocate_tls (in /lib/ld-2.3.2.so)
==9755==    by 0x45C826D: allocate_stack (in /lib/tls/libpthread-0.60.so)
==9755==    by 0x45C7EF7: pthread_create@@GLIBC_2.1 (in /lib/tls/libpthread-0.60.so)
==9755==    by 0x45905DE: _PR_CreateThread (ptthread.c:455)
==9755==    by 0x45907CE: PR_CreateThread (ptthread.c:538)
==9755==    by 0x804C431: launch_thread (strsclnt.c:472)
==9755==    by 0x804DF17: client_main (strsclnt.c:1273)
==9755==    by 0x804E76C: main (strsclnt.c:1468)
On RHAS4, in an OPT build, the PR_CreateThread (ptthread.c:538) line is missing from the stack output (probably because of build optimization). This patch adds the new stacks from this bug to the list of known leaks to prevent Tinderbox failures.
Trunk: Checking in ignored; /cvsroot/mozilla/security/nss/tests/memleak/ignored,v <-- ignored new revision: 1.5; previous revision: 1.4 done
Branch: Checking in ignored; /cvsroot/mozilla/security/nss/tests/memleak/ignored,v <-- ignored new revision: 1.1.2.5; previous revision: 1.1.2.4 done
Keywords: mlk
Summary: Possible leaks when launching threads in Selfserv. → selfsrev thread stacks allocated by pthread_create reportedly leaked on Linux
I found similar leaks today on the Solaris Tinderbox as well (securityjes5 memleak SunOS/sparc 32 bit makemoney). They occurred there in only one run, on selfserv in FIPS mode with libfreebl_32fpu_3.
Memory Leak (mel):
Found leaked block of size 100 bytes at address 0x136ee8
At time of allocation, the call stack was:
[1] calloc() at 0xdf53ecf4
[2] PR_Calloc() at line 475 in "prmem.c"
[3] _PR_CreateThread() at line 381 in "ptthread.c"
[4] PR_CreateThread() at line 539 in "ptthread.c"
[5] launch_threads() at line 574 in "selfserv.c"
[6] main() at line 2072 in "selfserv.c"
Memory Leak (mel):
Found leaked block of size 40 bytes at address 0x136f68
At time of allocation, the call stack was:
[1] calloc() at 0xdf53ecf4
[2] PR_Calloc() at line 475 in "prmem.c"
[3] _PR_CreateThread() at line 427 in "ptthread.c"
[4] PR_CreateThread() at line 539 in "ptthread.c"
[5] launch_threads() at line 574 in "selfserv.c"
[6] main() at line 2072 in "selfserv.c"
OS: Linux → All
Hardware: PC → All
Summary: selfsrev thread stacks allocated by pthread_create reportedly leaked on Linux → selfserv thread stacks allocated by pthread_create reportedly leaked on Linux
It's possible that the server or client is exiting from a thread other than the primordial thread, before all the other threads have been joined. This could explain the thread stacks being reported as leaked. It's not necessarily a bug.
Assignee: nobody → julien.pierre.boogz
Slavo, do we have a thread-launching test in NSPR? If so, please try to run it under valgrind and find out whether you get the same leaks on Linux. Regarding the Solaris leak reported in comment 5, it's more likely that the server exited without waiting for other threads. I think this might happen if a SIGUSR1 is sent to end the process.
Slavo, also, please open a separate bug for the Solaris leak rather than tracking it with the Linux leak. They are not the same leak, since the stacks are different. The Solaris stack does not include a pthread function.
OS: All → Linux
Hardware: All → PC
Julien,
> Do we have a thread launching test in NSPR ? If so, please try to run it under
> valgrind and find out if you are getting the same leaks on Linux.
Yes, there is a test called threads. I ran it under Valgrind with the same result:
==10309== 136 bytes in 2 blocks are possibly lost in loss record 9 of 10
==10309==    at 0x43CEC82: calloc (vg_replace_malloc.c:279)
==10309==    by 0x410EE1B: _dl_allocate_tls (in /lib/ld-2.3.2.so)
==10309==    by 0x444F34D: allocate_stack (in /lib/tls/libpthread-0.60.so)
==10309==    by 0x444EEC7: pthread_create@@GLIBC_2.1 (in /lib/tls/libpthread-0.60.so)
==10309==    by 0x440ED79: _PR_CreateThread (../../../../pr/src/pthreads/ptthread.c:456)
==10309==    by 0x440EF71: PR_CreateThread (../../../../pr/src/pthreads/ptthread.c:540)
==10309==    by 0x8048CB2: DumbThread (../../../pr/tests/threads.c:72)
==10309==    by 0x440E709: _pt_root (../../../../pr/src/pthreads/ptthread.c:221)
==10309==    by 0x444EDD7: start_thread (in /lib/tls/libpthread-0.60.so)
==10309==    by 0x453BD19: clone (in /lib/tls/libc-2.3.2.so)
> Regarding the Solaris leak reported in comment 5, it's more likely that the
> server exited without waiting for other threads. I think this might happen if a
> SIGUSR1 is sent to end the process.
It's possible.
> Also, please open a different bug for the solaris leak than for the Linux leak.
> They are not the same leak since the stacks are different. The Solaris stack
> does not include a pthread function.
I just checked the patterns in the ignored file, and we don't have a pattern there that matches the Solaris stack. It seems the leak was never reproduced on Solaris (it appeared only in the one run I reported), so I would rather wait to open a new bug until it occurs again. Do you agree?
Slavo, your test on Linux in comment 9 shows that this is a pthread/glibc bug, IMO. It might be worth trying this on a newer version of Linux or looking in the Linux bug databases to see whether it has been fixed. Regardless, this isn't something we can fix in NSPR. I recommend you add this stack to your ignore stack list. I agree about holding off on filing another bug for Solaris.
Julien, The pattern in ignored stack list is: **/_PR_CreateThread/pthread_create@@GLIBC_2.1/** Leak was found in RHEL3, and reproduced also in RHEL4.
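To illustrate how a pattern like this could be applied, here is a minimal Python sketch that flattens a call stack into a slash-separated string, outermost caller first, and matches it with shell-style wildcards. The matching semantics are an assumption (the actual memleak script may interpret `**` differently), and `stack_matches` is a hypothetical helper name, not part of the NSS test suite.

```python
from fnmatch import fnmatchcase

# Pattern quoted in this bug; "**" is assumed to match any run of frames.
IGNORED = "**/_PR_CreateThread/pthread_create@@GLIBC_2.1/**"

def stack_matches(frames, pattern=IGNORED):
    """Return True if the stack (outermost caller first) matches the pattern.

    Assumption: frames are joined with "/" and the pattern is a shell-style
    glob in which "*" may also span "/" separators.
    """
    return fnmatchcase("/".join(frames), pattern)

# Frames from the selfserv Valgrind report, reordered outermost-first.
selfserv_stack = [
    "main", "launch_threads", "PR_CreateThread", "_PR_CreateThread",
    "pthread_create@@GLIBC_2.1", "allocate_stack", "_dl_allocate_tls", "calloc",
]
# The Solaris stack has no pthread frame, so it is not covered by the pattern.
solaris_stack = [
    "main", "launch_threads", "PR_CreateThread", "_PR_CreateThread",
    "PR_Calloc", "calloc",
]

print(stack_matches(selfserv_stack))  # True
print(stack_matches(solaris_stack))   # False
```

Under these assumed semantics the leading `**` also covers the OPT-build stacks in which the PR_CreateThread frame is missing, because the pattern only anchors on _PR_CreateThread and pthread_create.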
cc'ing Bob from RedHat. Perhaps there is a RHEL3/4 patch that fixes this pthread_create leak. If so, we should install the patch on our RHEL systems and remove the entry from the ignored leaks list.
Here's the run on RHEL5:
bobs-laptop(209) valgrind --leak-check=full --show-reachable=yes
valgrind: no program specified
valgrind: Use --help for more information.
bobs-laptop(210) valgrind --leak-check=full --show-reachable=yes threads
==31087== Memcheck, a memory error detector.
==31087== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==31087== Using LibVEX rev 1658, a library for dynamic binary translation.
==31087== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==31087== Using valgrind-3.2.1, a dynamic binary instrumentation framework.
==31087== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==31087== For more details, rerun with: -v
==31087==
PASS
==31087==
==31087== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 19 from 1)
==31087== malloc/free: in use at exit: 2,346 bytes in 27 blocks.
==31087== malloc/free: 18,104 allocs, 18,077 frees, 1,966,578 bytes allocated.
==31087== For counts of detected errors, rerun with: -v
==31087== searching for pointers to 27 not-freed blocks.
==31087== checked 87,076 bytes.
==31087==
==31087== 6 bytes in 1 blocks are still reachable in loss record 1 of 4
==31087==    at 0x40053C0: malloc (vg_replace_malloc.c:149)
==31087==    by 0x40D4D1F: strdup (in /lib/libc-2.5.so)
==31087==    by 0x4021017: _PR_InitLinker (prlink.c:321)
==31087==    by 0x4028FD7: _PR_InitStuff (prinit.c:241)
==31087==    by 0x4029010: _PR_ImplicitInitialization (prinit.c:259)
==31087==    by 0x4029031: PR_Init (prinit.c:310)
==31087==    by 0x8048CAC: main (threads.c:177)
==31087==
==31087==
==31087== 12 bytes in 1 blocks are still reachable in loss record 2 of 4
==31087==    at 0x40053C0: malloc (vg_replace_malloc.c:149)
==31087==    by 0x4022A63: PR_Malloc (prmem.c:467)
==31087==    by 0x4028AD8: PR_ErrorInstallTable (prerrortable.c:204)
==31087==    by 0x402852F: nspr_InitializePRErrorTable (prerr.c:128)
==31087==    by 0x4028FF5: _PR_InitStuff (prinit.c:248)
==31087==    by 0x4029010: _PR_ImplicitInitialization (prinit.c:259)
==31087==    by 0x4029031: PR_Init (prinit.c:310)
==31087==    by 0x8048CAC: main (threads.c:177)
==31087==
==31087==
==31087== 20 bytes in 1 blocks are still reachable in loss record 3 of 4
==31087==    at 0x40046FF: calloc (vg_replace_malloc.c:279)
==31087==    by 0xDF733B: _dlerror_run (in /lib/libdl-2.5.so)
==31087==    by 0xDF6B83: dlopen@@GLIBC_2.1 (in /lib/libdl-2.5.so)
==31087==    by 0x4021E9F: pr_FindSymbolInProg (prmem.c:130)
==31087==    by 0x4021F00: _PR_InitZones (prmem.c:186)
==31087==    by 0x4028E5A: _PR_InitStuff (prinit.c:176)
==31087==    by 0x4029010: _PR_ImplicitInitialization (prinit.c:259)
==31087==    by 0x4029031: PR_Init (prinit.c:310)
==31087==    by 0x8048CAC: main (threads.c:177)
==31087==
==31087==
==31087== 2,308 bytes in 24 blocks are still reachable in loss record 4 of 4
==31087==    at 0x40046FF: calloc (vg_replace_malloc.c:279)
==31087==    by 0x4022AC5: PR_Calloc (prmem.c:474)
==31087==    by 0x4020A39: _PR_InitTPD (prtpd.c:96)
==31087==    by 0x4028F59: _PR_InitStuff (prinit.c:204)
==31087==    by 0x4029010: _PR_ImplicitInitialization (prinit.c:259)
==31087==    by 0x4029031: PR_Init (prinit.c:310)
==31087==    by 0x8048CAC: main (threads.c:177)
==31087==
==31087== LEAK SUMMARY:
==31087==    definitely lost: 0 bytes in 0 blocks.
==31087==      possibly lost: 0 bytes in 0 blocks.
==31087==    still reachable: 2,346 bytes in 27 blocks.
==31087==         suppressed: 0 bytes in 0 blocks.
That Red Hat bug report says the pthread library is the old LinuxThreads (used in RHEL 2.1), not the new NPTL used in RHEL3 and RHEL4. In comment 11 Slavo says he observed the leaks on RHEL 3/4.
Thanks, Bob. I'm not sure if that bug you referenced points to the same leak. I reproduced the same leak on both RHEL3 and RHEL4. I believe our official builds are done on either RHEL 2.1 or 3. From what I know this causes an older compatibility glibc to be used. Perhaps that version is unpatched. I then built NSPR directly on RHEL4 to avoid using any compatibility glibc library, but valgrind still showed the pthread_create leak in the threads test.
The leaks shown in comment 13 are clearly NOT the leaks of thread stacks that are the subject of this bug.
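One way to check this mechanically is to split a Memcheck report into loss records and keep only those whose stack passes through pthread_create. The Python sketch below does that under the assumption that the report is standard Memcheck text output with one frame per line; `thread_stack_records` is a hypothetical helper, and the sample report is abbreviated from the stacks quoted in this bug.

```python
import re

def thread_stack_records(report):
    """Return the loss records whose stack contains a pthread_create frame.

    Each record is a list: the "loss record" header followed by its frames.
    Assumes one frame per line, each prefixed with "==PID==".
    """
    records, current = [], []
    for line in report.splitlines():
        text = re.sub(r"^==\d+==\s?", "", line)  # drop the "==PID== " prefix
        if "loss record" in text:
            if current:
                records.append(current)
            current = [text]
        elif current and text.strip().startswith(("at ", "by ")):
            current.append(text)
        elif current:  # any other line ends the current record
            records.append(current)
            current = []
    if current:
        records.append(current)
    return [r for r in records if any("pthread_create" in f for f in r[1:])]

SAMPLE = """\
==1== 204 bytes in 3 blocks are possibly lost in loss record 5 of 7
==1==    at 0x442AC82: calloc (vg_replace_malloc.c:279)
==1==    by 0x45C7EF7: pthread_create@@GLIBC_2.1 (in /lib/tls/libpthread-0.60.so)
==1==    by 0x45905DE: _PR_CreateThread (ptthread.c:455)
==1==
==1== 6 bytes in 1 blocks are still reachable in loss record 1 of 4
==1==    at 0x40053C0: malloc (vg_replace_malloc.c:149)
==1==    by 0x4029031: PR_Init (prinit.c:310)
"""

for record in thread_stack_records(SAMPLE):
    print(record[0])
```

Run against a report that contains only the PR_Init records, the helper returns an empty list, which is the situation described here.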
Nelson, correct. Bob's run was on RHEL5, which doesn't have the leak. We have been trying to determine when it was fixed and whether patches containing this fix are available for RHEL3 or RHEL4. The bug Bob pointed to was fixed in glibc-2.3.2-101.4. Our RHEL4 box has libc-2.3.4-2.25, and our RHEL3 box has glibc-2.3.2-95.37. So if Bob found the right bug, we would expect the RHEL4 box to have the fix, but not the RHEL3 box. But neither has the fix. My conclusion is that this pthread_create leak was fixed only in RHEL5, as part of another bug id. I don't think it's worth wasting any more time on this. We can just leave this leak stack in the ignore file. If everybody agrees, I will mark the bug as INVALID. I think that's the proper resolution, since it's an OS problem and not an NSPR/NSS bug.
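The version reasoning in this comment can be sketched in Python as a plain numeric comparison of the package version-release strings. This is a simplification and an assumption: it is not the real RPM version-comparison algorithm, it only handles purely numeric strings like the ones quoted here, and comparing packages across RHEL releases by number alone is only a rough guide. `nvr_key` and `expected_to_have_fix` are hypothetical helper names.

```python
import re

FIXED_IN = "glibc-2.3.2-101.4"  # version in which the referenced bug was fixed

def nvr_key(package):
    """Turn e.g. 'glibc-2.3.2-95.37' into the tuple (2, 3, 2, 95, 37)."""
    version_release = package.split("-", 1)[1]  # drop the package name
    return tuple(int(n) for n in re.split(r"[.-]", version_release))

def expected_to_have_fix(installed, fixed_in=FIXED_IN):
    """Return True if the installed version is at least the fixed version."""
    return nvr_key(installed) >= nvr_key(fixed_in)

for box, package in [("RHEL4", "libc-2.3.4-2.25"), ("RHEL3", "glibc-2.3.2-95.37")]:
    print(box, package, expected_to_have_fix(package))
```

By this comparison the RHEL4 package would be expected to contain the fix and the RHEL3 package would not, which matches the expectation stated in the comment; since both boxes still show the leak, the referenced bug is probably not the one that fixed it.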
Report this leak to https://bugzilla.redhat.com/.
I opened a bug on bugzilla.redhat.com at https://bugzilla.redhat.com/show_bug.cgi?id=432721 . I am marking this bug INVALID since this is a RedHat Linux leak, and not an NSS bug. Slavo has already marked the stack in our ignore stack leak file. Hopefully RedHat will come up with a patch we can install on our systems to remove this OS leak.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → INVALID
Adding one more stack related to this bug, found on one Tinderbox machine: **/_PR_CreateThread/PR_Calloc/** Checking in ignored; /cvsroot/mozilla/security/nss/tests/memleak/ignored,v <-- ignored new revision: 1.79; previous revision: 1.78 done