Open Bug 1187421 Opened 10 years ago Updated 10 months ago

Leak in error_get_my_stack

Tracking

(Not tracked)

Status:

NEW

People

(Reporter: mccr8, Unassigned)

References

(Blocks 2 open bugs)

Details

(Keywords: memory-leak)

Attachments

(2 files)

fix-nspr-leaks.patch 9 years ago Franziskus Kiefer [:franziskus] 2.32 KB, patch	KaiE : review-	Details \| Diff \| Splinter Review
fix-error-leak-nss.patch 9 years ago Franziskus Kiefer [:franziskus] 1.15 KB, patch	ttaubert : review+ franziskus : checked-in+	Details \| Diff \| Splinter Review

Andrew McCreight [:mccr8]

Reporter

Description

•

10 years ago

LeakSanitizer detected this small leak in a Firefox content process: Direct leak of 72 byte(s) in 1 object(s). #0 0x4750f1 in calloc /builds/slave/moz-toolchain/src/llvm/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:90 #1 0x7ff63c019890 in error_get_my_stack src/security/nss/lib/base/error.c:109 #2 0x7ff63c019c38 in nss_ClearErrorStack src/security/nss/lib/base/error.c:245 #3 0x7ff63c0182f8 in NSSArena_Create src/security/nss/lib/base/arena.c:352 #4 0x7ff63bff452b in NSSTrustDomain_Create src/security/nss/lib/pki/trustdomain.c:34 error_get_my_stack allocates a new stack and then calls PR_SetThreadPrivate() on it, so I'm guessing the thread private data isn't being freed somehow. It looks like that's supposed to happen in nss_DestroyErrorStack(). This happens with dom/media/tests/mochitest/ipc/test_ipc.html, and in e10s Mochitest-1. I haven't narrowed it down any further.

Ryan VanderMeulen [:RyanVM]

Comment 1

•

9 years ago

Still reproduces in dom/crypto: https://treeherder.mozilla.org/logviewer.html#?job_id=29489030&repo=try

Ryan VanderMeulen [:RyanVM]

Comment 2

•

9 years ago

This is still pretty widespread across other mochitest suites as well.

Ryan VanderMeulen [:RyanVM]

Comment 3

•

9 years ago

Tim, is there any chance you'd be able to look at this? It's the only NSS LSAN suppression we've got at this point.

Flags: needinfo?(ttaubert)

Tim Taubert [:ttaubert] (inactive)

Comment 4

•

9 years ago

I've seen this before. The error stack is allocated whenever we hit errors. Looks like we're not calling PR_Cleanup() or whatever again the function was that cleans this up.

Flags: needinfo?(ttaubert)

Tim Taubert [:ttaubert] (inactive)

Comment 5

•

9 years ago

I'm not sure where the right place to call PR_Cleanup() would be? As long as Firefox is still using NSPR too we can't call it right after NSS_Shutdown().

Franziskus Kiefer [:franziskus]

Updated

•

9 years ago

Assignee: nobody → nobody

Component: Libraries → NSPR

Product: NSS → NSPR

Version: 3.15 → other

Franziskus Kiefer [:franziskus]

Comment 6

•

9 years ago

Attached patch fix-nspr-leaks.patch — Details — Splinter Review

I ran into this bug as well. It looks to me like there are multiple leaks here. This patch fixes them. (And hopefully doesn't break anything.)

Attachment #8838442 - Flags: review?(ted)

Franziskus Kiefer [:franziskus]

Comment 7

•

9 years ago

Attached patch fix-error-leak-nss.patch — Details — Splinter Review

This is the NSS part of the leak fix.

Assignee: nobody → franziskuskiefer

Attachment #8838444 - Flags: review?(ttaubert)

Tim Taubert [:ttaubert] (inactive)

Comment 8

•

9 years ago

Comment on attachment 8838444 [details] [diff] [review] fix-error-leak-nss.patch Review of attachment 8838444 [details] [diff] [review]: ----------------------------------------------------------------- ::: lib/base/error.c @@ +265,5 @@ > { > if (INVALID_TPD_INDEX != error_stack_index) { > PR_SetThreadPrivate(error_stack_index, NULL); > + error_stack_index = INVALID_TPD_INDEX; > + error_call_once = error_call_again; /* allow to init again */ I've seen memset(0) used to achieve the same, we do that in libssl at least. So maybe: PORT_Memset(&error_call_once, 0, sizeof(error_call_once));

Attachment #8838444 - Flags: review?(ttaubert) → review+

Andrew McCreight [:mccr8]

Reporter

Comment 9

•

9 years ago

Please remove the leak suppression for this bug from build/sanitizers/lsan_suppressions.txt, assuming that an ASan run is green with your patches.

Franziskus Kiefer [:franziskus]

Comment 10

•

8 years ago

Comment on attachment 8838444 [details] [diff] [review] fix-error-leak-nss.patch landed as https://hg.mozilla.org/projects/nss/rev/430f78b2fa77a844e446593b8d9ba747301da770

Attachment #8838444 - Flags: checked-in+

Franziskus Kiefer [:franziskus]

Comment 11

•

8 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=d177f0889c53

Franziskus Kiefer [:franziskus]

Comment 12

•

8 years ago

(In reply to Franziskus Kiefer [:fkiefer or :franziskus] from comment #11) > https://treeherder.mozilla.org/#/jobs?repo=try&revision=d177f0889c53 Looks like this doesn't fix the Firefox problem :( Something else must be broken (probably not on the NSS side as I can't reproduce the leak with only NSS anymore).

Andrew McCreight [:mccr8]

Reporter

Comment 13

•

8 years ago

Thanks for checking! Maybe the whitelist could be changed to NSSTrustDomain_Create, which sounds less generic. It looks like the problem is much less common than described in comment 0, so I think a bunch of stuff was fixed.

Andrew McCreight [:mccr8]

Reporter

Updated

•

8 years ago

Comment 14

•

8 years ago

I filed bug 1341354 for the residual leaks.

Franziskus Kiefer [:franziskus]

Updated

•

8 years ago

Attachment #8838442 - Flags: review?(ted) → review?(kaie)

Kai Engert [:KaiE:]

Comment 15

•

8 years ago

Comment on attachment 8838442 [details] [diff] [review] fix-nspr-leaks.patch I suspect this might have been knowingly allowed to leak in the past. Looking at _PR_DestroyThreadPrivate, it attempts to loop through TPD multiple times, and assumes that some items could have unpredictable side effects, and maybe even the TPD items could be set again, after the loop. If there's anything pending that couldn't get destroyed, then destroying _pr_tpd_destructors could potentially result in a new shutdown crash, right? If you want to try getting this cleaned up, please set the variables to NULL after freeing them (which the existing code already does for most other variables).

Kai Engert [:KaiE:]

Comment 16

•

8 years ago

Additional comment: You are using a mix of free and PR_Free. I checked if it matches the allocation. $ grep -e "->name *=" pr/src/linking/prlink.c This shows that it always gets allocated using strdup. Please use "free" to destroy pr_exe_loadmap->name, not PR_Free. pr_exe_loadmap is allocated using PR_NEWZAP, which uses PR_Calloc. Using PR_Free to destroy it is correct. _pr_tpd_destructors is allocated using PR_CALLOC. Please use PR_Free to destroy it, not "free".

Kai Engert [:KaiE:]

Updated

•

8 years ago

Attachment #8838442 - Flags: review?(kaie) → review-

Kai Engert [:KaiE:]

Comment 17

•

8 years ago

I researched when this code was added, it was more than 15 years ago: https://hg.mozilla.org/projects/nspr/rev/9640a6cb2c22 Function _PR_ShutdownLinker is still in that original state.

Franziskus Kiefer [:franziskus]

Updated

•

8 years ago

Updated

•

8 years ago

Assignee: franziskuskiefer → nobody

BMO Automation

Updated

•

3 years ago

Severity: normal → S3

Lars Eggert [:lars]

Updated

•

10 months ago

Blocks: nss-fuzzing-bugs

You need to log in before you can comment on or make changes to this bug.