Closed Bug 1414318 Opened 7 years ago Closed 7 years ago

Permanent Gecko 58 "valgrind-test | 37 bytes in 1 blocks are definitely lost at malloc / moz_xmalloc / ToNewUTF8String / mozilla::nsTerminator::StartWriter" since the uplift to Beta

Categories

(Toolkit :: Crash Reporting, defect, P1)

defect

Tracking

()

VERIFIED FIXED
mozilla59
Tracking Status
firefox-esr52 --- unaffected
firefox56 --- unaffected
firefox57 --- unaffected
firefox58 + verified
firefox59 --- verified

People

(Reporter: intermittent-bug-filer, Assigned: erahm)

References

Details

(Keywords: intermittent-failure, memory-leak)

Attachments

(1 file, 3 obsolete files)

Filed by: rvandermeulen [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=141966445&repo=mozilla-beta

https://queue.taskcluster.net/v1/task/cjw2S5FmTjaikq0KgJ2GNA/runs/0/artifacts/public/logs/live_backing.log

Annoyingly, these don't happen on Try for reasons I won't claim to understand.
Though maybe this is fallout from bug 870698 given the ToNewUTF8String leak? Also CCing some other people who touched nsTerminator.cpp during the Fx58 nightly cycle in case they see something obvious. Please prioritize this, however, since it's a permanent failure on Beta at the moment.

And again, I'm really sorry that this doesn't reproduce on Try. I really don't know why.
Severity: normal → critical
Flags: needinfo?(cpeterson)
Keywords: mlk
Priority: P5 → P1
The string there is intentionally leaked. Maybe the Valgrind whitelist needs to be fixed?
It looks like the Valgrind suppression file is here (and other similar files):
  build/valgrind/cross-architecture.sup
However, I don't see anything related to this intentional leak.
Ok, so the leak is actually this string, I think:
  ToNewUTF8String(path));
Which ends up being owned by a thread that is intentionally leaked?
There are two "#if !defined(DEBUG)" in that terminator file that should really be "#if !defined(NS_FREE_PERMANENT_DATA)" but that doesn't explain why this is only happening on beta.
Eric, do you think some string changes in 58 could have caused this? I can't figure out why we'd start leaking this.
Flags: needinfo?(erahm)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #1)
> Though maybe this is fallout from bug 870698 given the ToNewUTF8String leak?
> Also CCing some other people who touched nsTerminator.cpp during the Fx58
> nightly cycle in case they see something obvious. Please prioritize this,
> however, since it's a permanent failure on Beta at the moment.

Bug 870698 should not have caused any memory leaks. It just changed some functions to avoid calling strlen() when copying strings.

From Andrew's comments, it sounds like the leaked string is owned by a purposely-leaked RunWriter thread with the following comment:

  MOZ_LSAN_INTENTIONALLY_LEAK_OBJECT(arg);
  // Shutdown will generally complete before we have a chance to
  // deallocate. This is not a leak.

https://searchfox.org/mozilla-central/source/toolkit/components/terminator/nsTerminator.cpp#225-227
Flags: needinfo?(cpeterson)
(In reply to Andrew McCreight [:mccr8] from comment #6)
> Eric, do you think some string changes in 58 could have caused this? I can't
> figure out why we'd start leaking this.

My only guess would be a signature change for the string classes, but as you noted I don't see any ToNewUTF8String suppression. njn has been goofing with atoms but I don't think that relates to this code. My only guess is that the terminator stuff somehow doesn't get hit in Nightly during testing (although locally I can confirm it does).

Regardless we could do your fix (NS_FREE_PERMANENT_DATA) or just free the arg in RunWriter after copying it to the nsCString [1].

[1] http://searchfox.org/mozilla-central/rev/af86a58b157fbed26b0e86fcd81f1b421e80e60a/toolkit/components/terminator/nsTerminator.cpp#231
Flags: needinfo?(erahm)
Assignee: nobody → erahm
I think this is good enough.
Attachment #8925151 - Flags: feedback?(continuation)
Comment on attachment 8925151 [details] [diff] [review]
Turn of |StartWriter| in leak checking builds

Ryan can you test this on your beta branch?
Attachment #8925151 - Flags: feedback?(ryanvm)
And a try build testing V, asan, and a random mochitest:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=64a891d0e89029166d5bc0f29f48302cb7db85b1
Comment on attachment 8925151 [details] [diff] [review]
Turn of |StartWriter| in leak checking builds

Review of attachment 8925151 [details] [diff] [review]:
-----------------------------------------------------------------

nit: "Turn of" should be "Turn off". Otherwise, looks okay.
Attachment #8925151 - Flags: feedback?(continuation) → feedback+
Updated to actually build, can you test this Ryan?
Attachment #8925157 - Flags: feedback?(ryanvm)
Attachment #8925151 - Attachment is obsolete: true
Attachment #8925151 - Flags: feedback?(ryanvm)
Comment on attachment 8925157 [details] [diff] [review]
Turn off |StartWriter| in leak checking builds

I've never been able to reproduce these on Try, so I don't think we can do anything other than land it and see what happens.
Attachment #8925157 - Flags: feedback?(ryanvm)
Comment on attachment 8925157 [details] [diff] [review]
Turn off |StartWriter| in leak checking builds

Andrew can you give this an official r+? I'll request beta approval once it's on central.
Attachment #8925157 - Flags: review?(continuation)
Attachment #8925157 - Flags: review?(continuation) → review+
We're actually in that fun part of the cycle where both m-c and m-b are tracking Fx58 (with regular syncing betweeen the two). So there's nothing you more you need to do here :)
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/83f0e033450d
Turn off |StartWriter| in leak checking builds. r=mccr8
https://hg.mozilla.org/mozilla-central/rev/83f0e033450d
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Backed out for failing xpcshell's toolkit/components/terminator/tests/xpcshell/test_terminator_record.js on Linux x64 asan:

https://hg.mozilla.org/mozilla-central/rev/e437203962cc7d97b1646d09ed3061272a1df443

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=83f0e033450d1dce9a9ce96b37b465ae50dacdb6&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=usercancel&filter-resultStatus=runnable&filter-resultStatus=retry
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=142156398&repo=mozilla-inbound

[task 2017-11-04T14:53:20.622Z] 14:53:20     INFO -  TEST-START | toolkit/components/terminator/tests/xpcshell/test_terminator_record.js
[task 2017-11-04T14:54:25.560Z] 14:54:25  WARNING -  TEST-UNEXPECTED-FAIL | toolkit/components/terminator/tests/xpcshell/test_terminator_record.js | xpcshell return code: 1
[task 2017-11-04T14:54:25.565Z] 14:54:25     INFO -  TEST-INFO took 64938ms
[task 2017-11-04T14:54:25.567Z] 14:54:25     INFO -  >>>>>>>
[task 2017-11-04T14:54:25.569Z] 14:54:25     INFO -  (xpcshell/head.js) | test MAIN run_test pending (1)
[task 2017-11-04T14:54:25.571Z] 14:54:25     INFO -  (xpcshell/head.js) | test run_next_test 0 pending (2)
[task 2017-11-04T14:54:25.573Z] 14:54:25     INFO -  (xpcshell/head.js) | test MAIN run_test finished (2)
[task 2017-11-04T14:54:25.575Z] 14:54:25     INFO -  running event loop
[task 2017-11-04T14:54:25.577Z] 14:54:25     INFO -  "CONSOLE_MESSAGE: (info) No chrome package registered for chrome://branding/locale/brand.properties"
[task 2017-11-04T14:54:25.578Z] 14:54:25     INFO -  toolkit/components/terminator/tests/xpcshell/test_terminator_record.js | Starting init
[task 2017-11-04T14:54:25.580Z] 14:54:25     INFO -  (xpcshell/head.js) | test init pending (2)
[task 2017-11-04T14:54:25.581Z] 14:54:25     INFO -  "Initializing the Terminator"
[task 2017-11-04T14:54:25.583Z] 14:54:25     INFO -  (xpcshell/head.js) | test run_next_test 0 finished (2)
[task 2017-11-04T14:54:25.584Z] 14:54:25     INFO -  (xpcshell/head.js) | test run_next_test 1 pending (2)
[task 2017-11-04T14:54:25.585Z] 14:54:25     INFO -  (xpcshell/head.js) | test init finished (2)
[task 2017-11-04T14:54:25.586Z] 14:54:25     INFO -  toolkit/components/terminator/tests/xpcshell/test_terminator_record.js | Starting test_record
[task 2017-11-04T14:54:25.587Z] 14:54:25     INFO -  (xpcshell/head.js) | test test_record pending (2)
[task 2017-11-04T14:54:25.589Z] 14:54:25     INFO -  "Starting shutdown"
[task 2017-11-04T14:54:25.590Z] 14:54:25     INFO -  "Moving to next phase"
[task 2017-11-04T14:54:25.591Z] 14:54:25     INFO -  "Waiting for file creation: /tmp/xpc-profile-cnLMgl/ShutdownDuration.json"
[task 2017-11-04T14:54:25.593Z] 14:54:25     INFO -  (xpcshell/head.js) | test run_next_test 1 finished (2)
[task 2017-11-04T14:54:25.594Z] 14:54:25     INFO -  "CONSOLE_MESSAGE: (info) No chrome package registered for chrome://branding/locale/brand.properties"
[task 2017-11-04T14:54:25.595Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.597Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.598Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.599Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.601Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.602Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.604Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.605Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.606Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.607Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.609Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.610Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.611Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.612Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.612Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.613Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.614Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.615Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.615Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.615Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.616Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.616Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.617Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.617Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.618Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.618Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.619Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.620Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.622Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.622Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.623Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.623Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.624Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.624Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.625Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.625Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.626Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.627Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.628Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.628Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.630Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.631Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.631Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.632Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.633Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.633Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.634Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.635Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.635Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.635Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.636Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.637Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.638Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.638Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.638Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.639Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.639Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.640Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.641Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.642Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.642Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.642Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.644Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.644Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.645Z] 14:54:25     INFO -  PID 12804 | ASAN:DEADLYSIGNAL
[task 2017-11-04T14:54:25.645Z] 14:54:25     INFO -  PID 12804 | =================================================================
[task 2017-11-04T14:54:25.645Z] 14:54:25    ERROR -  PID 12804 | ==12804==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f71e2b914de bp 0x7f717ffee730 sp 0x7f717ffee720 T17)
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  PID 12804 | ==12804==The signal is caused by a WRITE memory access.
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  PID 12804 | ==12804==Hint: address points to the zero page.
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  PID 12804 |     #0 0x7f71e2b914dd in mozilla::(anonymous namespace)::RunWatchdog(void*) /builds/worker/workspace/build/src/toolkit/components/terminator/nsTerminator.cpp:163:5
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  PID 12804 |     #1 0x7f71e89184d3 in _pt_root /builds/worker/workspace/build/src/nsprpub/pr/src/pthreads/ptthread.c:216:5
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  PID 12804 |     #2 0x7f71e8b486b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9)
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  PID 12804 |     #3 0x7f71d2d963dc in clone /build/glibc-bfm8X4/glibc-2.23/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Status: RESOLVED → REOPENED
Flags: needinfo?(erahm)
Resolution: FIXED → ---
Target Milestone: mozilla58 → ---
Updated to disable the nsTerminator test on asan instead of debug
Attachment #8925693 - Flags: review?(continuation)
Attachment #8925693 - Flags: review?(continuation) → review+
Attachment #8925157 - Attachment is obsolete: true
Flags: needinfo?(erahm)
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b1ce6a65a214
Turn off |StartWriter| in leak checking builds. r=mccr8
Keywords: checkin-needed
Any news here, Eric? :)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #26)
> Any news here, Eric? :)

I'm not sure what to do at this point, I'd be okay with just disabling it in debug again...I'll see if I can repro locally and figure out what's going on.
Attachment #8930613 - Flags: review?(continuation)
Attachment #8925693 - Attachment is obsolete: true
Attachment #8930613 - Flags: review?(continuation) → review+
Alright so the deal is NS_FREE_PERMANENT_DATA is defined in debug builds if NS_BUILD_REFCNT_LOGGING is defined [1], and NS_BUILD_REFCNT_LOGGING is defined if DEBUG is defined [2].

Technically I think things would work okay if we changed the ifdef to:

> #ifdef defined(MOZ_VALGRIND) || defined(MOZ_ASAN)

but we've been chasing this down a while so the easiest option is to just disable the test in debug and not worry about future issues coming up that would trigger issues with refcount logging.

[1] https://searchfox.org/mozilla-central/rev/919dce54f43356c22d6ff6b81c07ef412b1bf933/xpcom/base/nscore.h#169-171
[2] https://searchfox.org/mozilla-central/rev/919dce54f43356c22d6ff6b81c07ef412b1bf933/xpcom/base/nscore.h#152-157
Flags: needinfo?(erahm)
Pushed by erahm@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/90a3dbed251b
Turn off |StartWriter| in leak checking builds. r=mccr8
https://hg.mozilla.org/mozilla-central/rev/90a3dbed251b
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
And verified green on Beta :)
Status: RESOLVED → VERIFIED
(In reply to (PTO - Back 27-Nov) Ryan VanderMeulen [:RyanVM] from comment #34)
> And verified green on Beta :)

\o/
See Also: → 1255484
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: