Permanent Gecko 58 "valgrind-test | 37 bytes in 1 blocks are definitely lost at malloc / moz_xmalloc / ToNewUTF8String / mozilla::nsTerminator::StartWriter" since the uplift to Beta

REOPENED
Assigned to

Status

()

Toolkit
Crash Reporting
P1
critical
REOPENED
16 days ago
2 days ago

People

(Reporter: Treeherder Bug Filer, Assigned: erahm, NeedInfo)

Tracking

({intermittent-failure, mlk})

unspecified
intermittent-failure, mlk
Points:
---

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox56 unaffected, firefox57 unaffected, firefox58+ affected)

Details

Attachments

(1 attachment, 2 obsolete attachments)

(Reporter)

Description

16 days ago
treeherder
Filed by: rvandermeulen [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=141966445&repo=mozilla-beta

https://queue.taskcluster.net/v1/task/cjw2S5FmTjaikq0KgJ2GNA/runs/0/artifacts/public/logs/live_backing.log

Annoyingly, these don't happen on Try for reasons I won't claim to understand.
Though maybe this is fallout from bug 870698 given the ToNewUTF8String leak? Also CCing some other people who touched nsTerminator.cpp during the Fx58 nightly cycle in case they see something obvious. Please prioritize this, however, since it's a permanent failure on Beta at the moment.

And again, I'm really sorry that this doesn't reproduce on Try. I really don't know why.
Severity: normal → critical
status-firefox56: --- → unaffected
status-firefox57: --- → unaffected
status-firefox58: --- → affected
status-firefox-esr52: --- → unaffected
tracking-firefox58: --- → ?
Flags: needinfo?(cpeterson)
Keywords: mlk
Priority: P5 → P1
The string there is intentionally leaked. Maybe the Valgrind whitelist needs to be fixed?
It looks like the Valgrind suppression file is here (and other similar files):
  build/valgrind/cross-architecture.sup
However, I don't see anything related to this intentional leak.
Ok, so the leak is actually this string, I think:
  ToNewUTF8String(path));
Which ends up being owned by a thread that is intentionally leaked?
There are two "#if !defined(DEBUG)" in that terminator file that should really be "#if !defined(NS_FREE_PERMANENT_DATA)" but that doesn't explain why this is only happening on beta.
Eric, do you think some string changes in 58 could have caused this? I can't figure out why we'd start leaking this.
Flags: needinfo?(erahm)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #1)
> Though maybe this is fallout from bug 870698 given the ToNewUTF8String leak?
> Also CCing some other people who touched nsTerminator.cpp during the Fx58
> nightly cycle in case they see something obvious. Please prioritize this,
> however, since it's a permanent failure on Beta at the moment.

Bug 870698 should not have caused any memory leaks. It just changed some functions to avoid calling strlen() when copying strings.

From Andrew's comments, it sounds like the leaked string is owned by a purposely-leaked RunWriter thread with the following comment:

  MOZ_LSAN_INTENTIONALLY_LEAK_OBJECT(arg);
  // Shutdown will generally complete before we have a chance to
  // deallocate. This is not a leak.

https://searchfox.org/mozilla-central/source/toolkit/components/terminator/nsTerminator.cpp#225-227
Flags: needinfo?(cpeterson)
(In reply to Andrew McCreight [:mccr8] from comment #6)
> Eric, do you think some string changes in 58 could have caused this? I can't
> figure out why we'd start leaking this.

My only guess would be a signature change for the string classes, but as you noted I don't see any ToNewUTF8String suppression. njn has been goofing with atoms but I don't think that relates to this code. My only guess is that the terminator stuff somehow doesn't get hit in Nightly during testing (although locally I can confirm it does).

Regardless we could do your fix (NS_FREE_PERMANENT_DATA) or just free the arg in RunWriter after copying it to the nsCString [1].

[1] http://searchfox.org/mozilla-central/rev/af86a58b157fbed26b0e86fcd81f1b421e80e60a/toolkit/components/terminator/nsTerminator.cpp#231
Flags: needinfo?(erahm)
Assignee: nobody → erahm
Created attachment 8925151 [details] [diff] [review]
Turn of |StartWriter| in leak checking builds

I think this is good enough.
Attachment #8925151 - Flags: feedback?(continuation)
Comment on attachment 8925151 [details] [diff] [review]
Turn of |StartWriter| in leak checking builds

Ryan can you test this on your beta branch?
Attachment #8925151 - Flags: feedback?(ryanvm)
And a try build testing V, asan, and a random mochitest:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=64a891d0e89029166d5bc0f29f48302cb7db85b1
Comment on attachment 8925151 [details] [diff] [review]
Turn of |StartWriter| in leak checking builds

Review of attachment 8925151 [details] [diff] [review]:
-----------------------------------------------------------------

nit: "Turn of" should be "Turn off". Otherwise, looks okay.
Attachment #8925151 - Flags: feedback?(continuation) → feedback+
Created attachment 8925157 [details] [diff] [review]
Turn off |StartWriter| in leak checking builds

Updated to actually build, can you test this Ryan?
Attachment #8925157 - Flags: feedback?(ryanvm)
Attachment #8925151 - Attachment is obsolete: true
Attachment #8925151 - Flags: feedback?(ryanvm)
Comment on attachment 8925157 [details] [diff] [review]
Turn off |StartWriter| in leak checking builds

I've never been able to reproduce these on Try, so I don't think we can do anything other than land it and see what happens.
Attachment #8925157 - Flags: feedback?(ryanvm)
Comment on attachment 8925157 [details] [diff] [review]
Turn off |StartWriter| in leak checking builds

Andrew can you give this an official r+? I'll request beta approval once it's on central.
Attachment #8925157 - Flags: review?(continuation)
Attachment #8925157 - Flags: review?(continuation) → review+
We're actually in that fun part of the cycle where both m-c and m-b are tracking Fx58 (with regular syncing betweeen the two). So there's nothing you more you need to do here :)

Comment 17

16 days ago
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/83f0e033450d
Turn off |StartWriter| in leak checking builds. r=mccr8
https://hg.mozilla.org/mozilla-central/rev/83f0e033450d
Status: NEW → RESOLVED
Last Resolved: 16 days ago
status-firefox58: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Backed out for failing xpcshell's toolkit/components/terminator/tests/xpcshell/test_terminator_record.js on Linux x64 asan:

https://hg.mozilla.org/mozilla-central/rev/e437203962cc7d97b1646d09ed3061272a1df443

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=83f0e033450d1dce9a9ce96b37b465ae50dacdb6&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=usercancel&filter-resultStatus=runnable&filter-resultStatus=retry
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=142156398&repo=mozilla-inbound

[task 2017-11-04T14:53:20.622Z] 14:53:20     INFO -  TEST-START | toolkit/components/terminator/tests/xpcshell/test_terminator_record.js
[task 2017-11-04T14:54:25.560Z] 14:54:25  WARNING -  TEST-UNEXPECTED-FAIL | toolkit/components/terminator/tests/xpcshell/test_terminator_record.js | xpcshell return code: 1
[task 2017-11-04T14:54:25.565Z] 14:54:25     INFO -  TEST-INFO took 64938ms
[task 2017-11-04T14:54:25.567Z] 14:54:25     INFO -  >>>>>>>
[task 2017-11-04T14:54:25.569Z] 14:54:25     INFO -  (xpcshell/head.js) | test MAIN run_test pending (1)
[task 2017-11-04T14:54:25.571Z] 14:54:25     INFO -  (xpcshell/head.js) | test run_next_test 0 pending (2)
[task 2017-11-04T14:54:25.573Z] 14:54:25     INFO -  (xpcshell/head.js) | test MAIN run_test finished (2)
[task 2017-11-04T14:54:25.575Z] 14:54:25     INFO -  running event loop
[task 2017-11-04T14:54:25.577Z] 14:54:25     INFO -  "CONSOLE_MESSAGE: (info) No chrome package registered for chrome://branding/locale/brand.properties"
[task 2017-11-04T14:54:25.578Z] 14:54:25     INFO -  toolkit/components/terminator/tests/xpcshell/test_terminator_record.js | Starting init
[task 2017-11-04T14:54:25.580Z] 14:54:25     INFO -  (xpcshell/head.js) | test init pending (2)
[task 2017-11-04T14:54:25.581Z] 14:54:25     INFO -  "Initializing the Terminator"
[task 2017-11-04T14:54:25.583Z] 14:54:25     INFO -  (xpcshell/head.js) | test run_next_test 0 finished (2)
[task 2017-11-04T14:54:25.584Z] 14:54:25     INFO -  (xpcshell/head.js) | test run_next_test 1 pending (2)
[task 2017-11-04T14:54:25.585Z] 14:54:25     INFO -  (xpcshell/head.js) | test init finished (2)
[task 2017-11-04T14:54:25.586Z] 14:54:25     INFO -  toolkit/components/terminator/tests/xpcshell/test_terminator_record.js | Starting test_record
[task 2017-11-04T14:54:25.587Z] 14:54:25     INFO -  (xpcshell/head.js) | test test_record pending (2)
[task 2017-11-04T14:54:25.589Z] 14:54:25     INFO -  "Starting shutdown"
[task 2017-11-04T14:54:25.590Z] 14:54:25     INFO -  "Moving to next phase"
[task 2017-11-04T14:54:25.591Z] 14:54:25     INFO -  "Waiting for file creation: /tmp/xpc-profile-cnLMgl/ShutdownDuration.json"
[task 2017-11-04T14:54:25.593Z] 14:54:25     INFO -  (xpcshell/head.js) | test run_next_test 1 finished (2)
[task 2017-11-04T14:54:25.594Z] 14:54:25     INFO -  "CONSOLE_MESSAGE: (info) No chrome package registered for chrome://branding/locale/brand.properties"
[task 2017-11-04T14:54:25.595Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.597Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.598Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.599Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.601Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.602Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.604Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.605Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.606Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.607Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.609Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.610Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.611Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.612Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.612Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.613Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.614Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.615Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.615Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.615Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.616Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.616Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.617Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.617Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.618Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.618Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.619Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.620Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.622Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.622Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.623Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.623Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.624Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.624Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.625Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.625Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.626Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.627Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.628Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.628Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.630Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.631Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.631Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.632Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.633Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.633Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.634Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.635Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.635Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.635Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.636Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.637Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.638Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.638Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.638Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.639Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.639Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.640Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.641Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.642Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.642Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.642Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.644Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.644Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.645Z] 14:54:25     INFO -  PID 12804 | ASAN:DEADLYSIGNAL
[task 2017-11-04T14:54:25.645Z] 14:54:25     INFO -  PID 12804 | =================================================================
[task 2017-11-04T14:54:25.645Z] 14:54:25    ERROR -  PID 12804 | ==12804==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f71e2b914de bp 0x7f717ffee730 sp 0x7f717ffee720 T17)
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  PID 12804 | ==12804==The signal is caused by a WRITE memory access.
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  PID 12804 | ==12804==Hint: address points to the zero page.
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  PID 12804 |     #0 0x7f71e2b914dd in mozilla::(anonymous namespace)::RunWatchdog(void*) /builds/worker/workspace/build/src/toolkit/components/terminator/nsTerminator.cpp:163:5
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  PID 12804 |     #1 0x7f71e89184d3 in _pt_root /builds/worker/workspace/build/src/nsprpub/pr/src/pthreads/ptthread.c:216:5
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  PID 12804 |     #2 0x7f71e8b486b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9)
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  PID 12804 |     #3 0x7f71d2d963dc in clone /build/glibc-bfm8X4/glibc-2.23/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Status: RESOLVED → REOPENED
status-firefox58: fixed → affected
Flags: needinfo?(erahm)
Resolution: FIXED → ---
Target Milestone: mozilla58 → ---

Comment 20

14 days ago
9 failures in 857 pushes (0.011 failures/push) were associated with this bug in the last 7 days.    

Repository breakdown:
* mozilla-beta: 9

Platform breakdown:
* linux64: 9

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1414318&startday=2017-10-30&endday=2017-11-05&tree=all
tracking-firefox58: ? → +
Created attachment 8925693 [details] [diff] [review]
Turn off |StartWriter| in leak checking builds

Updated to disable the nsTerminator test on asan instead of debug
Attachment #8925693 - Flags: review?(continuation)
Attachment #8925693 - Flags: review?(continuation) → review+
Attachment #8925157 - Attachment is obsolete: true
Flags: needinfo?(erahm)
Keywords: checkin-needed

Comment 22

13 days ago
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b1ce6a65a214
Turn off |StartWriter| in leak checking builds. r=mccr8
Keywords: checkin-needed
Backed out changeset b1ce6a65a214 (bug 1414318) for failing xpcshell on toolkit/components/terminator/tests/xpcshell/test_terminator_record.js r=backout on a CLOSED TREE

https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=b1ce6a65a2141c6ad7f8b3c83333878d976f75c2

https://treeherder.mozilla.org/logviewer.html#?job_id=142575760&repo=mozilla-inbound&lineNumber=2490

https://hg.mozilla.org/integration/mozilla-inbound/rev/0418d4d4a91ef989fb18d56a1b30e2a6e413cec5
Flags: needinfo?(erahm)
5 failures in 849 pushes (0.006 failures/push) were associated with this bug in the last 7 days.    

Repository breakdown:
* mozilla-beta: 5

Platform breakdown:
* linux64: 5

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1414318&startday=2017-11-06&endday=2017-11-12&tree=all
Duplicate of this bug: 1416596
Any news here, Eric? :)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #26)
> Any news here, Eric? :)

I'm not sure what to do at this point, I'd be okay with just disabling it in debug again...I'll see if I can repro locally and figure out what's going on.
You need to log in before you can comment on or make changes to this bug.