Permanent Gecko 58 "valgrind-test | 37 bytes in 1 blocks are definitely lost at malloc / moz_xmalloc / ToNewUTF8String / mozilla::nsTerminator::StartWriter" since the uplift to Beta

VERIFIED FIXED in Firefox 58

Status

()

defect
P1
critical
VERIFIED FIXED
2 years ago
2 years ago

People

(Reporter: intermittent-bug-filer, Assigned: erahm)

Tracking

({intermittent-failure, memory-leak})

unspecified
mozilla59
Points:
---

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox56 unaffected, firefox57 unaffected, firefox58+ verified, firefox59 verified)

Details

Attachments

(1 attachment, 3 obsolete attachments)

Reporter

Description

2 years ago
treeherder
Filed by: rvandermeulen [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=141966445&repo=mozilla-beta

https://queue.taskcluster.net/v1/task/cjw2S5FmTjaikq0KgJ2GNA/runs/0/artifacts/public/logs/live_backing.log

Annoyingly, these don't happen on Try for reasons I won't claim to understand.
Though maybe this is fallout from bug 870698 given the ToNewUTF8String leak? Also CCing some other people who touched nsTerminator.cpp during the Fx58 nightly cycle in case they see something obvious. Please prioritize this, however, since it's a permanent failure on Beta at the moment.

And again, I'm really sorry that this doesn't reproduce on Try. I really don't know why.
Severity: normal → critical
Flags: needinfo?(cpeterson)
Keywords: mlk
Priority: P5 → P1
The string there is intentionally leaked. Maybe the Valgrind whitelist needs to be fixed?
It looks like the Valgrind suppression file is here (and other similar files):
  build/valgrind/cross-architecture.sup
However, I don't see anything related to this intentional leak.
Ok, so the leak is actually this string, I think:
  ToNewUTF8String(path));
Which ends up being owned by a thread that is intentionally leaked?
There are two "#if !defined(DEBUG)" in that terminator file that should really be "#if !defined(NS_FREE_PERMANENT_DATA)" but that doesn't explain why this is only happening on beta.
Eric, do you think some string changes in 58 could have caused this? I can't figure out why we'd start leaking this.
Flags: needinfo?(erahm)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #1)
> Though maybe this is fallout from bug 870698 given the ToNewUTF8String leak?
> Also CCing some other people who touched nsTerminator.cpp during the Fx58
> nightly cycle in case they see something obvious. Please prioritize this,
> however, since it's a permanent failure on Beta at the moment.

Bug 870698 should not have caused any memory leaks. It just changed some functions to avoid calling strlen() when copying strings.

From Andrew's comments, it sounds like the leaked string is owned by a purposely-leaked RunWriter thread with the following comment:

  MOZ_LSAN_INTENTIONALLY_LEAK_OBJECT(arg);
  // Shutdown will generally complete before we have a chance to
  // deallocate. This is not a leak.

https://searchfox.org/mozilla-central/source/toolkit/components/terminator/nsTerminator.cpp#225-227
Flags: needinfo?(cpeterson)
Assignee

Comment 8

2 years ago
(In reply to Andrew McCreight [:mccr8] from comment #6)
> Eric, do you think some string changes in 58 could have caused this? I can't
> figure out why we'd start leaking this.

My only guess would be a signature change for the string classes, but as you noted I don't see any ToNewUTF8String suppression. njn has been goofing with atoms but I don't think that relates to this code. My only guess is that the terminator stuff somehow doesn't get hit in Nightly during testing (although locally I can confirm it does).

Regardless we could do your fix (NS_FREE_PERMANENT_DATA) or just free the arg in RunWriter after copying it to the nsCString [1].

[1] http://searchfox.org/mozilla-central/rev/af86a58b157fbed26b0e86fcd81f1b421e80e60a/toolkit/components/terminator/nsTerminator.cpp#231
Flags: needinfo?(erahm)
Assignee

Updated

2 years ago
Assignee: nobody → erahm
Assignee

Comment 9

2 years ago
I think this is good enough.
Attachment #8925151 - Flags: feedback?(continuation)
Assignee

Comment 10

2 years ago
Comment on attachment 8925151 [details] [diff] [review]
Turn of |StartWriter| in leak checking builds

Ryan can you test this on your beta branch?
Attachment #8925151 - Flags: feedback?(ryanvm)
Assignee

Comment 11

2 years ago
And a try build testing V, asan, and a random mochitest:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=64a891d0e89029166d5bc0f29f48302cb7db85b1
Comment on attachment 8925151 [details] [diff] [review]
Turn of |StartWriter| in leak checking builds

Review of attachment 8925151 [details] [diff] [review]:
-----------------------------------------------------------------

nit: "Turn of" should be "Turn off". Otherwise, looks okay.
Attachment #8925151 - Flags: feedback?(continuation) → feedback+
Assignee

Comment 13

2 years ago
Updated to actually build, can you test this Ryan?
Attachment #8925157 - Flags: feedback?(ryanvm)
Assignee

Updated

2 years ago
Attachment #8925151 - Attachment is obsolete: true
Attachment #8925151 - Flags: feedback?(ryanvm)
Comment on attachment 8925157 [details] [diff] [review]
Turn off |StartWriter| in leak checking builds

I've never been able to reproduce these on Try, so I don't think we can do anything other than land it and see what happens.
Attachment #8925157 - Flags: feedback?(ryanvm)
Assignee

Comment 15

2 years ago
Comment on attachment 8925157 [details] [diff] [review]
Turn off |StartWriter| in leak checking builds

Andrew can you give this an official r+? I'll request beta approval once it's on central.
Attachment #8925157 - Flags: review?(continuation)
Attachment #8925157 - Flags: review?(continuation) → review+
We're actually in that fun part of the cycle where both m-c and m-b are tracking Fx58 (with regular syncing betweeen the two). So there's nothing you more you need to do here :)

Comment 17

2 years ago
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/83f0e033450d
Turn off |StartWriter| in leak checking builds. r=mccr8
https://hg.mozilla.org/mozilla-central/rev/83f0e033450d
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Backed out for failing xpcshell's toolkit/components/terminator/tests/xpcshell/test_terminator_record.js on Linux x64 asan:

https://hg.mozilla.org/mozilla-central/rev/e437203962cc7d97b1646d09ed3061272a1df443

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=83f0e033450d1dce9a9ce96b37b465ae50dacdb6&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=usercancel&filter-resultStatus=runnable&filter-resultStatus=retry
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=142156398&repo=mozilla-inbound

[task 2017-11-04T14:53:20.622Z] 14:53:20     INFO -  TEST-START | toolkit/components/terminator/tests/xpcshell/test_terminator_record.js
[task 2017-11-04T14:54:25.560Z] 14:54:25  WARNING -  TEST-UNEXPECTED-FAIL | toolkit/components/terminator/tests/xpcshell/test_terminator_record.js | xpcshell return code: 1
[task 2017-11-04T14:54:25.565Z] 14:54:25     INFO -  TEST-INFO took 64938ms
[task 2017-11-04T14:54:25.567Z] 14:54:25     INFO -  >>>>>>>
[task 2017-11-04T14:54:25.569Z] 14:54:25     INFO -  (xpcshell/head.js) | test MAIN run_test pending (1)
[task 2017-11-04T14:54:25.571Z] 14:54:25     INFO -  (xpcshell/head.js) | test run_next_test 0 pending (2)
[task 2017-11-04T14:54:25.573Z] 14:54:25     INFO -  (xpcshell/head.js) | test MAIN run_test finished (2)
[task 2017-11-04T14:54:25.575Z] 14:54:25     INFO -  running event loop
[task 2017-11-04T14:54:25.577Z] 14:54:25     INFO -  "CONSOLE_MESSAGE: (info) No chrome package registered for chrome://branding/locale/brand.properties"
[task 2017-11-04T14:54:25.578Z] 14:54:25     INFO -  toolkit/components/terminator/tests/xpcshell/test_terminator_record.js | Starting init
[task 2017-11-04T14:54:25.580Z] 14:54:25     INFO -  (xpcshell/head.js) | test init pending (2)
[task 2017-11-04T14:54:25.581Z] 14:54:25     INFO -  "Initializing the Terminator"
[task 2017-11-04T14:54:25.583Z] 14:54:25     INFO -  (xpcshell/head.js) | test run_next_test 0 finished (2)
[task 2017-11-04T14:54:25.584Z] 14:54:25     INFO -  (xpcshell/head.js) | test run_next_test 1 pending (2)
[task 2017-11-04T14:54:25.585Z] 14:54:25     INFO -  (xpcshell/head.js) | test init finished (2)
[task 2017-11-04T14:54:25.586Z] 14:54:25     INFO -  toolkit/components/terminator/tests/xpcshell/test_terminator_record.js | Starting test_record
[task 2017-11-04T14:54:25.587Z] 14:54:25     INFO -  (xpcshell/head.js) | test test_record pending (2)
[task 2017-11-04T14:54:25.589Z] 14:54:25     INFO -  "Starting shutdown"
[task 2017-11-04T14:54:25.590Z] 14:54:25     INFO -  "Moving to next phase"
[task 2017-11-04T14:54:25.591Z] 14:54:25     INFO -  "Waiting for file creation: /tmp/xpc-profile-cnLMgl/ShutdownDuration.json"
[task 2017-11-04T14:54:25.593Z] 14:54:25     INFO -  (xpcshell/head.js) | test run_next_test 1 finished (2)
[task 2017-11-04T14:54:25.594Z] 14:54:25     INFO -  "CONSOLE_MESSAGE: (info) No chrome package registered for chrome://branding/locale/brand.properties"
[task 2017-11-04T14:54:25.595Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.597Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.598Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.599Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.601Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.602Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.604Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.605Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.606Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.607Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.609Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.610Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.611Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.612Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.612Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.613Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.614Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.615Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.615Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.615Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.616Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.616Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.617Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.617Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.618Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.618Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.619Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.620Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.622Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.622Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.623Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.623Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.624Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.624Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.625Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.625Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.626Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.627Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.628Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.628Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.630Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.631Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.631Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.632Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.633Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.633Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.634Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.635Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.635Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.635Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.636Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.637Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.638Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.638Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.638Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.639Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.639Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.640Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.641Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.642Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.642Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.642Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.644Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.644Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.645Z] 14:54:25     INFO -  PID 12804 | ASAN:DEADLYSIGNAL
[task 2017-11-04T14:54:25.645Z] 14:54:25     INFO -  PID 12804 | =================================================================
[task 2017-11-04T14:54:25.645Z] 14:54:25    ERROR -  PID 12804 | ==12804==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f71e2b914de bp 0x7f717ffee730 sp 0x7f717ffee720 T17)
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  PID 12804 | ==12804==The signal is caused by a WRITE memory access.
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  PID 12804 | ==12804==Hint: address points to the zero page.
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  PID 12804 |     #0 0x7f71e2b914dd in mozilla::(anonymous namespace)::RunWatchdog(void*) /builds/worker/workspace/build/src/toolkit/components/terminator/nsTerminator.cpp:163:5
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  PID 12804 |     #1 0x7f71e89184d3 in _pt_root /builds/worker/workspace/build/src/nsprpub/pr/src/pthreads/ptthread.c:216:5
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  PID 12804 |     #2 0x7f71e8b486b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9)
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  "The file does not exist yet. Waiting 1 second."
[task 2017-11-04T14:54:25.646Z] 14:54:25     INFO -  PID 12804 |     #3 0x7f71d2d963dc in clone /build/glibc-bfm8X4/glibc-2.23/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Status: RESOLVED → REOPENED
Flags: needinfo?(erahm)
Resolution: FIXED → ---
Target Milestone: mozilla58 → ---
Comment hidden (Intermittent Failures Robot)
Assignee

Comment 21

2 years ago
Updated to disable the nsTerminator test on asan instead of debug
Attachment #8925693 - Flags: review?(continuation)
Attachment #8925693 - Flags: review?(continuation) → review+
Assignee

Updated

2 years ago
Attachment #8925157 - Attachment is obsolete: true
Flags: needinfo?(erahm)
Assignee

Updated

2 years ago
Keywords: checkin-needed

Comment 22

2 years ago
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b1ce6a65a214
Turn off |StartWriter| in leak checking builds. r=mccr8
Keywords: checkin-needed
Comment hidden (Intermittent Failures Robot)
Assignee

Updated

2 years ago
Duplicate of this bug: 1416596
Any news here, Eric? :)
Assignee

Comment 27

2 years ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #26)
> Any news here, Eric? :)

I'm not sure what to do at this point, I'd be okay with just disabling it in debug again...I'll see if I can repro locally and figure out what's going on.
Comment hidden (Intermittent Failures Robot)
Assignee

Comment 29

2 years ago
Attachment #8930613 - Flags: review?(continuation)
Assignee

Updated

2 years ago
Attachment #8925693 - Attachment is obsolete: true
Attachment #8930613 - Flags: review?(continuation) → review+
Assignee

Comment 30

2 years ago
Alright so the deal is NS_FREE_PERMANENT_DATA is defined in debug builds if NS_BUILD_REFCNT_LOGGING is defined [1], and NS_BUILD_REFCNT_LOGGING is defined if DEBUG is defined [2].

Technically I think things would work okay if we changed the ifdef to:

> #ifdef defined(MOZ_VALGRIND) || defined(MOZ_ASAN)

but we've been chasing this down a while so the easiest option is to just disable the test in debug and not worry about future issues coming up that would trigger issues with refcount logging.

[1] https://searchfox.org/mozilla-central/rev/919dce54f43356c22d6ff6b81c07ef412b1bf933/xpcom/base/nscore.h#169-171
[2] https://searchfox.org/mozilla-central/rev/919dce54f43356c22d6ff6b81c07ef412b1bf933/xpcom/base/nscore.h#152-157
Assignee

Updated

2 years ago
Flags: needinfo?(erahm)

Comment 31

2 years ago
Pushed by erahm@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/90a3dbed251b
Turn off |StartWriter| in leak checking builds. r=mccr8

Comment 32

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/90a3dbed251b
Status: REOPENED → RESOLVED
Closed: 2 years ago2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
And verified green on Beta :)
Status: RESOLVED → VERIFIED
Assignee

Comment 35

2 years ago
(In reply to (PTO - Back 27-Nov) Ryan VanderMeulen [:RyanVM] from comment #34)
> And verified green on Beta :)

\o/
Comment hidden (Intermittent Failures Robot)
You need to log in before you can comment on or make changes to this bug.