Closed Bug 1407593 Opened 8 years ago Closed 6 years ago

Embedding leaks make it harder to assert shutdown correctness

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla70

Tracking Flags:

Tracking

Status

firefox70

---

fixed

People

(Reporter: jonco, Assigned: jonco)

References

Details

Attachments

(1 file, 1 obsolete file)

bug1407593-check-js-leaks 8 years ago Jon Coppeard (:jonco) 30.50 KB, patch	mccr8 : feedback+	Details \| Diff \| Splinter Review
Bug 1407593 - Report embedding leaks of JS GC things r=mccr8,sfink? 6 years ago Jon Coppeard (:jonco) 47 bytes, text/x-phabricator-request		Details \| Review

Jon Coppeard (:jonco)

Assignee

Description

•

8 years ago

Sometimes the embedding still has live gray roots at shutdown because of a leak somewhere in the system. If this happens that we don't free everything and we get assertion failures for assertions that that everything is cleaned up at shutdown. This leads to frequent bug reports (e.g. bug 1407505, bug 1402473). The reason that we don't just clear all gray roots at shutdown like we do for internal roots is that this would cause some leaks to not get reported. However these assertion failures also hide the real issue. Ideally I think we should report leaks that involve JS sooner, and clear the roots on shutdown.

Paul Bone [:pbone]

Updated

•

8 years ago

Priority: -- → P3

Jon Coppeard (:jonco)

Assignee

Comment 1

•

8 years ago

Andrew, I think I've mentioned this to you before. Do you think this is feasible and if so what's the best way to proceed? I don't really know how our leak checking works at the moment.

Flags: needinfo?(continuation)

Jon Coppeard (:jonco)

Assignee

Updated

•

8 years ago

Blocks: 1369712

Andrew McCreight (out of office until 8/21) [:mccr8]

Comment 2

•

8 years ago

Here's how XPCOM leak checking works: - Classes opt into it by calling MOZ_COUNT_CTOR in their ctor and MOZ_COUNT_DTOR in their dtor, with a class name. (There's a slightly different mechanism for refcounted objects, but it doesn't matter for the present discussion.) nsTraceRefcnt.cpp keeps a table of the number of extant instances of each class. - Very late in shutdown, nsTraceRefcnt::Shutdown() dumps out the table of various classes it saw, how many it saw, and how many remain. There is one file per process, with the pid in the file name. - testing/mochitest/leaks.py is a script run by the test harness that checks to make sure there's a leak log for every process that we created, and it parses the table and prints out an error for every class where we had more than one instance remaining at shutdown. In theory, I think your proposal is reasonable. You'd have to make sure that the test harness registers it as an error if there's a leak, and you'd want it to be non-fatal, so we can do the rest of the checking we do at shut down. There's some loss from not giving information about what exactly leaked, but I think that in practice if you are leaking JS you're going to get a completely useless list of leaking objects anyways. In practice, you may need to debug a wide variety of "leaks" in your shutdown detector that we don't see with the XPCOM leak detector. The problem I am imagining is that some random place roots JS objects, but clears them out at some point in shutdown after when you want to add your check. But maybe it will be okay. Some of these leaks may be actual issues, so the work spent on debugging them won't be wasted, but some will likely be silly issues that you just have to fix to make your leak detector work.

Flags: needinfo?(continuation)

Joel Maher ( :jmaher ) (UTC -8)

Updated

•

8 years ago

•

8 years ago

(In reply to Andrei Ciure[:andrei_ciure_] from comment #6) I'll work on this next.

Assignee: nobody → jcoppeard

Flags: needinfo?(jcoppeard)

Jon Coppeard (:jonco)

Assignee

Comment 8

•

8 years ago

Attached patch bug1407593-check-js-leaks (obsolete) — Details — Splinter Review

Here's what I have so far: XPCOM logs any roots left at shutdown in a similar way to the existing leak tracing and then clears them. I extended the existing leak checking scripts to process these logs too, again in a very similar way. The JS engine now asserts that there are no roots left at shutdown. To test this I wrote a patch to force a leak of a JSString (via a Value in a ModuleScript), which produced the following output: TEST-INFO | Main app process: exit 0 runtests.py | Application ran for: 0:00:43.195848 zombiecheck | Reading PID log: /tmp/tmphtVpxxpidlog ==> process 24368 launched child process 24396 zombiecheck | Checking for orphan process with PID: 24396 Stopping web server Stopping web socket server Stopping ssltunnel TEST-INFO | leakcheck | default process: leak threshold set at 0 bytes TEST-INFO | leakcheck | plugin process: leak threshold set at 0 bytes TEST-INFO | leakcheck | tab process: leak threshold set at 0 bytes TEST-INFO | leakcheck | geckomediaplugin process: leak threshold set at 20000 bytes TEST-INFO | leakcheck | gpu process: leak threshold set at 0 bytes == BloatView: ALL (cumulative) LEAK AND BLOAT STATISTICS, default process 24368 |<----------------Class--------------->|<-----Bytes------>|<----Objects---->| | | Per-Inst Leaked| Total Rem| 0 |TOTAL | 36 56| 1014320 1| 398 |ModuleScript | 56 56| 1 1| nsTraceRefcnt::DumpStatistics: 1394 entries TEST-INFO | leakcheck | default process: leaked 1 ModuleScript 19 ERROR TEST-UNEXPECTED-FAIL | leakcheck | default process: 56 bytes leaked (ModuleScript) 20 ERROR TEST-UNEXPECTED-FAIL | leakcheck | default process: found leak of 1 JS GC things String: 1 runtests.py | Running tests: end. Buffered messages finished SUITE-END | took 44s Try is looking good: https://treeherder.mozilla.org/#/jobs?repo=try&revision=0c6d68bef5640d2ffe5be0ab8cd833ce4023529d&group_state=expanded How does this look to you?

Attachment #8930152 - Flags: feedback?(continuation)

Andrew McCreight (out of office until 8/21) [:mccr8]

Comment 9

•

8 years ago

Comment on attachment 8930152 [details] [diff] [review] bug1407593-check-js-leaks Review of attachment 8930152 [details] [diff] [review]: ----------------------------------------------------------------- This seems reasonable. Is there some reason to create a separate log rather than reporting the leaked objects to the leak checker through the normal means? Your loop could call MOZ_LOG_CTOR repeatedly instead. Adding an extra file seems like a lot of complexity. ::: dom/workers/RuntimeService.cpp @@ +1012,5 @@ > // The CC is shut down, and the superclass destructor will GC, so make sure > // we don't try to CC again. > mWorkerPrivate = nullptr; > + > + CycleCollectedJSRuntime::Shutdown(cx); Bug 1422316 is also filed on this missing shutdown. I don't know why it is like it is. ::: xpcom/base/nsTraceRefcnt.cpp @@ +1373,5 @@ > + aRuntime->TraceNativeGrayRoots(&tracer); > + > + AutoTraceLogLock lock; > + > + fprintf(gJSLeakLog, "Runtime %p leaked %" PRIu64 " JS GC things\n", It would be nice if this included a descriptive string, too. EG worker or main thread. I don't think we ever log the address of a runtime, so this will only be useful for somebody with a debugger.

Attachment #8930152 - Flags: feedback?(continuation) → feedback+

Jon Coppeard (:jonco)

Assignee

Comment 10

•

6 years ago

Picking this up again as this continues to be a problem.

Using MOZ_LOG_CTOR makes things much easier, thanks.

What we want to happen is that GC things that are live at shutdown are reported but are finalized anyway. It might be nice to also clear embedding pointers to GC things at shutdown because otherwise these are left dangling pointing to freed memory.

My current issue is that finalizing preserved wrappers causes an assertion in the wrapper cache destructor.

Jon Coppeard (:jonco)

Assignee

Updated

•

6 years ago

Depends on: 1570666

Jon Coppeard (:jonco)

Assignee

Comment 11

•

6 years ago

Attached file Bug 1407593 - Report embedding leaks of JS GC things r=mccr8,sfink? — Details

Patch to report JS GC things that are live at shutdown as leaks by calling MOZ_LOG_CTOR for them. The JS engine now asserts there are no roots, and the CC clears any remaining JS holders (which will already have been reported). I used Unlink() for this because otherwise the wrapper cache asserts if we try and clear the contents of a preserved wrapper from a leaked GC thing's finalizer.

The only fly in the ointment is the test_ext_native_messaging.js xpcshell test, because this patch causes it to fail with an assertion here:

https://searchfox.org/mozilla-central/source/dom/events/EventListenerManager.cpp#138

I'm assuming this is connected to the fact that it leaks (bug 1570666).

Jan de Mooij [:jandem]

Updated

•

6 years ago

Blocks: 1569655

Nicolas B. Pierron [:nbp]

Updated

•

6 years ago

Status: NEW → ASSIGNED

Priority: P3 → P1

Phabricator Automation

•

6 years ago

Attachment #8930152 - Attachment is obsolete: true

Jan de Mooij [:jandem]

Comment 14

•

6 years ago

\o/

(In reply to Jon Coppeard (:jonco) from comment #13)

Lando seems to have messed up the commit messaged. Filed bug 1573398.

I think the trailing ? in r=foo? confuses it. I've noticed it also doesn't remove it when there's a single reviewer.

Jon Coppeard (:jonco)

Assignee

Updated

•

6 years ago

Blocks: 1573503

Bogdan Tara[:bogdan_tara | bogdant]

Comment 15

•

6 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/f2d3df5525bd

Status: ASSIGNED → RESOLVED

Closed: 6 years ago

status-firefox70: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla70

Natalia Csoregi [:nataliaCs]

Updated

•

6 years ago

Regressions: 1574294

Ryan VanderMeulen [:RyanVM]

Updated

•

6 years ago

Regressions: 1573419

You need to log in before you can comment on or make changes to this bug.