Closed Bug 1262560 Opened 9 years ago Closed 8 years ago

[e10s] Investigate turning on em:multiprocessCompatible for Hello

Categories

(Hello (Loop) :: Client, defect)

defect
Not set
normal

Tracking

(e10s+)

RESOLVED INCOMPLETE
Tracking Status
e10s + ---

People

(Reporter: standard8, Unassigned)

References

Details

(Whiteboard: [waiting for 47 decisions on add-on shipping])

In bug 1257154, we tried turning on the multiprocessCompatible flag in install.rdf for Hello. It got backed out due to causing perma-leaks and crashes.

> Backed out in https://hg.mozilla.org/mozilla-central/rev/d5f3da0cfe7c for
> very frequent 10.10 opt e10s mochitest-5 crashes like
> https://treeherder.mozilla.org/logviewer.html#?job_id=8211122&repo=fx-team
> This also caused leaks in other Linux bc tests and in Windows 7 debug
> M-e10s(dt1):
> https://treeherder.mozilla.org/logviewer.html#?job_id=8210013&repo=fx-team

This bug is to try and re-enable it without the leaks/crashes.
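For reference, a minimal sketch of how one might check whether an install.rdf declares the flag, using only the Python standard library. The sample manifest and its add-on id are hypothetical placeholders, not Hello's actual manifest; the helper name is also an assumption for illustration. Only the two RDF/em namespace URIs and the em:multiprocessCompatible property name are taken as given.

```python
import xml.etree.ElementTree as ET

# Namespaces used by legacy add-on install manifests.
EM_NS = "http://www.mozilla.org/2004/em-rdf#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical manifest for illustration only (the id is a placeholder).
SAMPLE_INSTALL_RDF = """
<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:em="http://www.mozilla.org/2004/em-rdf#">
  <Description about="urn:mozilla:install-manifest">
    <em:id>example@hypothetical.invalid</em:id>
    <em:bootstrap>true</em:bootstrap>
    <em:multiprocessCompatible>true</em:multiprocessCompatible>
  </Description>
</RDF>
"""


def is_multiprocess_compatible(install_rdf_text):
    """Return True if the manifest sets em:multiprocessCompatible to true."""
    root = ET.fromstring(install_rdf_text)

    # Element form: <em:multiprocessCompatible>true</em:multiprocessCompatible>
    for elem in root.iter("{%s}multiprocessCompatible" % EM_NS):
        return (elem.text or "").strip().lower() == "true"

    # Attribute form: <Description em:multiprocessCompatible="true" ...>
    for desc in root.iter("{%s}Description" % RDF_NS):
        value = desc.get("{%s}multiprocessCompatible" % EM_NS)
        if value is not None:
            return value.strip().lower() == "true"

    # Flag absent: treated here as "not declared compatible".
    return False


if __name__ == "__main__":
    print(is_multiprocess_compatible(SAMPLE_INSTALL_RDF))
```

A check like this only confirms what the manifest declares; it says nothing about whether the add-on actually behaves correctly under e10s, which is what the try pushes in this bug are probing.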
So far, I've mainly focussed on the leaks. I've tried what's mentioned below. Note: the bc4 & bc5 tests are the ones that are showing the leaks. Other tests may fail because of what I'm disabling.

1) Creating a new add-on with just a bare bootstrap and an install.rdf which specifies em:multiprocessCompatible. This didn't leak: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2251406d1666&selectedJob=19047397
2) Enabling em:multiprocessCompatible for Loop but removing everything in bootstrap.js. This *did* leak: https://treeherder.mozilla.org/#/jobs?repo=try&revision=1c25a2a08a57&selectedJob=19051613
3) Same as 2 but with the uitour loop tests disabled, as well as packaging of our modules that are loaded into Chrome: https://treeherder.mozilla.org/#/jobs?repo=try&revision=59777b852858&selectedJob=19095890
The leaks appear to be along the lines of:

08:52:52 INFO - TEST-INFO | leakcheck | tab process: leaked 1 AsyncTransactionTrackersHolder
08:52:52 INFO - TEST-INFO | leakcheck | tab process: leaked 1 CompositorBridgeChild
08:52:52 INFO - TEST-INFO | leakcheck | tab process: leaked 3 CondVar
08:52:52 INFO - TEST-INFO | leakcheck | tab process: leaked 2 IPC::Channel
08:52:52 INFO - TEST-INFO | leakcheck | tab process: leaked 1 MessagePump
08:52:52 INFO - TEST-INFO | leakcheck | tab process: leaked 3 Mutex
08:52:52 INFO - TEST-INFO | leakcheck | tab process: leaked 1 PCompositorBridgeChild
08:52:52 INFO - TEST-INFO | leakcheck | tab process: leaked 1 PImageBridgeChild
08:52:52 INFO - TEST-INFO | leakcheck | tab process: leaked 2 RefCountedMonitor
08:52:53 INFO - TEST-INFO | leakcheck | tab process: leaked 4 RefCountedTask
08:52:53 INFO - WARNING | leakcheck | tab process: leaked too many SharedMemory (expected 0, got 14)
08:52:53 INFO - TEST-INFO | leakcheck | tab process: leaked 2 StoreRef
08:52:53 INFO - TEST-INFO | leakcheck | tab process: leaked 1 WaitableEventKernel
08:52:53 INFO - TEST-INFO | leakcheck | tab process: leaked 2 WeakReference<MessageListener>
08:52:53 INFO - TEST-INFO | leakcheck | tab process: leaked 1 base::Thread
08:52:53 INFO - TEST-INFO | leakcheck | tab process: leaked 2 ipc::MessageChannel
08:52:53 INFO - TEST-INFO | leakcheck | tab process: leaked 6 nsTArray_base
08:52:53 INFO - TEST-INFO | leakcheck | tab process: leaked 1 nsThread

Bill, any ideas where we should start looking for these?
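To compare leak output like this across try pushes, a small tally script can help. The following is a minimal sketch, assuming the two line shapes seen in the log above ("leaked N Class" and "leaked too many Class (expected X, got Y)") and treating "got Y" as the leaked count. The function name and the shortened sample log are illustrative assumptions, not part of any Mozilla tooling.

```python
import re
from collections import Counter

# Matches "tab process: leaked 3 CondVar" as well as
# "tab process: leaked too many SharedMemory (expected 0, got 14)".
LEAK_RE = re.compile(
    r"leakcheck \| (?P<proc>[\w ]+?) process: leaked "
    r"(?:too many (?P<cls_many>\S+) \(expected \d+, got (?P<got>\d+)\)"
    r"|(?P<count>\d+) (?P<cls>\S+))"
)

# Shortened excerpt of the log quoted in this bug.
SAMPLE_LOG = """\
08:52:52 INFO - TEST-INFO | leakcheck | tab process: leaked 1 CompositorBridgeChild
08:52:52 INFO - TEST-INFO | leakcheck | tab process: leaked 3 CondVar
08:52:53 INFO - WARNING | leakcheck | tab process: leaked too many SharedMemory (expected 0, got 14)
08:52:53 INFO - TEST-INFO | leakcheck | tab process: leaked 2 WeakReference<MessageListener>
"""


def tally_leaks(log_text):
    """Return a Counter mapping (process, class name) to leaked count."""
    leaks = Counter()
    for match in LEAK_RE.finditer(log_text):
        proc = match.group("proc")
        if match.group("cls_many"):
            # "too many" form: assume the "got" value is the leaked count.
            leaks[(proc, match.group("cls_many"))] += int(match.group("got"))
        else:
            leaks[(proc, match.group("cls"))] += int(match.group("count"))
    return leaks


if __name__ == "__main__":
    # Print the classes with the most leaked instances first.
    for (proc, cls), count in tally_leaks(SAMPLE_LOG).most_common():
        print(f"{proc}: {cls} x{count}")
```

Diffing the tallies from a leaking push and a clean push makes it easier to see which classes are new, for example whether SharedMemory dominates the output.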
Flags: needinfo?(wmccloskey)
Blocks: loop-e10s
I've just done another stream of try builds disabling stuff to see if I can narrow it down a bit.
So in [1] I've reduced Loop down to an empty bootstrap.js and an install.rdf (with the compatibility flag enabled). This still apparently causes leaks. I've double-checked, and [1] was based on [2], which was a green fx-team build. However, an earlier attempt [3] at creating an additional add-on which did exactly the same does not appear to leak. I'm stumped... My best guess is some weird platform bug.

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=26da817887b1&selectedJob=19142120
[2] https://treeherder.mozilla.org/#/jobs?repo=fx-team&revision=1725b460c3e0de97590cb8764df9ace9115b813e
[3] https://treeherder.mozilla.org/#/jobs?repo=try&revision=2251406d1666
Component: Client → General
Product: Hello (Loop) → Toolkit
Flags: needinfo?(mconley)
Out of curiosity, what happens if you remove the multiprocess compatible flags, but disable the shims by setting extensions.interposition.enabled to false?
Flags: needinfo?(mconley) → needinfo?(standard8)
It might also be worth getting a loaner machine, and using rr to try to get a recording of this failure, and try to find the root of what's not being deallocated here. I'm not too savvy with rr, but I expect a bunch of other Mozillians are.
(In reply to Mike Conley (:mconley) - Needinfo me! from comment #5)
> Out of curiosity, what happens if you remove the multiprocess compatible
> flags, but disable the shims by setting
> extensions.interposition.enabled to false?

Trying that here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=86945c4b717a
Flags: needinfo?(standard8)
Disabling the pref didn't work - there are too many other items also using the shims. Andrew/Nicolas, would one of you two be able to help us here please?
Flags: needinfo?(n.nethercote)
Flags: needinfo?(continuation)
I've done a fresh push for debug builds with just the flag set. This should clarify which builds/tests are leaking: https://treeherder.mozilla.org/#/jobs?repo=try&revision=8f903b6cdbbf
I'm not much of an expert when it comes to Gecko leaks. Sorry.
Flags: needinfo?(n.nethercote)
Have you tried again more recently? The SharedMemory leak you are seeing looks like a known intermittent, bug 1246529, and that was fixed by something since you filed this bug. If not, you could try asking nical.
Flags: needinfo?(wmccloskey)
Flags: needinfo?(continuation)
Your try push in comment 9 is from 4/15, and that SharedMemory leak was only fixed very recently on Windows in bug 1262898, which landed on 4/22.
Depends on: 1262898
Thanks Andrew, trying a new push here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5eb1f3107692 Nical, are the recent fixes likely to be uplifted to beta 47 or aurora 48 (if they aren't already there)?
Flags: needinfo?(nical.bugzilla)
(In reply to Mark Banner (:standard8) from comment #13)
> Thanks Andrew, trying a new push here:
>
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=5eb1f3107692
>
> Nical, are the recent fixes likely to be uplifted to beta 47 or aurora 48
> (if they aren't already there)?

Both of the important shutdown fixes are in 48, but I would not feel good about uplifting them to beta 47 considering the amount of code and the trickiness involved (especially since they made it to nightly towards the end of the 48 cycle), unless there is a very strong case for it.
Flags: needinfo?(nical.bugzilla)
Thanks for the response, Nical. I think we should enable this from 48. Moving back to the Hello product, since that's all that is left now. I'll discuss with the team and see how we want to roll this out.
Assignee: nobody → standard8
Component: General → Client
Product: Toolkit → Hello (Loop)
As we can't set the flag for the older versions, we're waiting until we've got the 47 versions near ready to be released before we enable this flag. We haven't quite worked out what is being shipped with 47 yet for Hello, but once we do, then we'll reconsider when we can enable this. We do want to do it in time for the 48 cycle though.
Whiteboard: [waiting for 47 decisions on add-on shipping]
Assignee: standard8 → nobody
Support for Hello/Loop has been discontinued. https://support.mozilla.org/kb/hello-status Hence closing the old bugs. Thank you for your support.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE