1124728 - All talos on OSX fails with running in e10s mode

Ryan VanderMeulen [:RyanVM]

Reporter

Description

•

10 years ago

Per Joel's request on IRC, I'm hiding them for now.

Ryan VanderMeulen [:RyanVM]

Reporter

Comment 4

•

10 years ago

Also, talos-other on e10s is still being seen as a non-e10s job, so I had to hide T(o) on 10.6 as well :(

Ed Morley [:emorley]

Comment 5

•

10 years ago

(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #4)
> Also, talos-other on e10s is still being seen as a non-e10s job, so I had to
> hide T(o) on 10.6 as well :(

It wasn't listed in the set of jobs in bug 1121003.

Joel Maher ( :jmaher ) (UTC -8)

Comment 7

•

10 years ago

:billm, do you know why talos runs on osx 10.6 would be failing when we try to run in e10s mode? We can run talos with the same code in e10s mode for linux/windows, and mochitests for e10s run on 10.6.

Flags: needinfo?(wmccloskey)

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 8

•

10 years ago

I'm not sure I understand what's going on here. Is this about running Talos with multiple processes, or is it about using the message manager more in Talos code while still running in a single process? Either way I'm afraid I can't think of any reason why we'd time out. It looks like Firefox isn't even starting up?

Flags: needinfo?(wmccloskey)

Joel Maher ( :jmaher ) (UTC -8)

Comment 9

•

10 years ago

This is about running with browser.tabs.remote = true, as we do for linux and windows. Right now this isn't working at all for talos on osx. I suspect we are starting up, but then either not loading the webpage or having issues printing info and quitting the browser.

Brad Lassey [:blassey] (use needinfo?)

Updated

•

10 years ago

Blocks: e10s-harness

tracking-e10s: --- → +

Jim Mathies [:jimm]

Comment 10

•

10 years ago

Requesting re-triage: releng would like someone to look at why osx is having problems getting tests launched and running with e10s.

tracking-e10s: + → ?

Jim Mathies [:jimm]

Updated

•

10 years ago

URL: https://treeherder.mozilla.org/#/jobs...

Brad Lassey [:blassey] (use needinfo?)

Updated

•

10 years ago

Assignee: nobody → davidp99

David Parks [:handyman]

Comment 11

•

10 years ago

Joel, in looking over the build log, gcc is missing from the build machine, causing the build of psutil to fail. jimm suggested that you might have an idea what happened there. Do you know why gcc isn't being found on the OS X 10.6 machines and if it's needed? Do we need to change the scripts to use llvm or something?

Flags: needinfo?(jmaher)

Joel Maher ( :jmaher ) (UTC -8)

Comment 12

•

10 years ago

These are run on test slaves (not build machines), so we don't have compilers installed. In fact, this psutil error is present on all flavors (look at passing linux machines). We install psutil as a python module which happens to include some native library components. It appears this has been known for a while in bug 917346 and bug 893254. Are you looking into why these don't run at all? I know :avih tried running Talos locally on osx 10.10 and it failed to run when using the --e10s switch (hung at the warmup window, just like in automation). Here are some instructions for running it locally: https://wiki.mozilla.org/Buildbot/Talos/Running#Running_locally_-_Source_Code
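
To illustrate why a failed psutil build need not be fatal, here is a minimal sketch of how a Python harness could treat a module with native components as optional. This is not taken from the Talos source: `load_optional_module` and `process_count` are hypothetical names, and the only assumption about psutil is that `psutil.pids()` is available in the installed version.

```python
import importlib


def load_optional_module(name):
    """Return the imported module, or None if it cannot be imported.

    Hypothetical helper: a module whose native parts failed to build
    (for example, because no compiler is installed) raises ImportError here.
    """
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# psutil is treated as optional, so the harness keeps running without it.
psutil = load_optional_module("psutil")


def process_count():
    """Count running processes via psutil, or return None when it is missing."""
    if psutil is None:
        return None
    return len(psutil.pids())
```

With this pattern, a machine without gcc logs the psutil build failure but still runs the tests, which matches the error showing up on passing linux machines too.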

Flags: needinfo?(jmaher)

David Parks [:handyman]

Comment 13

•

10 years ago

NIing myself to update this

tracking-e10s: ? → +

Flags: needinfo?(davidp99)

David Parks [:handyman]

Comment 14

•

10 years ago

This looks to be related to the teardown of nsAppService and friends -- the destruction process is very involved and has some mac-specific (#ifdef) behavior. An early look seems to suggest that the teardown process is confusing the nsAppService instance in the main process and the one in the content process -- the Quit procedure operates as a state machine transition (it is re-entrant and called multiple times) but the evolution of the states must be done for both main and content. The process involving Quit is conflating the states of the two instances, leading to a place where they don't make the expected transitions and are therefore never actually destroyed.
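
As a rough illustration of the failure mode described above, here is a toy model in Python. It is not Gecko code: `QuitState`, `StateCell`, `ToyAppService` and `run_quit` are all hypothetical names, and the states are invented for the sketch. Each instance advances one state per `quit()` call, but only if the stored state is the one it expects. When two instances share one state cell, the first one's transitions strand the second.

```python
from enum import Enum


class QuitState(Enum):
    # Invented states for the toy model, not the real nsAppService states.
    RUNNING = 0
    QUIT_REQUESTED = 1
    WINDOWS_CLOSED = 2
    DESTROYED = 3


class StateCell:
    """Holds the quit state an instance reads and writes."""

    def __init__(self):
        self.value = QuitState.RUNNING


class ToyAppService:
    def __init__(self, name, cell):
        self.name = name
        self.cell = cell                    # where this instance stores its state
        self.expected = QuitState.RUNNING   # state this instance believes it is in

    def quit(self):
        """Re-entrant: each call attempts exactly one transition."""
        if self.expected is QuitState.DESTROYED:
            return False
        if self.cell.value is not self.expected:
            return False  # someone else changed the state; we are stuck
        next_state = QuitState(self.expected.value + 1)
        self.cell.value = next_state
        self.expected = next_state
        return True

    @property
    def destroyed(self):
        return self.expected is QuitState.DESTROYED


def run_quit(services, rounds=10):
    """Call quit() on every service repeatedly and report who was destroyed."""
    for _ in range(rounds):
        for service in services:
            service.quit()
    return {service.name: service.destroyed for service in services}


# Each instance owns its state: both reach DESTROYED.
separate = run_quit([
    ToyAppService("main", StateCell()),
    ToyAppService("content", StateCell()),
])

# Both instances share one state cell: "content" never sees the state it expects.
shared_cell = StateCell()
conflated = run_quit([
    ToyAppService("main", shared_cell),
    ToyAppService("content", shared_cell),
])
```

In this toy only one of the two instances is stranded; the real teardown is more involved, but the sketch shows how conflated state can leave an instance that is never destroyed no matter how many times Quit is called.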

Flags: needinfo?(davidp99)

Joel Maher ( :jmaher ) (UTC -8)

Comment 15

•

10 years ago

oh great, thanks for looking into this. Do feel free to ask any questions about talos!

Joel Maher ( :jmaher ) (UTC -8)

Updated

•

10 years ago

Blocks: 1130445

Avi Halachmi (:avih)

Comment 16

•

10 years ago

David, any idea how we can fix this? Either by modifying firefox, or talos, or both, or using prefs to overcome this, etc? What would be a viable approach here, and who has the knowledge to handle it? The fact is that we have talos mostly running on e10s firefox everywhere except OS X, and the lack of performance data on this platform might be hiding performance issues which no one is able to detect or track due to this.

Flags: needinfo?(davidp99)

Jim Mathies [:jimm]

Comment 17

•

10 years ago

We're going to try to get to this between m5 and m6.

David Parks [:handyman]

Comment 18

•

10 years ago

Nominating for e10s triage as per jimm's comment. We'll address the milestone this week.

tracking-e10s: + → ?

Flags: needinfo?(davidp99)

Attachments

talos + e10s + osx 10.6 + activity monitor sample; 10 years ago; Joel Maher ( :jmaher ) (UTC -8); 61.01 KB, text/plain
talos + e10s + osx 10.6 + activity monitor + a few minutes idle time; 10 years ago; Joel Maher ( :jmaher ) (UTC -8); 21.92 KB, text/plain
[WIP] Add talos-powers default add-on to allow talos tests to request force quit. r=?; 10 years ago; Mike Conley (:mconley) (:⚙️); 5.73 KB, patch; jmaher: feedback+
[WIP] Add talos-powers default add-on to allow talos tests to request force quit. r=?; 10 years ago; Mike Conley (:mconley) (:⚙️); 7.48 KB, patch
[WIP] Add talos-powers default add-on to allow talos tests to request force quit. r=?; 10 years ago; Mike Conley (:mconley) (:⚙️); 6.63 KB, patch; jmaher: feedback+
Add talos-powers default add-on to allow talos tests to request force quit. r=?; 10 years ago; Mike Conley (:mconley) (:⚙️); 8.54 KB, patch
Add talos-powers default add-on to allow talos tests to request force quit. r=?; 10 years ago; Mike Conley (:mconley) (:⚙️); 8.54 KB, patch; jmaher: review+
Follow-up: Do not allow getInfo.html to quit parent application when loaded in thumbnailer process. r=?; 10 years ago; Mike Conley (:mconley) (:⚙️); 3.61 KB, patch; jmaher: review+
add talos-powers to mainthreadio and etlparser manifests (1.0); 10 years ago; Joel Maher ( :jmaher ) (UTC -8); 4.35 KB, patch; wlach: review+
Bugnotes; 9 years ago; Mike Conley (:mconley) (:⚙️); 12.95 KB, application/zip