Closed
Bug 1124728
Opened 10 years ago
Closed 10 years ago
All talos on OSX fails with running in e10s mode
Categories
(Testing :: Talos, defect)
Tracking
(e10sm7+, firefox38 wontfix, firefox38.0.5 wontfix, firefox39 wontfix, firefox40 wontfix, firefox41 fixed, firefox-esr31 unaffected, firefox-esr38 wontfix)
RESOLVED
FIXED
mozilla41
People
(Reporter: RyanVM, Assigned: mconley)
References
(Depends on 1 open bug, Blocks 1 open bug, )
Details
(Keywords: intermittent-failure)
Attachments
(5 files, 5 obsolete files)
21.92 KB,
text/plain
|
Details | |
8.54 KB,
patch
|
jmaher
:
review+
|
Details | Diff | Splinter Review |
3.61 KB,
patch
|
jmaher
:
review+
|
Details | Diff | Splinter Review |
4.35 KB,
patch
|
wlach
:
review+
|
Details | Diff | Splinter Review |
12.95 KB,
application/zip
|
Details |
Per Joel's request on IRC, I'm hiding them for now.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Reporter | ||
Comment 4•10 years ago
|
||
Also, talos-other on e10s is still being seen as a non-e10s job, so I had to hide T(o) on 10.6 as well :(
Comment 5•10 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #4)
> Also, talos-other on e10s is still being seen as a non-e10s job, so I had to
> hide T(o) on 10.6 as well :(
It wasn't listed in the set of jobs in bug 1121003.
Comment 7•10 years ago
|
||
:billm, do you know why talos runs on osx 10.6 would be failing when we try to run in e10s mode? We can run talos with the same code in e10s mode for linux/windows, and mochitests for e10s run on 10.6.
Flags: needinfo?(wmccloskey)
I'm not sure I understand what's going on here. Is this about running Talos with multiple processes, or is it about using the message manager more in Talos code while still running in a single process?
Either way I'm afraid I can't think of any reason why we'd time out. It looks like Firefox isn't even starting up?
Flags: needinfo?(wmccloskey)
Comment 9•10 years ago
|
||
this is about running with browser.tabs.remote = true as we do for linux and windows. Right now this isn't working at all for talos on osx. I suspect we are starting up, but not loading the webpage or having issues printing info and quitting the browser.
Updated•10 years ago
|
Blocks: e10s-harness
tracking-e10s:
--- → +
Comment 10•10 years ago
|
||
Requesting re-triage, releng would like someone to look at why osx is having problems getting tests launched and running with e10s.
Updated•10 years ago
|
Updated•10 years ago
|
Assignee: nobody → davidp99
Comment 11•10 years ago
|
||
Joel,
In looking over the build log, gcc is missing from the build machine, causing the build of psutil to fail. jimm suggested that you might have an idea what happened there. Do you know why gcc isn't being found on the OS 10.6 machines and if its needed? Do we need to change the scripts to use llvm or something?
Flags: needinfo?(jmaher)
Comment 12•10 years ago
|
||
these are run on test slaves (not build machines) so we don't have compilers installed. In fact this psutil error is present on all flavors (look at passing linux machines).
We install psutil as a python module which happens to include some native library components. In appears this is and has been known for a while in bug 917346, and bug 893254.
Are you looking into why these don't run at all? I know :avih tried running Talos locally on osx 10.10 and it failed to run with using the --e10s switch (hung at the warmup window, just like in automation). Here are some instructions for running it locally:
https://wiki.mozilla.org/Buildbot/Talos/Running#Running_locally_-_Source_Code
Flags: needinfo?(jmaher)
Comment 14•10 years ago
|
||
This looks to be related to the teardown of nsAppService and friends -- the destruction process is very involved and has some mac-specific (#ifdef) behavior. An early look seems to suggest that the teardown process is confusing the nsAppService instance in the main process and the one in the content process -- the Quit procedure operates as a state machine transition (it is re-entrant and called multiple times) but the evolution of the states must be done for both main and content. The process involving Quit is conflating the states of the two instances, leading to a place where they don't make the expected transitions and are therefore never actually destroyed.
Flags: needinfo?(davidp99)
Comment 15•10 years ago
|
||
oh great, thanks for looking into this. Do feel free to ask any questions about talos!
Comment 16•10 years ago
|
||
David, any idea how we can fix this?
Either by modifying firefox, or talos, or both, or using prefs to overcome this, etc?
What would be a viable approach here, and who has the knowledge to handle it?
The fact is that we have talos mostly running on e10s firefox everywhere except OS X, and the lack of performance data on this platform might be hiding performance issues which no one is able to detect or track due to this.
Flags: needinfo?(davidp99)
Comment 17•10 years ago
|
||
We're going to try to get to this between m5 and m6.
Comment 18•10 years ago
|
||
Nominating for e10s triage as per jimm's comment. We'll address the milestone this week.
Flags: needinfo?(davidp99)
Assignee | ||
Updated•10 years ago
|
Assignee: davidp99 → mconley
Assignee | ||
Updated•10 years ago
|
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 33•10 years ago
|
||
Hey Avi,
This bug is milestoned for M7. We're about halfway through M6, and then we'll start to address the M7's.
Flags: needinfo?(mconley)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 220•10 years ago
|
||
Would it be possible for the robot to not post so frequently?
This is perma-fail right now (and always has been), and we know it. This bug was opened to handle the issue, but the frequent posts from the robot will make people un-CC themselves from this bug.
Reporter | ||
Comment 221•10 years ago
|
||
They aren't even stars for the correct bug. This bug is specifically about Talos e10s on 10.6, not the widespread timeouts we're hitting on the non-e10s jobs.
But yes, we can hide the jobs and stop starring them.
Flags: needinfo?(cbook)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 230•10 years ago
|
||
I have an OSX loaner and ran talos on it. Talos appears to work correctly except that the process doesn't fully terminate (despite the window closing).
Oddly enough a 'kill -9 <pid>', results in continuing the test and the test finishes (closes down itself and the process just fine).
could it be that we are closing the browser too fast?
Comment 231•10 years ago
|
||
hmm, waiting to close the browser doesn't necessarily help, in fact clicking the 'x' doesn't close the browser. is there something about an uninitialized profile that could be causing this behaviour on osx and e10s?
Assignee | ||
Comment 232•10 years ago
|
||
The browser not closing if you click the X - that can happen if the content process is hung. Does the content process list as Not Responding in the Activity Monitor when in that state?
Comment 233•10 years ago
|
||
in activity monitor, I don't see any indication of 'not responding' or other stuff. The cpu activity seems to be changing, so it is doing something.
Comment 234•10 years ago
|
||
on a non e10s run, I see a javascript error for resource:///modules/WebappManager.jsm, line 48 NS_ERROR_FAILURE...nsIObserverService.removeObserver
Comment 235•10 years ago
|
||
A stack of the busy content process main thread after trying to close the window would be useful.
Comment 236•10 years ago
|
||
any tips for how to do that?
Assignee | ||
Comment 237•10 years ago
|
||
Assuming you can get into the state reliably, open up OS X's Activity Manager, find the Nightly Web Content process, and click "Sample Process". In the window that comes up, copy and paste it to a text file, and attach to this bug. Thanks!
Comment 238•10 years ago
|
||
and we have a sample! Thanks for helping guide me through some of this.
Attachment #8592404 -
Flags: feedback?(mconley)
Assignee | ||
Comment 239•10 years ago
|
||
Looks like the process is garbage / cycle collecting.
What happens if you take a sample after a minute or so?
Comment 240•10 years ago
|
||
Attachment #8592406 -
Flags: feedback?(mconley)
Assignee | ||
Updated•10 years ago
|
Attachment #8592404 -
Attachment is obsolete: true
Attachment #8592404 -
Flags: feedback?(mconley)
Assignee | ||
Comment 241•10 years ago
|
||
Comment on attachment 8592406 [details]
talos + e10s + osx 10.6 + activity monitor + a few minutes idle time
So this is an interesting one. The content process appears to just be sitting there spinning the event loop. I suspect this is the one we need to solve.
Attachment #8592406 -
Flags: feedback?(mconley)
Assignee | ||
Comment 242•10 years ago
|
||
Can I assume that, since we're not running 10.6 Talos jobs anymore, this bug is WONTFIX?
Flags: needinfo?(jmaher)
Comment 243•10 years ago
|
||
maybe, I believe the same failure happens on osx 10.10. I need to get the tests scheduled- if you have time to work on it soon, I can get the tests scheduled.
Flags: needinfo?(jmaher)
Flags: needinfo?(cbook)
Assignee | ||
Comment 244•10 years ago
|
||
(In reply to Joel Maher (:jmaher) from comment #243)
> maybe, I believe the same failure happens on osx 10.10. I need to get the
> tests scheduled- if you have time to work on it soon, I can get the tests
> scheduled.
Yeah, we're just coming around to M7 now, which is where this bug is milestoned at. I can look at it on Monday.
Comment 245•10 years ago
|
||
As far as I understand/know, the title should probably be changed to "All talos e10s tests always hang on all OS X versions"
Joel, Is that correct?
Flags: needinfo?(jmaher)
Comment 246•10 years ago
|
||
that is correct
Flags: needinfo?(jmaher)
Summary: Frequent OSX 10.6 T-e10s "timed out after 3600 seconds of no output" → All talos on OSX fails with running in e10s mode
Assignee | ||
Comment 247•10 years ago
|
||
Ok, I can reproduce this hang on my local 10.6 machine. Investigating...
Assignee | ||
Comment 248•10 years ago
|
||
I think I've figured this out.
We have this getInfo.html page that loads before some (all?) tests are run in order to get the screen dimensions and browser dimensions, which it then dumps into a log that (I assume) Talos consumes to make decisions on how tests should operate under those conditions.
One of the last things that getInfo.html does is call goQuitApplication(); and window.close();. goQuitApplication() is defined in talos/page_load_test/quit.js. goQuitApplication() uses special privileges and calls nsIAppStartup::Quit passing in nsIAppStartup.eForceQuit.
That getInfo.html page is being loaded in the content process when the --e10s flag is set.
When the content process calls goQuitApplication(), the content process successfully shuts down, which effectively closes the window. On Windows and Linux, this closing of the last window results in the parent process also shutting down.
On OS X, however, a process still runs even if all of the windows are closed. You can reproduce this in Firefox today - just close all of your windows by pressing the little red button in the top-left corner on OS X. The Firefox process will keep running, and Firefox will be available in the dock. When you click on the dock icon, a new Firefox window opens.
So that's why this is happening - forceQuit is happening in the content process and not the parent, and the parent can keep running without any windows.
This also means that when talos tests do run (when I manually quit the getInfo.html process), that Talos is killing the parent process after each test because it hits the time out, since (again) the parent process is just hanging out since the quit didn't occur properly.
What I suggest is that we have getInfo.html (and any other content that tries to make use of quit) send a message up to the parent to cause the quit.
jmaher - where would be a good place to put a message handler for that quit message? Ideally it'd be in an add-on somewhere... is pageloader the right place to put that kind of general Talos utility for quitting?
Flags: needinfo?(jmaher)
Comment 249•10 years ago
|
||
great question and thanks for taking a look at this! Ok, so we have a slight difference in the OS. this is probably why when we actually run the tests after manually killing the process running getinfo.html we continue on.
pageloader is not present in all tests, for example tpaint, ts_paint, sessionrestore*, tresize, and media_tests.
we could consider adding pageloader to them, or creating a really small addon that handles quit functionality. Alternatively we could just have the addon for the getinfo as well if we don't want to affect the other test numbers.
Maybe we can chat sometime in the next 2 days about it since I will be in your home office. This might be easier to brainstorm that way.
Flags: needinfo?(jmaher)
Assignee | ||
Comment 250•10 years ago
|
||
So, having talked with jmaher in person, the plan at this time is to create a very small default add-on to be installed in the browser before any talos test that can add the listeners for a quit message from content.
It'll have to be very small and light in order to minimize noise in the talos results.
Assignee | ||
Comment 251•10 years ago
|
||
Assignee | ||
Comment 252•10 years ago
|
||
Comment on attachment 8602292 [details] [diff] [review]
[WIP] Add talos-powers default add-on to allow talos tests to request force quit. r=?
So this gets things moving for me, locally.
Joel, can you test this to see how much noise, if any, this would add to our talos numbers?
Attachment #8602292 -
Flags: feedback?(jmaher)
Comment 253•10 years ago
|
||
Comment on attachment 8602292 [details] [diff] [review]
[WIP] Add talos-powers default add-on to allow talos tests to request force quit. r=?
Review of attachment 8602292 [details] [diff] [review]:
-----------------------------------------------------------------
thanks for this:
a try push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=beb62e8bce0d
compare view:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=98d3519ab098&newProject=try&newRevision=beb62e8bce0d
^ here I looked at the stddev and specifically the % with it, then looked in details on anything that matches new_stdev% >= 2*old_stddev%
a few issues:
session restore no auto - win8 <- the graph view seems to have data not in any range
session restore* <- data is showing way out of the normal range.
code wise, this patch is good!
Attachment #8602292 -
Flags: feedback?(jmaher) → feedback+
Comment 254•10 years ago
|
||
in addition, this patch needs to add entries to talos/mtio-whitelist.json;
TEST-UNEXPECTED-FAIL | mainthreadio | File '{talos}-powers@mozilla.org\plugins' was accessed and we were not expecting it: {'Count': 1, 'Duration': 0.029184, 'RunCount': 1}
TEST-UNEXPECTED-FAIL | mainthreadio | File '{talos}-powers@mozilla.org' was accessed and we were not expecting it: {'Count': 3, 'Duration': 0.10636800000000002, 'RunCount': 3}
TEST-UNEXPECTED-FAIL : xperf: File '{profile}\extensions\talos-powers@mozilla.org\chrome\talos-powers-content.js' was accessed and we were not expecting it. DiskReadCount: 2, DiskWriteCount: 0, DiskReadBytes: 184, DiskWriteBytes: 0
TEST-UNEXPECTED-FAIL : xperf: File '{profile}\extensions\talos-powers@mozilla.org\chrome.manifest' was accessed and we were not expecting it. DiskReadCount: 2, DiskWriteCount: 0, DiskReadBytes: 248, DiskWriteBytes: 0
we just need to add these specific files to the mtio-whitelist.json file, nothing too crazy.
Assignee | ||
Comment 255•10 years ago
|
||
Assignee | ||
Updated•10 years ago
|
Attachment #8602292 -
Attachment is obsolete: true
Assignee | ||
Comment 256•10 years ago
|
||
Assignee | ||
Updated•10 years ago
|
Attachment #8602910 -
Attachment is obsolete: true
Assignee | ||
Comment 257•10 years ago
|
||
Comment on attachment 8602911 [details] [diff] [review]
[WIP] Add talos-powers default add-on to allow talos tests to request force quit. r=?
Can we try this instead to see if the Session Store test numbers settle? (I've tried to move all of the injection work to after the windows have been restored).
Also, instead of me handing this off to you every time (since we might iterate on this a bit), can you let me know how to test this?
Attachment #8602911 -
Flags: feedback?(jmaher)
Comment 258•10 years ago
|
||
awesome here, I will have some results for this in the next 30 minutes!
Comment 259•10 years ago
|
||
oh, the session restore stuff is awesome with this latest fix. Lets add the xperf bits and I can review the code one more time- then the landings will start.
Updated•10 years ago
|
Attachment #8602911 -
Flags: feedback?(jmaher) → feedback+
Assignee | ||
Comment 260•10 years ago
|
||
Assignee | ||
Updated•10 years ago
|
Attachment #8602911 -
Attachment is obsolete: true
Assignee | ||
Comment 261•10 years ago
|
||
Assignee | ||
Updated•10 years ago
|
Attachment #8602998 -
Attachment is obsolete: true
Assignee | ||
Updated•10 years ago
|
Attachment #8603000 -
Flags: review?(jmaher)
Comment 262•10 years ago
|
||
Comment on attachment 8603000 [details] [diff] [review]
Add talos-powers default add-on to allow talos tests to request force quit. r=?
Review of attachment 8603000 [details] [diff] [review]:
-----------------------------------------------------------------
thanks, this is great.
::: talos/talos-powers/components/TalosPowersService.js
@@ +9,5 @@
> + "resource://gre/modules/Services.jsm");
> +
> +const FRAME_SCRIPT = "chrome://talos-powers/content/talos-powers-content.js";
> +
> +function TalosPowersService() {}
I know ; are not required, all but 2 lines here have it- we should use it for consistency.
Attachment #8603000 -
Flags: review?(jmaher) → review+
Comment 263•10 years ago
|
||
please land on talos- I will get it landed on trunk, there is an android update I need to do as well.
Assignee | ||
Comment 264•10 years ago
|
||
Thanks, semi-colons added!
remote: https://hg.mozilla.org/build/talos/rev/ae8a6f232e7b
Do we close this now, or when we get this deployed?
Flags: needinfo?(jmaher)
Comment 265•10 years ago
|
||
bug 1162753 for deployment, try push for final testing.
Comment 266•10 years ago
|
||
hmm, this is frustrating- this latest patch is causing failures on the 's' group:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=672324392d7c
we appear to run svgx, tsvg_opacity, but when running 'cart', we fail to get going. I can look into this a bit, should be easy to reproduce.
Flags: needinfo?(jmaher)
Comment 267•10 years ago
|
||
on linux we appear to collect all the data, but we are not sending information up to the server. I wonder if the quit is causing us to not properly close the handle to our logging file 'browser_output.txt' which would make us experience a lack of data to post.
Assignee | ||
Comment 268•10 years ago
|
||
Hrm, that's frustrating. :( Let me know if you want me to dig at it too.
Assignee | ||
Comment 269•10 years ago
|
||
Hey jmaher - did you ever find out what was going on here?
Flags: needinfo?(jmaher)
Comment 270•10 years ago
|
||
I did not, nor did I get a chance to hack on it this weekend or Monday. Given that this only happens on the 's' job, it probably means that the tart/cart addon is blocking access to our output file browser_output.txt. I really need an hour or so to focus on this and might be able to figure it out. If you have a few cycles, I would appreciate any thoughts.
Flags: needinfo?(jmaher)
Comment 271•10 years ago
|
||
reproduced this locally, tart seems to not have data:
16:22:32 INFO - INFO : Could not find report in browser output: [('tsformat', ('__start_report', '__end_report')), ('tpformat', ('__start_tp_report', '__end_tp_report'))] [browser_output.txt]
will narrow down and verify on try.
Comment 272•10 years ago
|
||
switching about:newtab -> about:blank in tart.js on lines 527 and 539 allows this to work:
http://hg.mozilla.org/build/talos/file/29df8fbf1358/talos/page_load_test/tart/addon/content/tart.js#l526
for some reason the about:newtab is causing talos-powers to see a shutdown message.
Assignee | ||
Comment 273•10 years ago
|
||
So this is a sad story.
With my first patch, getInfo.html now gets these sweet powers to kill the whole browser process as soon as it loads.
That's great, except that about:newtab shows thumbnails of all of the most frecent visits. For the talos profile running the tart test, the only entry in that group is getInfo.html.
And so to gather the thumbnail, we open a remote browser, and load the content in order to capture the thumbnail.
Ta-da, whole browser goes down because the background thumbnail process just told the whole app to quit.
I have a workaround solution, but we should probably come up with something longer term, like removing getInfo.html from the session history.
Assignee | ||
Comment 274•10 years ago
|
||
Assignee | ||
Updated•10 years ago
|
Attachment #8603000 -
Attachment is obsolete: true
Assignee | ||
Comment 275•10 years ago
|
||
Comment on attachment 8604974 [details] [diff] [review]
Follow-up: Do not allow getInfo.html to quit parent application when loaded in thumbnailer process. r=?
We can either do this, or try to remove getInfo.html from the history - though I'm not sure exactly how best to do that, or if anything actually relies on that being in the history.
Attachment #8604974 -
Flags: review?(jmaher)
Comment 276•10 years ago
|
||
Comment on attachment 8604974 [details] [diff] [review]
Follow-up: Do not allow getInfo.html to quit parent application when loaded in thumbnailer process. r=?
Review of attachment 8604974 [details] [diff] [review]:
-----------------------------------------------------------------
ok, this is hacky, but I don't know a better form of hacky. We don't depend on the thumbnail at all, but I suspect it will change our tart/cart numbers. How about we put some comments in here so when thumbnails change in the future we can reduce our hair pulling time.
Attachment #8604974 -
Flags: review?(jmaher) → review+
Comment 277•10 years ago
|
||
(In reply to Joel Maher (:jmaher) from comment #276)
> ok, this is hacky, but I don't know a better form of hacky. We don't depend
> on the thumbnail at all, but I suspect it will change our tart/cart numbers.
> How about we put some comments in here so when thumbnails change in the
> future we can reduce our hair pulling time.
CART doesn't care about thumbnails, but TART does. Well, TART itself doesn't care, but we know that the numbers depend on whatever shows up at the newtab page because it can affect the animation.
Last time we've had a related discussion, it was when tiles made it into Firefox and IIRC they were eventually kept enabled such that TART measure newtab animation with tiles.
IIRC the behavior was such that due to the new profile, no page had high enough frecency to show up at the newtab page during the tests. So I wonder what changed that now it shows the getinfo page.
Tim, thoughts?
Assignee | ||
Comment 278•10 years ago
|
||
Landed follow-up:
remote: https://hg.mozilla.org/build/talos/rev/fbc3ae2a26e7
Assignee | ||
Updated•10 years ago
|
Attachment #8603000 -
Attachment is obsolete: false
Comment 279•10 years ago
|
||
oh, this was a fun one, but a good thing to find. 2 things of interest:
1) We had originally added talos-powers to the mainthreadio whitelist (which was needed). There is a problem with that in the script was looking for 'talos' and converting .*talos -> {talos}, so profile/extensions/talos-powers/... -> {talos}-powers/...; we don't use {talos} in mainthreadio, so I removed the talos substitutions.
2) etlparser looks for disk access in general (not just mainthread io counts). this is the same job and has a slightly different error message. We rarely have to modify this file, but today we got lucky and have a few things to modify.
3) and for a bonus we seem to try to access a /searchplugins folder in an extension subdir of the profile by default. That is added as well!
Attachment #8605899 -
Flags: review?(wlachance)
Comment 280•10 years ago
|
||
Comment on attachment 8605899 [details] [diff] [review]
add talos-powers to mainthreadio and etlparser manifests (1.0)
lgtm
Attachment #8605899 -
Flags: review?(wlachance) → review+
Comment 281•10 years ago
|
||
Assignee | ||
Comment 282•10 years ago
|
||
Hey Joel - with your patch in bug 1166906 landed, are we good to close this, or are we still blocked on something?
Flags: needinfo?(jmaher)
Comment 283•10 years ago
|
||
thanks for the reminder
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: needinfo?(jmaher)
Resolution: --- → FIXED
Reporter | ||
Updated•10 years ago
|
status-firefox38:
--- → wontfix
status-firefox38.0.5:
--- → wontfix
status-firefox39:
--- → wontfix
status-firefox40:
--- → wontfix
status-firefox41:
--- → fixed
status-firefox-esr31:
--- → unaffected
status-firefox-esr38:
--- → wontfix
Target Milestone: --- → mozilla41
Assignee | ||
Comment 284•9 years ago
|
||
bugnotes |
You need to log in
before you can comment on or make changes to this bug.
Description
•