Intermittent talos "TalosError: timeout"

RESOLVED WORKSFORME

Status

defect
P3
normal
RESOLVED WORKSFORME
5 years ago
8 months ago

People

(Reporter: emorley, Assigned: rwood)

Tracking

(Blocks 1 bug, {intermittent-failure})

Trunk
x86_64
Linux

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [stockwell unknown])

Attachments

(1 attachment)

Reporter

Description

5 years ago
Ubuntu HW 12.04 mozilla-inbound talos g1 on 2014-08-12 00:06:30 PDT for push a208696de65f

slave: talos-linux32-ix-049

https://tbpl.mozilla.org/php/getParsedLog.php?id=45732074&tree=Mozilla-Inbound

{
00:21:51     INFO -  Cycle 1(6): loaded http://localhost/page_load_test/tp5n/mashable.com/mashable.com/index.html (next: http://localhost/page_load_test/tp5n/media.photobucket.com/media.photobucket.com/image/funny%20gif/findstuff22/Best%20Images/Funny/funny-gif1.jpg@o=1.html)
00:21:51     INFO -  RSS: Main: 164773888
00:21:51     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/mashable.com/ajax.googleapis.com/ajax/libs/jquery/1.4/jquery.min.js@ver=3.0.5, line 39: missing name after . operator
00:21:51     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/mashable.com/9.mshcdn.com/follow/packages/wp.js@1302035106, line 1: syntax error
00:21:51     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/mashable.com/9.mshcdn.com/wp-content/themes/v7/js/core.js@1297363131, line 107: jQuery is not defined
00:21:51     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/mashable.com/mashable.com/index.html, line 5: unterminated string literal
00:21:51     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/mashable.com/mashable.com/index.html, line 76: missing ; before statement
00:21:51     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/mashable.com/stats.wordpress.com/e-201116.js, line 1: syntax error
00:21:51     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/mashable.com/mashable.com/index.html, line 77: st_go is not defined
00:21:51     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/mashable.com/static.fmpub.net/site/mashable, line 48: unterminated string literal
00:21:51     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/mashable.com/platform.twitter.com/widgets/11apps.html, line 1: NS_ERROR_DOM_BAD_DOCUMENT_DOMAIN: Illegal document.domain value
00:21:51     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/mashable.com/platform.twitter.com/widgets/HBO%20Customers%20Will%20Soon%20Be%20Able%20to%20Watch%20on%20iOS%20&%20Android%20Devices&count=vertical.html, line 1: NS_ERROR_DOM_BAD_DOCUMENT_DOMAIN: Illegal document.domain value
00:21:51     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/mashable.com/platform.twitter.com/widgets/How%20Immigration%20Activists%20Are%20Fighting%20Deportation%20Policy%20With%20Social%20Media&count=vertical.html, line 1: NS_ERROR_DOM_BAD_DOCUMENT_DOMAIN: Illegal document.domain value
00:21:51     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/mashable.com/all.js, line 63: unterminated string literal
00:21:51     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/mashable.com/platform.twitter.com/widgets/Facebook%20Expands%20Safety%20&%20Security%20Tools&count=vertical.html, line 1: NS_ERROR_DOM_BAD_DOCUMENT_DOMAIN: Illegal document.domain value
00:21:51     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/mashable.com/platform.twitter.com/widgets/mixedreviews.html, line 1: NS_ERROR_DOM_BAD_DOCUMENT_DOMAIN: Illegal document.domain value
00:21:51     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/mashable.com/platform.twitter.com/widgets/CNN%20Launches%20Android%20App&count=vertical.html, line 1: NS_ERROR_DOM_BAD_DOCUMENT_DOMAIN: Illegal document.domain value
00:21:51     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/mashable.com/platform.twitter.com/widgets/Chiddy%20Bang%20Member%20To%20Attempt%20Guinness%20World%20Record%20for%20Longest%20Freestyle%20Rap%20at%20MTV%20OMAs&count=vertical.html, line 1: NS_ERROR_DOM_BAD_DOCUMENT_DOMAIN: Illegal document.domain value
00:21:51     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/mashable.com/platform.twitter.com/widgets/Evernote%20Android%20App%20Gets%20Major%20Upgrade,%20Overtakes%20iPhone%20Version&count=vertical.html, line 1: NS_ERROR_DOM_BAD_DOCUMENT_DOMAIN: Illegal document.domain value
00:21:51     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/mashable.com/platform.twitter.com/widgets/nero.html, line 1: NS_ERROR_DOM_BAD_DOCUMENT_DOMAIN: Illegal document.domain value
01:07:23     INFO -  JavaScript error: http://locaException in thread Thread-2:
01:07:23    ERROR -  Traceback (most recent call last):
01:07:23     INFO -    File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
01:07:23     INFO -      self.run()
01:07:23     INFO -    File "/usr/lib/python2.7/threading.py", line 504, in run
01:07:23     INFO -      self.__target(*self.__args, **self.__kwargs)
01:07:23     INFO -    File "/home/cltbld/talos-slave/test/build/venv/local/lib/python2.7/site-packages/mozprocess/processhandler.py", line 710, in _processOutput
01:07:23     INFO -      self.onTimeout()
01:07:23     INFO -    File "/home/cltbld/talos-slave/test/build/venv/local/lib/python2.7/site-packages/talos/TalosProcess.py", line 67, in onTimeout
01:07:23     INFO -      raise TalosError("timeout")
01:07:23     INFO -  TalosError: timeout
01:07:24     INFO -  INFO : Browser exited with error code: 9
01:07:29     INFO -  INFO : RSS: Main: 94461952
01:07:29     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/163.com/show.mediav.com/s@type=1&db=mediav&pub=118_2620_36413&cus=0_0_0_0_0&wh=360x100&btype=1&js=1.html, line 6: mvas_14576 is not defined
01:07:29     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/163.com/www.163.com/index.html, line 3370: missing ; before statement
01:07:29     INFO -  JavaScript error: http://localhost/page_load_test/tp5n/163.com/www.163.com/index.html, line 1090: AChange is not defined
01:07:29     INFO -  Cycle 1(1): loaded http://localhost/page_load_test/tp5n/163.com/www.163.com/index.html (next: http://localhost/page_load_test/tp5n/56.com/www.56.com/index.html)
}
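For context on the traceback above: the error is raised from an onTimeout callback that mozprocess's output-processing thread invokes. The sketch below illustrates the general watchdog pattern (kill the child when it stays silent for too long) using only the Python 3 standard library. It is not mozprocess's or Talos's code: OutputTimeout and run_with_output_timeout are hypothetical names, and the log does not show whether the failing run hit a no-output limit or a total-runtime limit.

```python
import queue
import subprocess
import sys
import threading


class OutputTimeout(Exception):
    """Hypothetical error: the child produced no output for too long."""


def run_with_output_timeout(cmd, output_timeout):
    """Run cmd, collecting its output; kill it if it is silent too long.

    Returns (returncode, list_of_output_lines). Raises OutputTimeout if
    no line arrives within output_timeout seconds.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    lines = queue.Queue()

    def pump():
        # Reader thread: forward each output line to the queue.
        for line in proc.stdout:
            lines.put(line)
        lines.put(None)  # sentinel: the output stream was closed

    threading.Thread(target=pump, daemon=True).start()

    collected = []
    while True:
        try:
            line = lines.get(timeout=output_timeout)
        except queue.Empty:
            # Nothing arrived in time: stop the child and report it.
            proc.kill()
            proc.wait()
            raise OutputTimeout("no output for %s seconds" % output_timeout)
        if line is None:
            break
        collected.append(line.rstrip("\n"))
    proc.wait()
    return proc.returncode, collected


if __name__ == "__main__":
    print(run_with_output_timeout([sys.executable, "-c", "print('hi')"], 10.0))
```

With a watchdog of this shape, a browser that hangs without printing anything is killed after the limit. That would be consistent with the 45-minute gap between the 00:21:51 and 01:07:23 log lines, although the actual limit used by the job is not stated in this bug.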
Comment hidden (Legacy TBPL/Treeherder Robot)
Reporter

Updated

5 years ago
Summary: Intermittent Linux talos g1 "TalosError: timeout" → Intermittent talos g1 "TalosError: timeout"
Observations: all but the first one and a single win7 in the middle were OSX 10.6

Also: the first one seems different (bad document.domain values; perhaps infra?)

and also all but the first and last were on 8/15....

This smells like infra to me...  or maybe something taking "longer" on 10.6 for some (external?) reason.

And I also don't understand why it got filed in WebRTC....
Flags: needinfo?(ryanvm)
Flags: needinfo?(ryanvm) → needinfo?(jmaher)
Reporter

Comment 16

5 years ago
(In reply to Randell Jesup [:jesup] from comment #15)
> And I also don't understand why it got filed in WebRTC....

The g1 job was requested in bug 964498, which is in WebRTC.
Oh, this is an interesting bug. We had this problem a year ago on OS X tp5o, and switching to mozprocess solved it. Now we have modified tp5o to do more stuff (scrolling) and we are seeing this again.

Looking at a few logs, we are not getting hung up on a specific page; rather, it appears to be a system resource/event issue or some other Firefox bug.

Now I question why this started on 8/12.  We updated talos on 8/8, so I would have expected to see this at least once on 8/9 or 8/10.  In fact it really started on 8/15 (the 12th might have been a different really random issue).

I recall another issue that showed up on talos + android on the 15th.  We have had these tests running since July 21, so 3+ weeks of solid runs, and then it fails.

We could hack on talos, or we could look at what changed?  Did some of our prefs change?  I know we have been doing a lot of non local work (prevent outside access).  Maybe some http cache stuff in the browser changed?

Avi, any thoughts at first glance here?
Ed, any thoughts on what might have landed around the 15th of August?
Flags: needinfo?(jmaher)
Flags: needinfo?(emorley)
Flags: needinfo?(avihpit)
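The check described in comment 57 (whether the timeouts cluster on one particular page) can be approximated by scanning each log for the last "Cycle N(M): loaded ... (next: ...)" line that precedes "TalosError: timeout". This is a minimal sketch that assumes the log format shown in the description. The function names are hypothetical and this is not part of Talos.

```python
import re
from collections import Counter

# Matches lines such as:
#   Cycle 1(6): loaded http://localhost/a.html (next: http://localhost/b.html)
LOADED_RE = re.compile(r"Cycle (\d+)\((\d+)\): loaded (\S+) \(next: (\S+)\)")


def last_page_before_timeout(log_text):
    """Return (last_loaded_url, next_url) seen before the first timeout.

    Returns None if the log has no 'TalosError: timeout' line, or if no
    page load was logged before it.
    """
    last = None
    for line in log_text.splitlines():
        if "TalosError: timeout" in line:
            return last
        match = LOADED_RE.search(line)
        if match:
            last = (match.group(3), match.group(4))
    return None


def tally_next_pages(logs):
    """Count, across several logs, which page was due to load at the timeout."""
    counts = Counter()
    for text in logs:
        found = last_page_before_timeout(text)
        if found is not None:
            counts[found[1]] += 1
    return counts
```

If the counts are spread across many different pages, that supports the reading that the hang is not tied to a specific page. A single dominant URL would point the other way.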
Hmm.. nothing which I can think of... sorry.
Flags: needinfo?(avihpit)
Reporter

Comment 59

5 years ago
(In reply to Joel Maher (:jmaher) from comment #57)
> We could hack on talos, or we could look at what changed?  Did some of our
> prefs change?  I know we have been doing a lot of non local work (prevent
> outside access).  Maybe some http cache stuff in the browser changed?
> 
> Ed, any thoughts on what might have landed around the 15th of August?

g1 was unhidden on OS X and Windows on the 15th (bug 1033332) because the issue for which they had been hidden (bug 1033323) was no longer occurring - until then it had only been visible on Linux.
Flags: needinfo?(emorley)
ok, then I question the effectiveness of running this on OSX 10.6.  That is only a useful platform for finding certain graphics related issues (bug 990490).

Avi, thoughts on turning tp5o_scroll off for osx 10.6?
Flags: needinfo?(avihpit)
(In reply to Ed Morley (Away 25th Aug) [:edmorley] from comment #59)
> g1 was unhidden on OS X and Windows on the 15th (bug 1033332) because the
> issue for which they had been hidden (bug 1033323) was no longer occurring -
> until then it had only been visible on Linux.

What does hidden mean? That it doesn't run? That it doesn't show as errors on TBPL? That it doesn't create graph server regression emails?

Also, bug 1033323 is "fix g1 on windows/osx" but I don't see any patch and yet it was resolved fixed. Do we know why it started working?

Also #2, since g1 is two new tests and I wrote both, I'd appreciate next time being CC'ed.

(In reply to Joel Maher (:jmaher) from comment #60)
> ok, then I question the effectiveness of running this on OSX 10.6.  That is
> only a useful platform for finding certain graphics related issues (bug
> 990490).
> 
> Avi, thoughts on turning tp5o_scroll off for osx 10.6?

Do we know what doesn't work on 10.6? Is it glterrain? tp5o_scroll? Is it intermittent? Do we know why it happens?

I wouldn't mind turning it off for 10.6, but how can we know it's not the same issue of bug 1033323? (which was "spontaneously fixed" AFAICT).
Flags: needinfo?(avihpit)
Ed or Joel,

Would you be able to describe the timeline and issues related to g1?

From what I understand/know now:

- It was deployed early July.

- Windows and OSX had permafailures (bug 1033323) so it was hidden, and only unhidden a week ago (bug 1033332).

- OS X 10.6 intermittent failures since it was unhidden (this bug).

- Linux intermittent errors (bug 1037619) since day 1 AFAICT - appears to be an automation issue where the slave doesn't get fully updated to the proper talos revision.


Is the above correct? How can we approach it? Since I wrote both tests of g1, I can try to look at them (as I would have if I had known the scope of the issues right when they landed).

I think we should first try to isolate whether the cause is tp5o_scroll or glterrain. I'd suspect tp5o_scroll, because glterrain is pretty straightforward and the scroll test is less so.

I say let's completely disable only tp5o_scroll for now, and see if these failures keep popping up.
(In reply to Avi Halachmi (:avih) from comment #62)
> Is the above correct?

Also, are there more issues related to g1/tp5o_scroll/glterrain? Have there been more issues?
I believe bug 1033323 was related to tests not working in general (media_tests had issues and we had naming issues in buildbot and tbpl).  I suspect these issues were fixed long before 8/15.

We had hidden this specific suite; hidden means it was running, just not visible on TBPL or managed by the sheriffs.

The issue appears to be that the browser hangs on tp5o_scroll. I want to be realistic here: given the usefulness of testing on 10.6, glterrain seems more valuable than tp5o_scroll, and we should look at disabling tp5o_scroll.

Regarding the talos 'g1' job, glterrain and tp5o_scroll, I don't know of other issues, although there might be some errors out there that show up once in a rare while (e.g. once a month).
So right now, g1 (glterrain and tp5o_scroll) is running on all platforms, and the only known issues are:

1. This bug - intermittent timeout on 10.6 for tp5o_scroll.

2. Bug 1037619 - intermittent "no definition for glterrain" on a few specific Linux machines, which we think is an automation issue where talos doesn't get fully updated for the test.

Correct?

If these are the only known issues, I think it's OK to disable tp5o_scroll on OS X 10.6.

Regardless, bug 1037619 is still worrying, because it might also manifest elsewhere: only with g1 does it end in such explicit failures, while elsewhere it "just" results in stale talos copies, which is probably pretty bad.
jmaher confirmed items 1 and 2 of comment 65 (on IRC). Let's disable tp5o_scroll on OS X 10.6.
Reporter

Comment 68

5 years ago
(In reply to Avi Halachmi (:avih) from comment #61)
> Also #2, since g1 is two new tests and I wrote both, I'd appreciate next
> time being CC'ed.

This bug and bug 1033323 were all marked blocking bug 964498, which is the bug that added support for/scheduled the g1 tests (it helpfully doesn't state that in the summary, but that's what we tracked it back to at the time), and are also filed in the same component as that bug. The assignee of that bug was CCed here - I'm guessing you weren't CCed on that bug (or watching this component) and so didn't get the "new blocking bug added" notifications? Other than that, it's very hard to know who to CC, given the sheriffs aren't normally told about the who/what/where of a new job before it's switched on permafailing... :-)
Reporter

Comment 69

5 years ago
(In reply to Avi Halachmi (:avih) from comment #67)
> jmaher confirmed 1,2 of comment 65 (on IRC). Let's disable tp5o_scroll on OS
> X 10.6 .

sgtm :-)
Following comment 60 and comment 67 (three months ago), we should have disabled this on 10.6, but this test still runs (and occasionally fails) on 10.6 ...

Joel?
Flags: needinfo?(jmaher)
The root issue here is that we have Python library conflicts related to how mozharness sets up and runs Python. I have done some work to fix this; now I need to make it backward compatible (so we can run talos on b2gXX and esrXX branches).

So we can disable this on 10.6; in fact, we probably should. What other tests should we disable on 10.6? It is difficult to just disable this on 10.6 with the current way jobs are defined, so for whatever other tests we want to disable on 10.6, I can do all this work once and be done with it.
Flags: needinfo?(jmaher)
I don't have any tests I _want_ to disable on 10.6.

This specific test was failing, you suggested disabling it on 10.6, and we agreed some months ago that we should, but nothing happened.

This was just a friendly ping ;)
fx-team guys*, while running the talos webgl** test, it sometimes errors with the following:

> JavaScript error: resource:///modules/CustomizableUI.jsm, line 1570: TypeError: aWindowPalette is undefined

Looking at the last few failures, it doesn't seem to be limited to a single platform (has win8 64, osx 10.6, 10.10, linux 64...)

Since the failure is intermittent, maybe there's some race condition? Would anyone be able to look into that?


* I added mconley since I know he worked on Australis customize some time ago.

** The webgl test is basically running this page and collecting the results***: http://hg.mozilla.org/build/talos/raw-file/tip/talos/page_load_test/webgl/benchmarks/terrain/perftest.html

*** when it executes in talos - it doesn't use alert to show the results.
Flags: needinfo?(gijskruitbosch+bugs)
Note that there are also occasional timeouts (possibly caused by execution stopping due to detected syntax errors, or just by an exception, e.g. due to the customize JS error), or other warnings, such as:

> JavaScript warning: chrome://browser/content/nsContextMenu.js, line 1523: SyntaxError: unreachable expression after semicolon-less return statement

This has been going on for more than 6 months now, but the CustomizableUI.jsm error apparently started about 6 weeks ago at comment 132 (2015-03-05).

Joel, some of the errors appear to be graph server unreachable (e.g. also on comment 132) and sometimes just "timeout".

Do you assess that this is a talos framework issue? If not, any idea what might be causing this? Are there other tests which started failing around the same time with a similar failure rate?
Flags: needinfo?(jmaher)

Comment 140

4 years ago
That error is probably unrelated and tracked in bug 1124217.
Flags: needinfo?(gijskruitbosch+bugs)
Depends on: 1154225
Depends on: 1124217
I have no idea why the graph server is randomly not working.  Talos doesn't update often, and if there was an issue it would be consistent.  This looks to be related to graph server causing problems.

There are a few issues here:
1) general timeout (not specific to g1)
2) graph server not finding test names (usually a11yr, but that is the first in the list, tp5o has been problematic a bit as well)
3) other?

Just like other unittests have intermittents, some of these errors are unrelated to talos or code. I suspect early February was an issue with talos or Firefox causing a timeout; the early March stuff was the new OS X 10.10 machines (that was a DNS issue). The more recent stuff with graph server not finding the test name is frustrating.

The consequence here is that we are not getting data for the entire set of tests in the suite for the given revision. Talos already does retries, etc. The last few messages are Windows 8 specific; I am not sure why that has been problematic.

Honestly, I am inclined to ignore this until we see a larger scale pattern.
Flags: needinfo?(jmaher)
So are these intermittent failures unique to "g1"? Are there any intermittent failures which are unique to "g1"? Are there maybe other talos intermittent failures with similar symptoms to some of the posts here (e.g. graph server issues) which are not on "g1" and therefore not flagged as instances of this bug?

For instance, the intermittent errors of CustomizableUI.jsm, line 1570 appear to not be limited to any specific talos test (according to bug 1124217 comment 0), and yet at least some of them are flagged as instances of this bug.

Just from the few issues which are mentioned at comment 138 and 139, we appear to have the following issues:

1. timeouts with (yet) unknown reason.
2. unreachable graph server (comment 132).
3. CustomizableUI.jsm, line 1570: TypeError
4. nsContextMenu.js, line 1523: SyntaxError (hopefully fixed at bug 1154225)

I can't tell if they're unique to g1 or not since this bug only has instances of them which happened with g1.

Maybe we should flag the issues differently, so we'll have better aggregation of failures by symptoms, which would hopefully improve our chances of figuring each of them out?
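Aggregating by symptom, as suggested here, could start from something as simple as tagging each failure log with the symptoms it contains and counting across logs. This is a minimal sketch under stated assumptions: the labels and function names are hypothetical, the CustomizableUI.jsm, nsContextMenu.js and "TalosError: timeout" patterns are taken from messages quoted in this bug, and the graph server pattern is a guess because the actual wording of that error is not quoted here.

```python
import re
from collections import Counter

# (label, pattern) pairs; the labels are hypothetical names for the
# four symptoms listed above.
SYMPTOMS = [
    ("customizableui-typeerror",
     re.compile(r"CustomizableUI\.jsm, line \d+: TypeError")),
    ("nscontextmenu-syntaxerror",
     re.compile(r"nsContextMenu\.js, line \d+: SyntaxError")),
    # Assumption: the real graph server error text is not quoted in this bug.
    ("graph-server", re.compile(r"graph server", re.IGNORECASE)),
    ("timeout", re.compile(r"TalosError: timeout")),
]


def classify(log_text):
    """Return the set of symptom labels found anywhere in one log."""
    return {label for label, pattern in SYMPTOMS if pattern.search(log_text)}


def aggregate(logs):
    """Count how many logs show each symptom; unmatched logs are counted too."""
    counts = Counter()
    for text in logs:
        counts.update(classify(text) or {"unclassified"})
    return counts
```

A log can carry more than one label (for example a JS error followed by a timeout), so the counts show how often symptoms co-occur rather than forcing each failure into a single bucket.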
The last 5 errors are from the other suite, not g1. The sheriffs find a similar bug match and file a bug. I do not know why you are interested in breaking this down specifically for g1. We have tons of warnings/errors from the browser that show up in the console; if there is a pattern that is actionable, we usually take action.

For example, OS X 10.6 and 10.10 have been failing horribly since April 8th; that is something that is reproducible and actionable :)
(In reply to Joel Maher (:jmaher) from comment #143)
> ... I do not know why you are interested in breaking
> this down for g1 specific

Because I wrote both tests of the g1 suite, and if the issue is with one of the g1 tests, then I want to fix it, but if the issue is unrelated to g1, then we should change this bug's title and I'll know that it's a generic issue.
And in general, better classification of symptoms (e.g. those I described at comment 142) is likely to be much more useful towards finding a solution compared to when the issues are classified as g1 problems.
Summary: Intermittent talos g1 "TalosError: timeout" → Intermittent talos "TalosError: timeout"
backlog: --- → webRTC+
Rank: 38
Priority: -- → P3