Closed
Bug 1433270
Opened 7 years ago
Closed 7 years ago
Intermittent tp6_google | timeout
Categories
(Testing :: Talos, defect, P5)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: intermittent-bug-filer, Unassigned)
References
Details
(Keywords: intermittent-failure, Whiteboard: [stockwell unknown])
Comment 2•7 years ago
There have been 39 failures in the last 7 days.
This bug was filed on January 24th.
It fails mostly on Linux x64 (opt and pgo), with a few exceptions on linux64-qr (2) and OS X 10.10.
Here is a relevant log file and a snippet with the failure:
https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=158483519&lineNumber=1849
09:51:05 INFO - PID 16806 | 17255
09:51:05 INFO - PID 16806 | ExceptionHandler::SendContinueSignalToChild sent continue signal to child
09:51:05 INFO - TEST-UNEXPECTED-ERROR | tp6_google | timeout
09:51:05 ERROR - Traceback (most recent call last):
09:51:05 INFO - File "/builds/slave/test/build/tests/talos/talos/run_tests.py", line 289, in run_tests
09:51:05 INFO - talos_results.add(mytest.runTest(browser_config, test))
09:51:05 INFO - File "/builds/slave/test/build/tests/talos/talos/ttest.py", line 62, in runTest
09:51:05 INFO - return self._runTest(browser_config, test_config, setup)
09:51:05 INFO - File "/builds/slave/test/build/tests/talos/talos/ttest.py", line 214, in _runTest
09:51:05 INFO - debugger_args=browser_config['debugger_args']
09:51:05 INFO - File "/builds/slave/test/build/tests/talos/talos/talos_process.py", line 139, in run_browser
09:51:05 INFO - raise TalosError("timeout")
09:51:05 INFO - TalosError: timeout
09:51:05 INFO - TEST-INFO took 3608353ms
09:51:05 INFO - SUITE-END | took 3608s
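For triaging logs like the one above, a minimal sketch of a scanner might look as follows. It only assumes the line shapes visible in this snippet (`TEST-UNEXPECTED-ERROR | <test> | <message>` and `TEST-INFO took <n>ms`); the names `summarize_log`, `UNEXPECTED`, and `DURATION` are hypothetical and are not part of Talos or Treeherder.

```python
import re

# Matches lines shaped like the one quoted in the log above:
#   "09:51:05 INFO - TEST-UNEXPECTED-ERROR | tp6_google | timeout"
UNEXPECTED = re.compile(
    r"TEST-UNEXPECTED-(?P<status>\w+) \| (?P<test>[^|]+?) \| (?P<message>.+)$"
)
# Matches "TEST-INFO took 3608353ms"
DURATION = re.compile(r"TEST-INFO took (?P<ms>\d+)ms")


def summarize_log(lines):
    """Collect unexpected results and the reported run time from log lines.

    Hypothetical helper for illustration; not part of the Talos harness.
    """
    failures = []
    duration_s = None
    for line in lines:
        m = UNEXPECTED.search(line)
        if m:
            failures.append(
                (m["status"], m["test"].strip(), m["message"].strip())
            )
            continue
        m = DURATION.search(line)
        if m:
            # The log reports milliseconds; convert to seconds.
            duration_s = int(m["ms"]) / 1000.0
    return failures, duration_s


if __name__ == "__main__":
    sample = [
        "09:51:05 INFO - TEST-UNEXPECTED-ERROR | tp6_google | timeout",
        "09:51:05 INFO - TalosError: timeout",
        "09:51:05 INFO - TEST-INFO took 3608353ms",
    ]
    print(summarize_log(sample))
```

On the sample lines, the reported run time works out to roughly 3608 seconds, i.e. the test ran for about an hour before the harness raised `TalosError("timeout")`.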
:rwood, could you please take a look?
Flags: needinfo?(rwood)
Whiteboard: [stockwell needswork]
Comment 8•7 years ago
:rwood, unfortunately the new hardware didn't reduce this failure rate. It might be worth investigating in the short term.
Comment 11•7 years ago
I really have *no idea* what is causing this. It looks like tp6_google gets through several tp page cycles successfully but then, out of the blue, times out (there's nothing obvious in the logs). I'll see if I can reproduce this locally, and if not I'll get a loaner and try it there.
Comment 15•7 years ago
In the last 7 days we have 60 failures.
They occur mostly on windows10-64 (opt, pgo), Windows 7 (opt, pgo), OS X 10.10 (opt), Linux x64 (opt, pgo).
Failure log: https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-central&job_id=161508309&lineNumber=2459
04:23:47 INFO - PROCESS-CRASH | tp6_google | application crashed [unknown top frame]
04:23:47 INFO - Crash dump filename: c:\users\cltbld\appdata\local\temp\tmpfvxdld\profile\minidumps\bcef21a3-6ebd-4ee0-9885-f6858add5c38.dmp
04:23:47 INFO - stderr from minidump_stackwalk:
04:23:47 INFO - 2018-02-10 04:23:47: minidump.cc:4359: INFO: Minidump opened minidump c:\users\cltbld\appdata\local\temp\tmpfvxdld\profile\minidumps\bcef21a3-6ebd-4ee0-9885-f6858add5c38.dmp
04:23:47 INFO - 2018-02-10 04:23:47: minidump.cc:4808: ERROR: ReadBytes: read 0/32
04:23:47 INFO - 2018-02-10 04:23:47: minidump.cc:4453: ERROR: Minidump cannot read header
04:23:47 INFO - 2018-02-10 04:23:47: stackwalk.cc:133: ERROR: Minidump c:\users\cltbld\appdata\local\temp\tmpfvxdld\profile\minidumps\bcef21a3-6ebd-4ee0-9885-f6858add5c38.dmp could not be read
04:23:47 INFO - 2018-02-10 04:23:47: minidump.cc:4331: INFO: Minidump closing minidump
04:23:47 INFO - minidump_stackwalk exited with return code 1
04:23:47 INFO - TEST-UNEXPECTED-ERROR | tp6_google | Found crashes after test run, terminating test
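The `ReadBytes: read 0/32` and `Minidump cannot read header` lines suggest that the dump file was empty or truncated, since a minidump starts with a 32-byte header whose first four bytes are the signature `MDMP`. A minimal sketch of a pre-check that could be run on a `.dmp` file before handing it to `minidump_stackwalk` is shown below. The function name `inspect_minidump` is hypothetical, and the sketch assumes a little-endian dump; it is not how Talos or Breakpad validates dumps.

```python
import os
import struct

MINIDUMP_SIGNATURE = b"MDMP"  # first four bytes of a well-formed minidump
MINIDUMP_HEADER_SIZE = 32     # matches the "read 0/32" in the log above


def inspect_minidump(path):
    """Return a short description of whether the dump header looks readable.

    Hypothetical helper for illustration; assumes a little-endian dump.
    """
    size = os.path.getsize(path)
    if size < MINIDUMP_HEADER_SIZE:
        return f"truncated: {size} bytes, need at least {MINIDUMP_HEADER_SIZE}"
    with open(path, "rb") as f:
        header = f.read(MINIDUMP_HEADER_SIZE)
    if header[:4] != MINIDUMP_SIGNATURE:
        return "bad signature"
    # Two uint32 fields follow the signature: version, then stream count.
    _version, stream_count = struct.unpack_from("<II", header, 4)
    return f"ok: {stream_count} streams"
```

A zero-byte file would be reported as truncated, which is consistent with (though not proven by) the errors in this log.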
Comment 17•7 years ago
Most likely a duplicate of bug 1378002; hopefully it will be resolved once we upgrade to new machines and a fresh OS.
Comment 19•7 years ago
Over the last 7 days this bug has 30 failures. These happen on Linux x64, linux64-nightly, OS X 10.10, Windows 7 and windows10-64.
Here is the most relevant log example: https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=163249283&lineNumber=1937
Here is a relevant part from that log:
10:25:04 INFO - PROCESS-CRASH | tp6_google | application crashed [unknown top frame]
10:25:04 INFO - Crash dump filename: c:\users\cltbld\appdata\local\temp\tmp85ghrn\profile\minidumps\cd2a3f5d-b99c-4e74-9b5f-497955b96277.dmp
10:25:04 INFO - stderr from minidump_stackwalk:
10:25:04 INFO - 2018-02-20 10:25:04: minidump.cc:4359: INFO: Minidump opened minidump c:\users\cltbld\appdata\local\temp\tmp85ghrn\profile\minidumps\cd2a3f5d-b99c-4e74-9b5f-497955b96277.dmp
10:25:04 INFO - 2018-02-20 10:25:04: minidump.cc:4808: ERROR: ReadBytes: read 0/32
10:25:04 INFO - 2018-02-20 10:25:04: minidump.cc:4453: ERROR: Minidump cannot read header
10:25:04 INFO - 2018-02-20 10:25:04: stackwalk.cc:133: ERROR: Minidump c:\users\cltbld\appdata\local\temp\tmp85ghrn\profile\minidumps\cd2a3f5d-b99c-4e74-9b5f-497955b96277.dmp could not be read
10:25:04 INFO - 2018-02-20 10:25:04: minidump.cc:4331: INFO: Minidump closing minidump
10:25:04 INFO - minidump_stackwalk exited with return code 1
10:25:04 INFO - TEST-UNEXPECTED-ERROR | tp6_google | Found crashes after test run, terminating test
10:25:04 ERROR - Traceback (most recent call last):
10:25:04 INFO - File "C:\slave\test\build\tests\talos\talos\run_tests.py", line 299, in run_tests
10:25:04 INFO - talos_results.add(mytest.runTest(browser_config, test))
10:25:04 INFO - File "C:\slave\test\build\tests\talos\talos\ttest.py", line 62, in runTest
10:25:04 INFO - return self._runTest(browser_config, test_config, setup)
10:25:04 INFO - File "C:\slave\test\build\tests\talos\talos\ttest.py", line 209, in _runTest
10:25:04 INFO - test_config['name'])
10:25:04 INFO - File "C:\slave\test\build\tests\talos\talos\ttest.py", line 46, in check_for_crashes
10:25:04 INFO - raise TalosCrash('Found crashes after test run, terminating test')
10:25:04 INFO - TalosCrash: Found crashes after test run, terminating test
10:25:04 INFO - TEST-INFO took 3633885ms
10:25:04 INFO - SUITE-END | took 3633s
WARNING | IO Completion Port failed to signal process shutdown
Parent process 5820 exited with children alive:
PIDS: 7040, 7148
Attempting to kill them, but no guarantee of success
10:28:09 ERROR - Return code: 2
10:28:09 WARNING - setting return code to 2
10:28:09 ERROR - # TBPL FAILURE #
Comment 20•7 years ago
:rwood, as this is not just Windows, we should look into this. Possibly this is Google-specific and we stop running that test, or we determine that it is toolchain-related. Maybe some investigation in the short term.
Comment 23•7 years ago
:rwood, do you have any updates on this bug?
Comment 24•7 years ago
I believe a lot of these timeouts are the same tp6 issue as in Bug 1439979.
I have a couple of things running on try now to get more info / try to fix this.
https://bugzilla.mozilla.org/show_bug.cgi?id=1439979#c10
https://bugzilla.mozilla.org/show_bug.cgi?id=1439979#c11
Flags: needinfo?(rwood)
Comment 27•7 years ago
There have been 3 instances in the last 30 days; I think this bug is mostly fixed by the migration to the new hardware and Taskcluster worker.
Whiteboard: [stockwell disable-recommended] → [stockwell unknown]
Comment 29•7 years ago
Looks good - there have been zero instances of this since April 10th.
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(rwood)
Resolution: --- → WORKSFORME