The Windows Firefox UI updates were running for many hours

RESOLVED INVALID

Status

defect
RESOLVED INVALID
4 years ago
4 years ago

People

(Reporter: armenzg, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Some of the Windows jobs were running for more than 10 hours.
Bug 1173976 should help/resolve this.
See Also: → 1173976
It won't necessarily help if there is still this mysterious hang during the tests. That is what I would really like to understand. Usually marionette should kill the test run after 10 minutes or so at most. If I could get some logs from affected runs, I could have a look and maybe find something.
The logs are in places like http://ftp.mozilla.org/pub/mozilla.org/firefox/candidates/41.0b2-candidates/build1/logs/ (that's the beta from earlier this week), specifically the ones matching *_update_tests_*.txt.gz. I don't know whether those particular jobs include the slow Windows tests.
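To spot a hang in one of those logs without reading it end to end, something like the following might help. It is only a rough sketch: it assumes every interesting line starts with an HH:MM:SS timestamp, as in the "10:26:06     INFO -  ..." line quoted elsewhere in this bug, and the helper names find_gaps and find_gaps_in_file are made up for illustration. It reports any silent period longer than a threshold.

```python
import gzip
import re
from datetime import timedelta

# Assumed line shape, based on the single log line quoted in this bug:
# "10:26:06     INFO -  No symbols path given, can't process dump."
TIMESTAMP = re.compile(r"^(\d{2}):(\d{2}):(\d{2})\s")


def find_gaps(lines, threshold=timedelta(minutes=10)):
    """Yield (gap, previous_line, next_line) for silent periods >= threshold."""
    prev_time = None
    prev_line = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # skip lines without a leading timestamp
        hours, minutes, seconds = (int(part) for part in match.groups())
        current = timedelta(hours=hours, minutes=minutes, seconds=seconds)
        if prev_time is not None:
            gap = current - prev_time
            if gap < timedelta(0):
                # Timestamps carry no date, so assume a single midnight rollover.
                gap += timedelta(days=1)
            if gap >= threshold:
                yield gap, prev_line.rstrip("\n"), line.rstrip("\n")
        prev_time, prev_line = current, line


def find_gaps_in_file(path, threshold=timedelta(minutes=10)):
    """Hypothetical convenience wrapper for a local *.txt.gz log file."""
    with gzip.open(path, "rt", errors="replace") as handle:
        return list(find_gaps(handle, threshold))
```

Calling find_gaps_in_file on a downloaded log returns the line before and after each long silence, which should be close to where the run stopped making progress. A gap longer than 24 hours would be under-reported, since the timestamps have no date.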
So what I need is a log that clearly contains such a hang. Maybe one of you can watch out for one? Also, I thought the tests were disabled right now. Armen, is that no longer the case?
Flags: needinfo?(armenzg)
Component: General Automation → Release Automation
QA Contact: catlee → bhearsum
They might have been disabled last week.
Let's not look into this any further; we're moving to the testers, so the setup is different.
It might even be that a Windows prompt is being triggered. In general, the testers are better suited to running tests and have more changes in place to prevent weird hangs like this.
Status: NEW → RESOLVED
Closed: 4 years ago
Flags: needinfo?(armenzg)
Resolution: --- → INVALID
I am seeing a few dozen ui_update_verify_beta jobs hanging for >8 hours. I killed all but the one below:

builder: http://buildbot-master91.bb.releng.usw2.mozilla.com:8001/builders/release-mozilla-beta-linux64_ui_update_verify_beta_1%2F6/builds/1
slave: bld-linux64-spot-1076

From my naive perspective, I wouldn't be surprised if error lines like "10:26:06     INFO -  No symbols path given, can't process dump." have something to do with https://bugzil.la/1170212
My bad. I should have asked for a Windows machine.
I was not aware that the issue was also happening on Linux machines.

In general, ScriptFactory should have a script timeout by default [1].
After that, mozharness should kill the run, or even marionette.

Where the timeout of the script is set for marionette:
https://dxr.mozilla.org/mozilla-central/source/testing/marionette/client/marionette/runner/base.py#363

I'm going to remove this; however, when I enable it on testers, I should make sure we kill this.

[1] http://hg.mozilla.org/build/buildbotcustom/file/default/process/factory.py#l4726
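
For illustration only, the outermost layer of that idea (a wrapper that gives up on a script after a fixed time) can be sketched in a few lines of Python. This is not how ScriptFactory, mozharness, or marionette implement their timeouts; run_with_timeout is a hypothetical helper built on the standard library.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run command; return its exit code, or None if it was killed on timeout."""
    try:
        completed = subprocess.run(command, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child before raising. Grandchildren
        # (for example a browser started by the test harness) are NOT handled
        # here and would need process-group handling on top of this.
        return None
    return completed.returncode


if __name__ == "__main__":
    # Hypothetical usage: a script that sleeps far longer than the timeout.
    result = run_with_timeout(
        [sys.executable, "-c", "import time; time.sleep(60)"], 2
    )
    print("timed out" if result is None else "exit code %d" % result)
```

The caveat in the comment matters for hangs like the ones here: killing only the top-level script can leave the processes it spawned running, so each layer needs its own cleanup.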