Closed Bug 1565908 Opened 5 years ago Closed 5 years ago

Perma Android 8.0 Raptor-1proc(sp) [taskcluster:error] Task aborted - max run time exceeded

Categories

(Testing :: Raptor, defect, P1)

Version 3

Tracking

(firefox-esr60 unaffected, firefox-esr68 unaffected, firefox68 unaffected, firefox69+ fixed, firefox70+ verified)

VERIFIED FIXED
mozilla70

People

(Reporter: RaulG, Assigned: Gijs)

References

(Regression)

Details

(Keywords: regression)

Attachments

(1 file)

[Tracking Requested - why for this release]:

Central as Beta simulation: https://treeherder.mozilla.org/#/jobs?repo=try&resultStatus=testfailed%2Cbusted%2Cexception%2Cretry%2Cusercancel%2Crunnable&revision=575daa939fac669c717c34864beac8565f42828e&selectedJob=256431800

Log link: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=256431800&repo=try&lineNumber=1256

Log snippet:

13:22:46 INFO - adb shell_output: adb -s FA83T1A00347 wait-for-device shell am start -W -n org.mozilla.fennec_aurora/org.mozilla.gecko.BrowserApp -a android.intent.action.VIEW --es args "-profile /sdcard/raptor/profile --es env0 LOG_VERBOSE=1 --es env1 R_LOG_LEVEL=6 --es env2 MOZ_WEBRENDER=0" -d about:blank; echo adb_returncode=$?, timeout: None, root: False, timedout: None, exitcode: 0, output: Starting: Intent { act=android.intent.action.VIEW dat=about:blank cmp=org.mozilla.fennec_aurora/org.mozilla.gecko.BrowserApp (has extras) }
13:22:46 INFO - Status: ok
13:22:46 INFO - Activity: org.mozilla.fennec_aurora/org.mozilla.gecko.BrowserApp
13:22:46 INFO - ThisTime: 303
13:22:46 INFO - TotalTime: 303
13:22:46 INFO - WaitTime: 308
13:22:46 INFO - Complete
13:22:46 INFO - adb shell_output: adb -s FA83T1A00347 wait-for-device shell pidof org.mozilla.fennec_aurora; echo adb_returncode=$?, timeout: None, root: False, timedout: None, exitcode: 0, out[taskcluster:error] Aborting task...
[taskcluster 2019-07-14T13:36:20.616Z] === Task Finished ===
[taskcluster 2019-07-14T13:36:20.616Z] Task Duration: 15m0.000470815s
[taskcluster 2019-07-14T13:36:21.329Z] Uploading artifact public/logs/localconfig.json from file workspace/logs/localconfig.json with content encoding "gzip", mime type "application/json" and expiry 2019-07-28T11:37:00.944Z
[taskcluster 2019-07-14T13:36:21.899Z] Uploading artifact public/test_info/logcat-FA83T1A00347.log from file workspace/build/blobber_upload_dir/logcat-FA83T1A00347.log with content encoding "gzip", mime type "text/plain" and expiry 2019-07-28T11:37:00.944Z
[taskcluster 2019-07-14T13:36:22.594Z] Uploading redirect artifact public/logs/live.log to URL https://queue.taskcluster.net/v1/task/YJ_55bJCRJe7BNrxUymKtg/runs/0/artifacts/public/logs/live_backing.log with mime type "text/plain; charset=utf-8" and expiry 2019-07-28T11:37:00.944Z
[taskcluster:error] Task aborted - max run time exceeded

Flags: needinfo?(gmierz2)
Priority: -- → P1
Summary: Perma Tier 2 Android opt Raptor [taskcluster:error] Task aborted - max run time exceeded → Perma Android 8.0 Raptor-1proc(sp) [taskcluster:error] Task aborted - max run time exceeded

The patch from bug 1565644 only changed which projects/branches the tests can run on. Since this test (speedometer) was already running on the beta simulations before that patch (see before, and after), the patch could not have caused this issue. That patch did introduce one problem, enabling android-hw-p2-8-0-arm7-api-16/pgo on mozilla-beta, which is being fixed in bug 1566088, but that is unrelated to the speedometer issue in this bug.

Flags: needinfo?(gmierz2)

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression

(In reply to Sebastian Hengst [:aryx] (needinfo on intermittent or backout) from comment #4)

Bisection points to bug 1560178 as cause of this failure.

Last good: https://treeherder.mozilla.org/#/jobs?repo=try&revision=bca6fc67aab7b0d94f606c52cd26f6daae91f18d&selectedJob=256812559
First bad: https://treeherder.mozilla.org/#/jobs?repo=try&revision=4d4ee360c70f0525f8f42ca581b08858346e7688

Regression range: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=4629e855ea33d85f8adcc6e76264ac779b799bd1&tochange=2236a18eb0a91be02e72c22e370a6f5aa877f11e

Robert, can you take this while Henrik is away?

Off-hand, this makes little sense - the code I added only does something in the parent process iff e10s is enabled. That should never be true on fennec. (I'm assuming this is indeed fennec and not geckoview, given the repeated mentions of fennec in the build log?)

Looking at the log for these jobs, the task seems... confused... about its e10s state. Excerpts (all from the one task!):

[taskcluster 2019-07-13T14:01:36.898Z] Executing command 0: '/builds/taskcluster/script.py' bash './test-linux.sh' --cfg 'mozharness/configs/raptor/android_hw_config.py' --test=raptor-speedometer-fennec --app=fennec '--binary=org.mozilla.fennec_aurora' --is-release-build --disable-e10s --download-symbols ondemand --test=raptor-speedometer-fennec --app=fennec '--binary=org.mozilla.fennec_aurora' --is-release-build --disable-e10s
14:02:53     INFO -  raptor-main Info: raptor config: {'binary': 'org.mozilla.fennec_aurora', 'local_profile_dir': '/tmp/tmp2JPcRV.mozrunner', 'symbols_path': 'https://queue.taskcluster.net/v1/task/Fabpd7BaQmedacJO_qIBMw/artifacts/public/build/en-UsS/target.crashreporter-symbols.zip', 'memory_test': False, 'cpu_test': False, 'enable_control_server_wait': False, 'e10s': True, 'app': 'fennec', 'gecko_profile_entries': None, 'power_test': False, 'run_local': False, 'platform': 'linux', 'host': '127.0.0.1', 'is_release_build': True, 'intent': None, 'enable_webrender': False, 'activity': None, 'gecko_profile_interval': None, 'processor': 'x86_64', 'gecko_profile': False, 'obj_path': None}

So it looks to me like the test is accidentally running fennec with the browser.tabs.remote.autostart pref set to true, while expecting it not to run in e10s mode. That's unlikely to go well - the code from bug 1560178 isn't the only code that assumes that BrowserTabsRemoteAutoStart gives correct answers to "are we using e10s".

Does this only affect this particular build type (i.e. not the raptor tasks on other versions of Android)? Perhaps it's just misconfigured?
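To make the mismatch in the excerpts above concrete, here is a minimal sketch that compares a task command line against the logged raptor config. The helper name find_e10s_mismatch is hypothetical and not part of raptor or mozharness, and the sample strings are shortened copies of the log lines quoted above.

```python
import ast
import shlex


def find_e10s_mismatch(command_line, config_repr):
    """Return a description of the mismatch, or None if the two agree.

    Hypothetical helper for illustration only; not raptor code.
    """
    # Split the shell command the way a POSIX shell would, so quoted
    # arguments such as '--binary=...' become single tokens.
    args = shlex.split(command_line)
    # The raptor log prints the config as a Python dict literal.
    config = ast.literal_eval(config_repr)

    if "--disable-e10s" in args and config.get("e10s"):
        return "command line passes --disable-e10s but config has e10s=True"
    return None


# Shortened versions of the two log lines quoted in this comment.
command = (
    "bash './test-linux.sh' --test=raptor-speedometer-fennec --app=fennec "
    "'--binary=org.mozilla.fennec_aurora' --is-release-build --disable-e10s"
)
config_line = "{'binary': 'org.mozilla.fennec_aurora', 'e10s': True, 'app': 'fennec'}"

print(find_e10s_mismatch(command, config_line))
```

For the excerpts quoted here the helper reports a mismatch, which is the "confused" state described above.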

The ones without the 1proc annotation are using geckoview, and the failing ones are using fennec (which doesn't support e10s). Trypush with some checks in raptor code to hopefully avoid this going forward:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=7da79995b892f917a14fe82eae0e8948e9ac811a

Try was green, so I've just put up the patch.
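For readers wondering what such a check might look like, here is a rough sketch. It is not the patch that landed: the function names effective_e10s and build_prefs and the SINGLE_PROCESS_APPS set are assumptions made for illustration. Only the pref name browser.tabs.remote.autostart and the fact that fennec does not support e10s come from this bug.

```python
# Assumption for this sketch: fennec is the only app that cannot run e10s.
SINGLE_PROCESS_APPS = {"fennec"}


def effective_e10s(app, requested_e10s):
    """Return the e10s setting that should actually be used for this app."""
    if app in SINGLE_PROCESS_APPS:
        # Never allow e10s on an app that does not support it,
        # regardless of what the caller or a default value asked for.
        return False
    return bool(requested_e10s)


def build_prefs(app, requested_e10s):
    """Build the e10s-related pref from the normalised setting."""
    return {"browser.tabs.remote.autostart": effective_e10s(app, requested_e10s)}


print(build_prefs("fennec", True))
print(build_prefs("geckoview", True))
```

Deriving the pref from a single normalised value keeps the pref and the harness's idea of "are we using e10s" from drifting apart.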

Flags: needinfo?(rwood)
Assignee: nobody → gijskruitbosch+bugs
Status: NEW → ASSIGNED

Tracking this for 69 since we want to uplift the regressing patch there too.

Pushed by gijskruitbosch@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/615c03bbe52c
ensure we never try to run with e10s enabled on fennec, r=perftest-reviewers,sparky
Blocks: 1567566
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla70
See Also: → 1567848

While we could uplift this to ESR68 also, we don't run raptor on there anyway, so meh.

Has Regression Range: --- → yes