Closed Bug 1784176 Opened 2 years ago Closed 1 year ago

Perma [Tier 2] Android AArch64 raptor-browsertime Critical: Browsertime process timed out after waiting 120 seconds for output

Categories

(Testing :: Raptor, defect, P5)

defect

Tracking

(firefox109 fixed, firefox110 fixed)

RESOLVED FIXED
110 Branch
Tracking Status
firefox109 --- fixed
firefox110 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: sparky)

References

Details

(Keywords: intermittent-failure, perf-alert, Whiteboard: [stockwell unknown])

Attachments

(2 files)

Filed by: apavel [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=386965722&repo=mozilla-beta
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/RTl-6tgPSey73GlESWBN4g/runs/0/artifacts/public/logs/live_backing.log


[task 2022-08-10T22:54:22.522Z] 22:51:28     INFO -  raptor-browsertime Info: Run tests on SM-A515F [R58R25003ZK] using Android version 11
[task 2022-08-10T22:54:22.522Z] 22:51:28     INFO -  raptor-browsertime Info: Running tests using Firefox - 15 iteration(s)
[task 2022-08-10T22:54:22.522Z] 22:51:28     INFO -  raptor-browsertime Info: Skip setting default preferences for Firefox
[task 2022-08-10T22:54:22.522Z] 22:52:08     INFO -  raptor-browsertime Info: Wed, 10 Aug 2022 22:52:08 GMT adb:command Send '001Ahost:transport:R58R25003ZK'
[task 2022-08-10T22:54:22.522Z] 22:52:08     INFO -  raptor-browsertime Info: Wed, 10 Aug 2022 22:52:08 GMT adb:command Send '0040shell:dumpsys battery | grep temperature | grep -Eo '[0-9]{1,3}''
[task 2022-08-10T22:54:22.522Z] 22:52:08     INFO -  raptor-browsertime Info: Starting a browsertime pageload
[task 2022-08-10T22:54:22.522Z] 22:52:08     INFO -  raptor-browsertime Info: Waiting for 1000 ms (post_startup_delay)
[task 2022-08-10T22:54:22.522Z] 22:52:09     INFO -  raptor-browsertime Info: Navigating to about:blank, count: 0
[task 2022-08-10T22:54:22.522Z] 22:52:09     INFO -  raptor-browsertime Info: Navigating to url about:blank iteration 1
[task 2022-08-10T22:54:22.522Z] 22:52:14     INFO -  raptor-browsertime Info: Navigating to primary url:https://www.instagram.com
[task 2022-08-10T22:54:22.522Z] 22:52:14     INFO -  raptor-browsertime Info: Cycle 0, waiting for 1000 ms
[task 2022-08-10T22:54:22.522Z] 22:52:15     INFO -  raptor-browsertime Info: Cycle 0, starting the measure
[task 2022-08-10T22:54:22.522Z] 22:52:15     INFO -  raptor-browsertime Info: Testing url https://www.instagram.com iteration 1
[task 2022-08-10T22:54:22.522Z] 22:52:16     INFO -  raptor-browsertime Info: Wed, 10 Aug 2022 22:52:16 GMT adb:command Send '001Ahost:transport:R58R25003ZK'
[task 2022-08-10T22:54:22.522Z] 22:52:16     INFO -  raptor-browsertime Info: Wed, 10 Aug 2022 22:52:16 GMT adb:command Send '003Dshell:screenrecord --bit-rate 8000000 /sdcard/browsertime.mp4'
[task 2022-08-10T22:54:22.522Z] 22:54:16 CRITICAL -  raptor-browsertime Critical: Browsertime process timed out after waiting 120 seconds for output
[task 2022-08-10T22:54:22.522Z] 22:54:16     INFO -  raptor-browsertime-android Info: removing reverse socket connections
[task 2022-08-10T22:54:22.522Z] 22:54:16     INFO -  adb command_output: adb -s R58R25003ZK wait-for-device reverse --remove-all, timeout: None, timedout: None, exitcode: 0, output:
[task 2022-08-10T22:54:22.522Z] 22:54:16     INFO -  adb shell_bool: adb -s R58R25003ZK wait-for-device shell su -c 'test -d /sdcard/Android/data/org.mozilla.geckoview_example/files/test_root/org.mozilla.geckoview_example-geckodriver-profile/minidumps', timeout: None, timedout: None, exitcode: 0, output:
[task 2022-08-10T22:54:22.522Z] 22:54:16     INFO -  adb shell_output: adb -s R58R25003ZK wait-for-device shell su -c sync, timeout: None, timedout: None, exitcode: 0, output:
[task 2022-08-10T22:54:22.522Z] 22:54:16     INFO -  adb shell_bool: adb -s R58R25003ZK wait-for-device shell su -c 'test -d /sdcard/Android/data/org.mozilla.geckoview_example/files/test_root/org.mozilla.geckoview_example-geckodriver-profile/minidumps', timeout: None, timedout: None, exitcode: 0, output:
[task 2022-08-10T22:54:22.522Z] 22:54:16     INFO -  adb chmod: path=/sdcard/Android/data/org.mozilla.geckoview_example/files/test_root/org.mozilla.geckoview_example-geckodriver-profile/minidumps, recursive=True, mask=777
[task 2022-08-10T22:54:22.522Z] 22:54:16     INFO -  adb Ignoring attempt to chmod external storage
[task 2022-08-10T22:54:22.522Z] 22:54:16     INFO -  adb command_output: adb -s R58R25003ZK wait-for-device pull /sdcard/Android/data/org.mozilla.geckoview_example/files/test_root/org.mozilla.geckoview_example-geckodriver-profile/minidumps /tmp/tmpndyg5yjo/minidumps, timeout: None, timedout: None, exitcode: 0, output: /sdcard/Android/data/org.mozilla.geckoview_example/files/test_root/org.mozilla.geckoview_example-geckodriver-profile/minidumps/: 0 files pulled, 0 skipped.
[task 2022-08-10T22:54:22.522Z] 22:54:16     INFO -  raptor-mitmproxy Info: Mitmproxy stop!!
[task 2022-08-10T22:54:22.522Z] 22:54:16     INFO -  raptor-mitmproxy Info: Stopping mitmproxy playback, killing process 1472
[task 2022-08-10T22:54:22.522Z] 22:54:17     INFO -  raptor-mitmproxy Info: Successfully killed the mitmproxy playback process
[task 2022-08-10T22:54:22.522Z] 22:54:17     INFO -  raptor-perftest Info: Removing temporary directory: /tmp/tmpm9a1fm83
[task 2022-08-10T22:54:22.522Z] 22:54:17    ERROR -  Traceback (most recent call last):
[task 2022-08-10T22:54:22.522Z] 22:54:17     INFO -    File "/builds/task_166017153318107/workspace/build/tests/raptor/raptor/raptor.py", line 205, in <module>
[task 2022-08-10T22:54:22.522Z] 22:54:17     INFO -      main()
[task 2022-08-10T22:54:22.522Z] 22:54:17     INFO -    File "/builds/task_166017153318107/workspace/build/tests/raptor/raptor/raptor.py", line 151, in main
[task 2022-08-10T22:54:22.522Z] 22:54:17     INFO -      success = raptor.run_tests(raptor_test_list, raptor_test_names)
[task 2022-08-10T22:54:22.522Z] 22:54:17     INFO -    File "/builds/task_166017153318107/workspace/build/tests/raptor/raptor/browsertime/android.py", line 243, in run_tests
[task 2022-08-10T22:54:22.522Z] 22:54:17     INFO -      return super(BrowsertimeAndroid, self).run_tests(tests, test_names)
[task 2022-08-10T22:54:22.522Z] 22:54:17     INFO -    File "/builds/task_166017153318107/workspace/build/tests/raptor/raptor/perftest.py", line 465, in run_tests
[task 2022-08-10T22:54:22.522Z] 22:54:17     INFO -      self.run_test(test, timeout=int(test.get("page_timeout")))
[task 2022-08-10T22:54:22.522Z] 22:54:17     INFO -    File "/builds/task_166017153318107/workspace/build/tests/raptor/raptor/browsertime/base.py", line 707, in run_test
[task 2022-08-10T22:54:22.522Z] 22:54:17     INFO -      f"Browsertime process timed out after waiting {output_timeout} seconds "
[task 2022-08-10T22:54:22.522Z] 22:54:17     INFO -  Exception: Browsertime process timed out after waiting 120 seconds for output
[task 2022-08-10T22:54:22.522Z] 22:54:18    ERROR - Return code: 1
[task 2022-08-10T22:54:22.522Z] 22:54:18  WARNING - setting return code to 1
[task 2022-08-10T22:54:22.523Z] 22:54:18     INFO - Killing logcat pid 976.
[task 2022-08-10T22:54:22.523Z] 22:54:18     INFO - Copying Raptor results to upload dir:
[task 2022-08-10T22:54:22.523Z] 22:54:18     INFO - /builds/task_166017153318107/workspace/build/blobber_upload_dir/perfherder-data.json
[task 2022-08-10T22:54:22.523Z] 22:54:18     INFO - Copying raptor results from /builds/task_166017153318107/workspace/build/raptor.json to /builds/task_166017153318107/workspace/build/blobber_upload_dir/perfherder-data.json
[task 2022-08-10T22:54:22.523Z] 22:54:18 CRITICAL - Error copying results /builds/task_166017153318107/workspace/build/raptor.json to upload dir /builds/task_166017153318107/workspace/build/blobber_upload_dir/perfherder-data.json
[task 2022-08-10T22:54:22.523Z] 22:54:18     INFO - [Errno 2] No such file or directory: '/builds/task_166017153318107/workspace/build/raptor.json'
[task 2022-08-10T22:54:22.523Z] 22:54:18     INFO - Running post-action listener: _package_coverage_data
[task 2022-08-10T22:54:22.523Z] 22:54:18     INFO - Running post-action listener: _resource_record_post_action
[task 2022-08-10T22:54:22.523Z] 22:54:18     INFO - Running post-action listener: process_java_coverage_data
[task 2022-08-10T22:54:22.523Z] 22:54:18     INFO - Running post-action listener: stop_device
[task 2022-08-10T22:54:22.523Z] 22:54:20     INFO - Killing logcat pid 976.
See Also: → 1769482
Summary: Intermittent Android 11.0 AArch64 raptor-browsertime Critical: Browsertime process timed out after waiting 120 seconds for output → Intermittent Android AArch64 raptor-browsertime Critical: Browsertime process timed out after waiting 120 seconds for output

There have been 38 total failures in the last 7 days, recent failure log.
Affected platforms are:

  • android-hw-a51-11-0-aarch64-shippable-qr
  • android-hw-p2-8-0-android-aarch64-shippable-qr
  • linux1804-64-shippable-qr
  • macosx1015-64-shippable-qr
[task 2022-10-02T00:20:30.448Z] 00:18:23     INFO -  raptor-browsertime Info: Navigating to about:blank, count: 1
[task 2022-10-02T00:20:30.448Z] 00:18:23     INFO -  raptor-browsertime Info: Navigating to url about:blank iteration 4
[task 2022-10-02T00:20:30.448Z] 00:18:29     INFO -  raptor-browsertime Info: Navigating to primary url:https://web.de/magazine/politik/politologe-glaubt-grossen-koalition-herbst-knallen-33563566
[task 2022-10-02T00:20:30.448Z] 00:18:29     INFO -  raptor-browsertime Info: Cycle 1, waiting for 1000 ms
[task 2022-10-02T00:20:30.448Z] 00:18:30     INFO -  raptor-browsertime Info: Cycle 1, starting the measure
[task 2022-10-02T00:20:30.448Z] 00:18:30     INFO -  raptor-browsertime Info: Testing url https://web.de/magazine/politik/politologe-glaubt-grossen-koalition-herbst-knallen-33563566 iteration 4
[task 2022-10-02T00:20:30.448Z] 00:18:30     INFO -  raptor-browsertime Info: Sun, 02 Oct 2022 00:18:30 GMT adb:command Send '001Bhost:transport:HT7AR1A03631'
[task 2022-10-02T00:20:30.448Z] 00:18:30     INFO -  raptor-browsertime Info: Sun, 02 Oct 2022 00:18:30 GMT adb:command Send '003Dshell:screenrecord --bit-rate 8000000 /sdcard/browsertime.mp4'
[task 2022-10-02T00:20:40.870Z] 00:20:30 CRITICAL -  raptor-browsertime Critical: Browsertime process timed out after waiting 120 seconds for output
[task 2022-10-02T00:20:40.870Z] 00:20:30     INFO -  raptor-browsertime-android Info: removing reverse socket connections
[task 2022-10-02T00:20:40.870Z] 00:20:30     INFO -  adb command_output: adb -s HT7AR1A03631 wait-for-device reverse --remove-all, timeout: None, timedout: None, exitcode: 0, output:
[task 2022-10-02T00:20:40.870Z] 00:20:30     INFO -  adb shell_bool: adb -s HT7AR1A03631 wait-for-device shell test -d /sdcard/Android/data/org.mozilla.geckoview_example/files/test_root/org.mozilla.geckoview_example-geckodriver-profile/minidumps, timeout: None, timedout: None, exitcode: 0, output:
[task 2022-10-02T00:20:40.870Z] 00:20:30     INFO -  adb shell_output: adb -s HT7AR1A03631 wait-for-device shell su -c sync, timeout: None, timedout: None, exitcode: 0, output:
[task 2022-10-02T00:20:40.870Z] 00:20:30     INFO -  adb shell_bool: adb -s HT7AR1A03631 wait-for-device shell test -d /sdcard/Android/data/org.mozilla.geckoview_example/files/test_root/org.mozilla.geckoview_example-geckodriver-profile/minidumps, timeout: None, timedout: None, exitcode: 0, output:
[task 2022-10-02T00:20:40.870Z] 00:20:30     INFO -  adb chmod: path=/sdcard/Android/data/org.mozilla.geckoview_example/files/test_root/org.mozilla.geckoview_example-geckodriver-profile/minidumps, recursive=True, mask=777
[task 2022-10-02T00:20:40.870Z] 00:20:30     INFO -  adb Ignoring attempt to chmod external storage
[task 2022-10-02T00:20:40.870Z] 00:20:30     INFO -  adb command_output: adb -s HT7AR1A03631 wait-for-device pull /sdcard/Android/data/org.mozilla.geckoview_example/files/test_root/org.mozilla.geckoview_example-geckodriver-profile/minidumps /tmp/tmpviflemj0/minidumps, timeout: None, timedout: None, exitcode: 0, output: /sdcard/Android/data/org.mozilla.geckoview_example/files/test_root/org.mozilla.geckoview_example-geckodriver-profile/minidumps/: 0 files pulled, 0 skipped.
[task 2022-10-02T00:20:40.870Z] 00:20:30     INFO -  raptor-mitmproxy Info: Mitmproxy stop!!
Flags: needinfo?(dave.hunt)
Whiteboard: [stockwell needswork:owner]

It appears that there's a failure within browsertime, preventing stopVideo from being called. Perhaps we could run with additional debug on try to narrow down where the issue is. Bouncing needinfo to @sparky, who is much more familiar with this code.

Flags: needinfo?(dave.hunt) → needinfo?(gmierz2)

The logs seem to show that the test is working fine and that we're not actually timing out, it's just taking a lot of time to get the video sometimes.

Here we're waiting for 2 minutes:

[task 2022-10-06T12:14:40.349Z] 12:12:39     INFO -  raptor-browsertime Info: Thu, 06 Oct 2022 12:12:39 GMT adb:command Send '003Dshell:screenrecord --bit-rate 8000000 /sdcard/browsertime.mp4'
[task 2022-10-06T12:14:40.349Z] 12:14:39 CRITICAL -  raptor-browsertime Critical: Browsertime process timed out after waiting 120 seconds for output
[task 2022-10-06T12:14:40.349Z] 12:14:39     INFO -  raptor-browsertime-android Info: removing reverse socket connections
[task 2022-10-06T12:14:40.349Z] 12:14:39     INFO -  adb command_output: adb -s FA79D1A07331 wait-for-device reverse --remove-all, timeout: None, timedout: None, exitcode: 0, output:
[task 2022-10-06T12:14:40.349Z] 12:14:39     INFO -  adb shell_bool: adb -s FA79D1A07331 wait-for-device shell test -d /sdcard/Android/data/org.mozilla.geckoview_example/files/test_root/org.mozilla.geckoview_example-geckodriver-profile/minidumps, timeout: None, timedout: None, exitcode: 0, output:
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -  adb shell_output: adb -s FA79D1A07331 wait-for-device shell su -c sync, timeout: None, timedout: None, exitcode: 0, output:
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -  adb shell_bool: adb -s FA79D1A07331 wait-for-device shell test -d /sdcard/Android/data/org.mozilla.geckoview_example/files/test_root/org.mozilla.geckoview_example-geckodriver-profile/minidumps, timeout: None, timedout: None, exitcode: 0, output:
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -  adb chmod: path=/sdcard/Android/data/org.mozilla.geckoview_example/files/test_root/org.mozilla.geckoview_example-geckodriver-profile/minidumps, recursive=True, mask=777
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -  adb Ignoring attempt to chmod external storage
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -  adb command_output: adb -s FA79D1A07331 wait-for-device pull /sdcard/Android/data/org.mozilla.geckoview_example/files/test_root/org.mozilla.geckoview_example-geckodriver-profile/minidumps /tmp/tmp2mpjyq1h/minidumps, timeout: None, timedout: None, exitcode: 0, output: /sdcard/Android/data/org.mozilla.geckoview_example/files/test_root/org.mozilla.geckoview_example-geckodriver-profile/minidumps/: 0 files pulled, 0 skipped.
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -  raptor-mitmproxy Info: Mitmproxy stop!!
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -  raptor-mitmproxy Info: Stopping mitmproxy playback, killing process 1181
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -  raptor-mitmproxy Info: Successfully killed the mitmproxy playback process
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -  raptor-perftest Info: Removing temporary directory: /tmp/tmpf51vf9pv
[task 2022-10-06T12:14:50.017Z] 12:14:40    ERROR -  Traceback (most recent call last):
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -    File "/builds/task_166505624830935/workspace/build/tests/raptor/raptor/raptor.py", line 206, in <module>
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -      main()
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -    File "/builds/task_166505624830935/workspace/build/tests/raptor/raptor/raptor.py", line 152, in main
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -      success = raptor.run_tests(raptor_test_list, raptor_test_names)
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -    File "/builds/task_166505624830935/workspace/build/tests/raptor/raptor/browsertime/android.py", line 243, in run_tests
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -      return super(BrowsertimeAndroid, self).run_tests(tests, test_names)
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -    File "/builds/task_166505624830935/workspace/build/tests/raptor/raptor/perftest.py", line 467, in run_tests
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -      self.run_test(test, timeout=int(test.get("page_timeout")))
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -    File "/builds/task_166505624830935/workspace/build/tests/raptor/raptor/browsertime/base.py", line 791, in run_test
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -      raise Exception(
[task 2022-10-06T12:14:50.017Z] 12:14:40     INFO -  Exception: Browsertime process timed out after waiting 120 seconds for output

While in the logcat, we see things happening during those two minutes with a screen record success at the end:

10-06 12:14:40.121   882 12881 W GraphicBufferSource: releasing an unpopulated slot: 58
10-06 12:14:40.121   882 12881 W GraphicBufferSource: releasing an unpopulated slot: 59
10-06 12:14:40.121   882 12881 W GraphicBufferSource: releasing an unpopulated slot: 60
10-06 12:14:40.121   882 12881 W GraphicBufferSource: releasing an unpopulated slot: 61
10-06 12:14:40.121   882 12881 W GraphicBufferSource: releasing an unpopulated slot: 62
10-06 12:14:40.121   882 12881 W GraphicBufferSource: releasing an unpopulated slot: 63
10-06 12:14:40.121 30484 30484 D MPEG4Writer: Video track stopping. Stop source
10-06 12:14:40.121 30484 30484 D MPEG4Writer: Video track source stopping
10-06 12:14:40.121 30484 30484 D MPEG4Writer: Video track source stopped
10-06 12:14:40.121 30484 30496 I MPEG4Writer: Received total/0-length (23/0) buffers and encoded 23 frames. - Video
10-06 12:14:40.121 30484 30484 D MPEG4Writer: Video track stopped. Stop source
10-06 12:14:40.121 30484 30484 D MPEG4Writer: Stopping writer thread
10-06 12:14:40.122 30484 30495 D MPEG4Writer: 0 chunks are written in the last batch
10-06 12:14:40.122 30484 30484 D MPEG4Writer: Writer thread stopped
10-06 12:14:40.122 30484 30484 I MPEG4Writer: Ajust the moov start time from 2238749206 us -> 2238749206 us
10-06 12:14:40.123   882 12881 I OMX-VENC: Component Deinit
10-06 12:14:40.124   882 12881 I OMX-VENC: Video encode perflock released
10-06 12:14:40.125 30484 30484 D MPEG4Writer: Video track stopping. Stop source
10-06 12:14:40.132   604   604 D SurfaceFlinger: Backpressure trigger, skipping transaction & refresh!
10-06 12:14:40.142  7619  7619 D MediaScannerReceiver: action: android.intent.action.MEDIA_SCANNER_SCAN_FILE path: /storage/emulated/0/browsertime.mp4
10-06 12:14:40.150 30484 30484 D ScreenRecord: success
10-06 12:14:40.173   810   810 E android.hardware.power@1.1-service.wahoo: Governor not supported by powerHAL, skipping

:kshampur, could you see if increasing the output timeout helps with this issue on the web-de task?

Flags: needinfo?(gmierz2) → needinfo?(kshampur)

Update:

There have been 30 failures in the last 7 days:

  • 12 failures on Android 11.0 Samsung A51 Shippable AArch64 opt
  • 12 failures on Android 8.0 Pixel2 AArch64 Shippable WebRender opt
  • 1 failure on Linux 18.04 x64 WebRender Shippable opt
  • 5 failures on OS X 10.15 WebRender Shippable op

Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=392744725&repo=mozilla-central&lineNumber=2534

:kshampur, sounds good, I found this in the logcat logs:

10-08 18:04:55.343 31171 31196 I Gecko   : 1665252295343	Marionette	DEBUG	0 -> [0,68,"WebDriver:ExecuteScript",{"args":[5000],"script":"\nreturn (function(waitTime) {\n    try { \n            var end = window.performance.timing.loadEventEnd;\n            var start= ... mance.now() > end - start + waitTime);\n    } \n    catch(e) {\n        return true;\n    }\n})(arguments[arguments.length - 1]);\n"}]

It is constantly repeating too up until the end of the test. It suggests to me that the website might be having a hard time finishing loading. This is the code that is trying to run: https://github.com/sitespeedio/browsertime/blob/9004781fda367edcbd7285723949a02c23ec998f/lib/core/pageCompleteChecks/defaultPageCompleteCheck.js#L3-L12

quick update - not sure why this seems to only be affecting Btime web-de, but not Btime-nofis nor Btime-live-nofis in m-c.
Looking at the artifacts in the failed jobs, the mp4's and jpegs seem to be fine for the successful iterations leading up to the failure. (which likely seems to be that the webpage is just failing to load)

also maybe worth noting, that both the output_timeout and proc_timeout are reached, not just the output_timeout i.e. https://searchfox.org/mozilla-central/source/testing/raptor/raptor/browsertime/base.py#790,795

:sparky - Would it be worthwhile to re-record web-de? I see that has been a common approach in the past for similar issues (e.g. Bug 1769482, Bug 1642045)
If so, is it possible to re-record web-de on CI like we do with desktop perftest recording? (Didn't look like perftest-android-hw-p2-record-websites does what I thought it would do)

Flags: needinfo?(kshampur) → needinfo?(gmierz2)

:kshampur, sounds good regarding a re-recording. I think we're able to IIRC, and the task you mentioned should be able to record it.

Flags: needinfo?(gmierz2)
Whiteboard: [stockwell disable-recommended] → [stockwell needswork:owner]
Whiteboard: [stockwell disable-recommended] → [stockwell needswork:owner]

There have been 39 total failures in the last 7 days, recent failure log.
Affected platforms are:

  • android-hw-a51-11-0-aarch64-shippable-qr
  • android-hw-p2-8-0-android-aarch64-shippable-qr
  • linux1804-64-shippable-qr

Ah so this is where the segmentation failures ended up :)

These failures have the same crash in the logs as what we had in: https://bugzilla.mozilla.org/show_bug.cgi?id=1777710
The failure signature is not a red-herring now, it's legitimate in that something is not responsive.

12-02 19:00:25.708 26285  2507 D GeckoThread: State changed to EXITING
12-02 19:00:25.713 26285  2507 I Gecko   : 1670007625713	Marionette	DEBUG	0 <- [1,109,null,{"cause":"shutdown","forced":false,"in_app":true}]
12-02 19:00:25.725 26285  2507 D GeckoViewNavigation: onDisable
12-02 19:00:25.726  2520  2568 D GeckoViewSelectionActionDelegate[C]: handleEvent: visibilitychange
12-02 19:00:25.727 26285  2507 D GeckoViewProcessHangMonitor: onDisable
12-02 19:00:25.727 26285  2507 D GeckoViewModule: unregisterListener
12-02 19:00:25.728 26285  2507 D GeckoViewProcessHangMonitor: onDestroy
12-02 19:00:25.729  2520  2568 D GeckoViewClipboardPermissionChild[C]: handleEvent: deactivate
12-02 19:00:25.729 26285  2507 D GeckoViewProgress: onDisable
12-02 19:00:25.730 26285  2507 D GeckoViewModule: unregisterListener
12-02 19:00:25.732 26285  2507 D GeckoViewSelectionAction: onDisable
12-02 19:00:25.732 26285  2507 D GeckoViewModule: unregisterListener
12-02 19:00:25.733 26285  2507 D GeckoViewModule: unregisterListener
12-02 19:00:25.750  2520  2568 D GeckoViewSelectionActionDelegate[C]: handleEvent: visibilitychange
12-02 19:00:25.751  2520  2568 D GeckoViewClipboardPermissionChild[C]: handleEvent: pagehide
12-02 19:00:25.752  2520  2568 D GeckoViewAutoFill[C]: handleEvent: pagehide
12-02 19:00:25.761  2520  2568 D GeckoViewSelectionActionDelegate[C]: handleEvent: visibilitychange
12-02 19:00:25.763  2520  2568 D GeckoViewClipboardPermissionChild[C]: handleEvent: pagehide
12-02 19:00:25.765  2520  2568 D GeckoViewAutoFill[C]: handleEvent: pagehide
12-02 19:00:25.786 26285  2507 W google-breakpad: ExceptionHandler::GenerateDump cloned child 
12-02 19:00:25.786 26285  2507 W google-breakpad: 2858
12-02 19:00:25.786 26285  2507 W google-breakpad: 
12-02 19:00:25.786 26285  2507 W google-breakpad: ExceptionHandler::SendContinueSignalToChild sent continue signal to child
12-02 19:00:25.787  2858  2507 W google-breakpad: ExceptionHandler::WaitForContinueSignal waiting for continue signal...
12-02 19:00:26.307  4310  4310 I io_stats: !@   8,0 r 315011 5386172 w 19382 595772 d 11780 250176 f 10796 13519 iot 66128 61128 th 102400 0 0 pt 0 inp 0 0 926.165
12-02 19:00:26.376  4934  5445 D BatteryService: Sending ACTION_BATTERY_CHANGED. scale:100, info:{.chargerAcOnline = false, .chargerUsbOnline = true, .chargerWirelessOnline = false, .maxChargingCurrent = 0, .maxChargingVoltage = 0, .batteryStatus = FULL, .batteryHealth = GOOD, .batteryPresent = true, .batteryLevel = 100, .batteryVoltage = 4325, .batteryTemperature = 295, .batteryCurrent = 45, .batteryCycleCount = 0, .batteryFullCharge = 4000000, .batteryChargeCounter = 3801000, .batteryTechnology = Li-ion}
12-02 19:00:26.377  4934  5445 D BatteryService: online:4, current avg:57, charge type:0, POGO powered:false, power sharing:false, high voltage charger:false, charger_type:0, capacity:280000, misc_event:0x10000, current_event:0x0, current_now:45
12-02 19:00:26.380  4934  4934 I MotionRecognitionService: Plugged
12-02 19:00:26.380  4934  4934 D MotionRecognitionService:   mCableConnection= 1
12-02 19:00:26.380  4934  4934 D MotionRecognitionService: set cable connected : mGripEnabled = false
12-02 19:00:26.381  4934  4934 E PocketModeEvent: PocketModeSetting is off 
12-02 19:00:26.381  4934  4934 D PhoneWindowManagerExt: ACTION_BATTERY_CHANGED - Level=100, status=5
12-02 19:00:26.382  4934  4934 D SemWifiApBroadcastReceiver: Received : android.intent.action.BATTERY_CHANGED
12-02 19:00:26.385  5255  5255 D SemBigDataInfoController: Intent : android.intent.action.BATTERY_CHANGED
12-02 19:00:26.386  5277  5277 D KeyguardUpdateMonitor: received broadcast android.intent.action.BATTERY_CHANGED
12-02 19:00:26.387  5277  5277 D KeyguardUpdateMonitor: handleBatteryUpdate
12-02 19:00:26.390  5255  5255 D SemBigDataInfoController: Intent : android.intent.action.BATTERY_CHANGED
12-02 19:00:26.390  5277  5277 I PowerUI : BATTERY_HEALTH_CHECK extraHealth=2 miscEvent=65536
12-02 19:00:26.390  5277  5277 D PowerUI : priorPlugType = 2 mPlugType = 2 priorBatteryStatus = 5 mBatteryStatus = 5
12-02 19:00:26.390  5277  5277 D PowerUI : Battery swelling mode - priorBatterySwellingMode = 0 mBatterySwellingMode = 0 mBatteryStatus = 5
12-02 19:00:26.390  5277  5277 I PowerUI : misc_event = 65536
12-02 19:00:26.411  5277  2840 E IndicatorGardenAlgorithmBasicCutout: NOT MATCH !!!! resourceHeight:87, cutoutHeight:88 ([IndicatorGardenInputProperty]  Rotation(0-0,90-1,180-2,270-3)0, Density:2.625, ScreenWidthSize:1080, CoverSidePadding:0, mIndicatorGardenCenterOffset:11, mCameraCutoutCropSize:0, mGardenPaddingStart:21, mIndicatorCornerPadding:72, mCameraSidePadding:36, mCameraTopMargin:33, DpCutout:DisplayCutout{insets=Rect(0, 88 - 0, 0) waterfall=Insets{left=0, top=0, right=0, bottom=0} boundingRect={Bounds=[Rect(0, 0 - 0, 0), Rect(512, 0 - 568, 88), Rect(0, 0 - 0, 0), Rect(0, 0 - 0, 0)]}}) 
12-02 19:00:26.411  5277  2840 E IndicatorGardenAlgorithmBasicCutout: NOT MATCH !!!! resourceHeight:87, cutoutHeight:88 ([IndicatorGardenInputProperty]  Rotation(0-0,90-1,180-2,270-3)0, Density:2.625, ScreenWidthSize:1080, CoverSidePadding:0, mIndicatorGardenCenterOffset:11, mCameraCutoutCropSize:0, mGardenPaddingStart:21, mIndicatorCornerPadding:72, mCameraSidePadding:36, mCameraTopMargin:33, DpCutout:DisplayCutout{insets=Rect(0, 88 - 0, 0) waterfall=Insets{left=0, top=0, right=0, bottom=0} boundingRect={Bounds=[Rect(0, 0 - 0, 0), Rect(512, 0 - 568, 88), Rect(0, 0 - 0, 0), Rect(0, 0 - 0, 0)]}}) 
12-02 19:00:26.718  4461  4461 E audit   : type=1701 audit(1670007626.715:1757): auid=4294967295 uid=10266 gid=10266 ses=4294967295 subj=u:r:untrusted_app:s0:c10,c257,c512,c768 pid=26285 comm="Gecko" exe="/system/bin/app_process64" sig=11 res=1
12-02 19:00:26.802  2520  2647 I Gecko   : Exiting due to channel error.
12-02 19:00:26.802  4934  5093 W InputDispatcher: channel 'ad89fe8 org.mozilla.geckoview_example/org.mozilla.geckoview_example.GeckoViewActivity (server)' ~ Consumer closed input channel or an error occurred.  events=0x9, fd=656
12-02 19:00:26.803  4934  5093 E InputDispatcher: channel 'ad89fe8 org.mozilla.geckoview_example/org.mozilla.geckoview_example.GeckoViewActivity (server)' ~ Channel is unrecoverably broken and will be disposed!
12-02 19:00:26.810  4934  4947 D InputTransport: Input channel destroyed: 'ClientS', fd=408
12-02 19:00:26.810  4934  7356 I ActivityManager: Process org.mozilla.geckoview_example (pid 26285) has died: fg  TOP (252,1045)
12-02 19:00:26.810  4495  4495 I Zygote  : Process 26285 exited due to signal 11 (Segmentation fault)

(:cpeterson, I've cc'ed you so you know where these failures are now. It's the same as what we were seeing in bug 1777710 but now the failure signature is more accurate).

The instagram test is a perma-failure on GVE, and Fenix. It looks like there are other permas on fenix too: https://treeherder.mozilla.org/jobs?repo=fenix&selectedTaskRun=Pya5qe8lQpS-wNomPahH0w.0&revision=fb5e474273ff7e66196df494fad228e6c70d580f

The instagram test is also permafailing on central but it's not the same issue as you mentioned in the comments above(at least I don't think so). Now it fails where it was previously green: backfill range.
Failure log.
Hi Greg! Can you please take a look at this?
Thank you!

[task 2022-12-09T23:45:56.380Z] 23:41:54     INFO -  raptor-browsertime Info: Cycle 1, starting the measure
[task 2022-12-09T23:45:56.380Z] 23:41:54     INFO -  raptor-browsertime Info: Testing url https://www.instagram.com iteration 1
[task 2022-12-09T23:45:56.380Z] 23:41:54     INFO -  raptor-browsertime Info: Fri, 09 Dec 2022 23:41:54 GMT adb:command Send '001Ahost:transport:R58R36P8E0D'
[task 2022-12-09T23:45:56.380Z] 23:41:54     INFO -  raptor-browsertime Info: Fri, 09 Dec 2022 23:41:54 GMT adb:command Send '003Dshell:screenrecord --bit-rate 8000000 /sdcard/browsertime.mp4'
[task 2022-12-09T23:45:56.380Z] 23:43:00     INFO -  raptor-browsertime Info: https://www.instagram.com has been tested before within the same run, it will get an extra query parameter named browsertime_run. Make sure to use alias to keep track of the URLs
[task 2022-12-09T23:45:56.380Z] 23:43:00     INFO -  raptor-browsertime Info: Fri, 09 Dec 2022 23:43:00 GMT adb:command Send '001Ahost:transport:R58R36P8E0D'
[task 2022-12-09T23:45:56.380Z] 23:43:00     INFO -  raptor-browsertime Info: Fri, 09 Dec 2022 23:43:00 GMT adb:command Send '0088shell:command -v pkill >/dev/null && pkill -l SIGINT screenrecord || kill -2 $(ps screenrecord | grep -Eo [0-9]+ | grep -m 1 -Eo [0-9]+)'
[task 2022-12-09T23:45:56.380Z] 23:43:20     INFO -  raptor-browsertime Info: Fri, 09 Dec 2022 23:43:20 GMT adb:command Send '001Ahost:transport:R58R36P8E0D'
[task 2022-12-09T23:45:56.380Z] 23:43:20     INFO -  raptor-browsertime Info: Fri, 09 Dec 2022 23:43:20 GMT adb:command Send '0005sync:'
[task 2022-12-09T23:45:56.380Z] 23:43:20     INFO -  raptor-browsertime Info: Fri, 09 Dec 2022 23:43:20 GMT adb:sync RECV /sdcard/browsertime.mp4
[task 2022-12-09T23:45:56.380Z] 23:43:20     INFO -  raptor-browsertime Info: Fri, 09 Dec 2022 23:43:20 GMT adb:command Send '001Ahost:transport:R58R36P8E0D'
[task 2022-12-09T23:45:56.380Z] 23:43:20     INFO -  raptor-browsertime Info: Fri, 09 Dec 2022 23:43:20 GMT adb:command Send '002Dshell:rm -- "/sdcard//sdcard/browsertime.mp4"'
[task 2022-12-09T23:45:56.380Z] 23:43:20     INFO -  raptor-browsertime Info: CPU / Power usage: 11218
[task 2022-12-09T23:45:56.380Z] 23:43:21     INFO -  raptor-browsertime Info: Browsertime pageload ended.
[task 2022-12-09T23:45:56.380Z] 23:43:24     INFO -  raptor-browsertime Info: Fri, 09 Dec 2022 23:43:24 GMT adb:command Send '001Ahost:transport:R58R36P8E0D'
[task 2022-12-09T23:45:56.380Z] 23:43:24     INFO -  raptor-browsertime Info: Fri, 09 Dec 2022 23:43:24 GMT adb:command Send '0040shell:dumpsys battery | grep temperature | grep -Eo '[0-9]{1,3}''
[task 2022-12-09T23:45:56.381Z] 23:43:24     INFO -  raptor-browsertime Info: Use the visual metrics portable script
[task 2022-12-09T23:45:56.381Z] 23:43:24     INFO -  raptor-browsertime Info: Get visual metrics from the video
[task 2022-12-09T23:45:56.381Z] 23:43:47     INFO -  raptor-browsertime Info: Converting video to 60 fps
[task 2022-12-09T23:45:56.381Z] 23:45:47 CRITICAL -  raptor-browsertime Critical: Browsertime process timed out after waiting 120 seconds for output
[task 2022-12-09T23:45:56.381Z] 23:45:47     INFO -  raptor-browsertime-android Info: removing reverse socket connections
[task 2022-12-09T23:45:56.381Z] 23:45:47     INFO -  adb command_output: adb -s R58R36P8E0D wait-for-device reverse --remove-all, timeout: None, timedout: None, exitcode: 0, output:
[task 2022-12-09T23:45:56.381Z] 23:45:47     INFO -  adb shell_bool: adb -s R58R36P8E0D wait-for-device shell su -c 'test -d /sdcard/Android/data/org.mozilla.geckoview_example/files/test_root/org.mozilla.geckoview_example-geckodriver-profile/minidumps', timeout: None, timedout: None, exitcode: 1, output:
[task 2022-12-09T23:45:56.381Z] 23:45:47     INFO -  raptor-mitmproxy Info: Mitmproxy stop!!
[task 2022-12-09T23:45:56.381Z] 23:45:47     INFO -  raptor-mitmproxy Info: Stopping mitmproxy playback, killing process 1159
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO -  raptor-mitmproxy Info: Successfully killed the mitmproxy playback process
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO -  raptor-perftest Info: Removing temporary directory: /tmp/tmp6n_0bm7w
[task 2022-12-09T23:45:56.381Z] 23:45:48    ERROR -  Traceback (most recent call last):
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO -    File "/builds/task_167062888466542/workspace/build/tests/raptor/raptor/raptor.py", line 206, in <module>
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO -      main()
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO -    File "/builds/task_167062888466542/workspace/build/tests/raptor/raptor/raptor.py", line 152, in main
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO -      success = raptor.run_tests(raptor_test_list, raptor_test_names)
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO -    File "/builds/task_167062888466542/workspace/build/tests/raptor/raptor/browsertime/android.py", line 253, in run_tests
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO -      return super(BrowsertimeAndroid, self).run_tests(tests, test_names)
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO -    File "/builds/task_167062888466542/workspace/build/tests/raptor/raptor/perftest.py", line 471, in run_tests
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO -      self.run_test(test, timeout=int(test.get("page_timeout")))
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO -    File "/builds/task_167062888466542/workspace/build/tests/raptor/raptor/browsertime/base.py", line 792, in run_test
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO -      raise Exception(
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO -  Exception: Browsertime process timed out after waiting 120 seconds for output
[task 2022-12-09T23:45:56.381Z] 23:45:48    ERROR - Return code: 1
[task 2022-12-09T23:45:56.381Z] 23:45:48  WARNING - setting return code to 1
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO - Killing logcat pid 667.
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO - Copying Raptor results to upload dir:
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO - /builds/task_167062888466542/workspace/build/blobber_upload_dir/perfherder-data.json
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO - Copying raptor results from /builds/task_167062888466542/workspace/build/raptor.json to /builds/task_167062888466542/workspace/build/blobber_upload_dir/perfherder-data.json
[task 2022-12-09T23:45:56.381Z] 23:45:48 CRITICAL - Error copying results /builds/task_167062888466542/workspace/build/raptor.json to upload dir /builds/task_167062888466542/workspace/build/blobber_upload_dir/perfherder-data.json
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO - [Errno 2] No such file or directory: '/builds/task_167062888466542/workspace/build/raptor.json'
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO - Running post-action listener: _package_coverage_data
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO - Running post-action listener: _resource_record_post_action
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO - Running post-action listener: process_java_coverage_data
[task 2022-12-09T23:45:56.381Z] 23:45:48     INFO - Running post-action listener: stop_device
[task 2022-12-09T23:45:56.381Z] 23:45:50     INFO - Killing logcat pid 667.
[task 2022-12-09T23:45:56.381Z] 23:45:50     INFO - [mozharness: 2022-12-09 23:45:50.603499Z] Finished run-tests step (success)
Flags: needinfo?(gmierz2)

I'm not familiar with these tests, so this could be a red herring, but I notice the screen captures for the instagram failures all show "Update your email address" throughout the test.

Thanks for taking a look :gbrown, I'm working on getting a new recording setup for the instagram one. I have to build the recording locally though because our CI recording task is having some issues: https://treeherder.mozilla.org/jobs?repo=try&revision=33d9dc962ec9827b02489686566c5c68a4e6f883

Summary: Intermittent Android AArch64 raptor-browsertime Critical: Browsertime process timed out after waiting 120 seconds for output → Perma [Tier 2] Android AArch64 raptor-browsertime Critical: Browsertime process timed out after waiting 120 seconds for output
Assignee: nobody → gmierz2
Status: NEW → ASSIGNED

This patch adds the ability to run manual logins for our websites since it can be simpler, and quicker in some cases. At the same time, a bug with the options handling is fixed.

Depends on D164589

Pushed by gmierz2@outlook.com:
https://hg.mozilla.org/integration/autoland/rev/a565a5af8bb8
Re-record instagram test. r=perftest-reviewers,afinder

With the patch that was just pushed, the instagram tests will be fixed. There's still one more patch to finish up here so we can do this more quickly next time.

Flags: needinfo?(gmierz2)
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 110 Branch

(In reply to Pulsebot from comment #37)

Pushed by gmierz2@outlook.com:
https://hg.mozilla.org/integration/autoland/rev/a565a5af8bb8
Re-record instagram test. r=perftest-reviewers,afinder

== Change summary for alert #36472 (as of Sat, 17 Dec 2022 17:25:26 GMT) ==

Regressions:

Ratio Test Platform Options Absolute values (old vs new) Performance Profiles
824% instagram LastVisualChange android-hw-a51-11-0-aarch64-shippable-qr warm webrender 685.17 -> 6,330.75 Before/After
544% instagram SpeedIndex android-hw-a51-11-0-aarch64-shippable-qr warm webrender 264.83 -> 1,706.08 Before/After
407% instagram loadtime android-hw-a51-11-0-aarch64-shippable-qr warm webrender 213.75 -> 1,083.12 Before/After
405% instagram LastVisualChange android-hw-a51-11-0-aarch64-shippable-qr cold webrender 1,119.42 -> 5,655.25 Before/After
371% instagram SpeedIndex android-hw-a51-11-0-aarch64-shippable-qr cold webrender 638.58 -> 3,007.25 Before/After
358% instagram ContentfulSpeedIndex android-hw-a51-11-0-aarch64-shippable-qr warm webrender 297.42 -> 1,360.75 Before/After
305% instagram ContentfulSpeedIndex android-hw-a51-11-0-aarch64-shippable-qr cold webrender 664.33 -> 2,687.50 Before/After
289% instagram PerceptualSpeedIndex android-hw-a51-11-0-aarch64-shippable-qr cold webrender 626.58 -> 2,439.75 Before/After
132% instagram PerceptualSpeedIndex android-hw-a51-11-0-aarch64-shippable-qr warm webrender 273.50 -> 633.17 Before/After
48% instagram loadtime android-hw-a51-11-0-aarch64-shippable-qr cold webrender 1,024.88 -> 1,517.96 Before/After

Improvements:

Ratio Test Platform Options Absolute values (old vs new) Performance Profiles
20% instagram FirstVisualChange android-hw-a51-11-0-aarch64-shippable-qr cold webrender 577.92 -> 461.17

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=36472

Flags: needinfo?(gmierz2)

Metric changes are expected because the test was not working before this re-recording.

Flags: needinfo?(gmierz2)
Pushed by gmierz2@outlook.com:
https://hg.mozilla.org/integration/autoland/rev/dcc758b9bb2d
Add a feature for manual logins for recording sites. r=perftest-reviewers,AlexandruIonescu
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: