Closed Bug 1811992 Opened 1 year ago Closed 1 year ago

High frequency raptor-browsertime Critical: Browsertime process timed out after 3015 seconds

Categories

(Testing :: Performance, defect, P5)

defect

Tracking

(firefox111 fixed)

RESOLVED FIXED
111 Branch
Tracking Status
firefox111 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: afinder)

References

Details

(Keywords: intermittent-failure, Whiteboard: [retriggered][stockwell disable-recommended])

Attachments

(2 files)

Filed by: imoraru [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=403318025&repo=mozilla-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/NLG5hjBMSwimE6NnwMtzyA/runs/0/artifacts/public/logs/live_backing.log


[task 2023-01-24T03:27:30.017Z] 03:26:31     INFO -  raptor-browsertime Info: Cycle 1, waiting for 1000 ms
[task 2023-01-24T03:27:30.017Z] 03:26:32     INFO -  raptor-browsertime Info: Cycle 1, starting the measure
[task 2023-01-24T03:27:30.017Z] 03:26:32     INFO -  raptor-browsertime Info: Testing url https://m.imdb.com/ iteration 15
[task 2023-01-24T03:27:30.017Z] 03:26:32     INFO -  raptor-browsertime Info: Tue, 24 Jan 2023 03:26:32 GMT adb:command Send '001Ahost:transport:RZ8R31DFH4F'
[task 2023-01-24T03:27:30.017Z] 03:26:32     INFO -  raptor-browsertime Info: Tue, 24 Jan 2023 03:26:32 GMT adb:command Send '003Dshell:screenrecord --bit-rate 8000000 /sdcard/browsertime.mp4'
[task 2023-01-24T03:27:30.017Z] 03:26:43     INFO -  raptor-browsertime Info: https://m.imdb.com/ has been tested before within the same run, it will get an extra query parameter named browsertime_run. Make sure to use alias to keep track of the URLs
[task 2023-01-24T03:27:30.017Z] 03:26:43     INFO -  raptor-browsertime Info: Tue, 24 Jan 2023 03:26:43 GMT adb:command Send '001Ahost:transport:RZ8R31DFH4F'
[task 2023-01-24T03:27:30.017Z] 03:26:43     INFO -  raptor-browsertime Info: Tue, 24 Jan 2023 03:26:43 GMT adb:command Send '0088shell:command -v pkill >/dev/null && pkill -l SIGINT screenrecord || kill -2 $(ps screenrecord | grep -Eo [0-9]+ | grep -m 1 -Eo [0-9]+)'
[task 2023-01-24T03:27:30.017Z] 03:27:03     INFO -  raptor-browsertime Info: Tue, 24 Jan 2023 03:27:03 GMT adb:command Send '001Ahost:transport:RZ8R31DFH4F'
[task 2023-01-24T03:27:30.017Z] 03:27:03     INFO -  raptor-browsertime Info: Tue, 24 Jan 2023 03:27:03 GMT adb:command Send '0005sync:'
[task 2023-01-24T03:27:30.017Z] 03:27:03     INFO -  raptor-browsertime Info: Tue, 24 Jan 2023 03:27:03 GMT adb:sync RECV /sdcard/browsertime.mp4
[task 2023-01-24T03:27:30.017Z] 03:27:03     INFO -  raptor-browsertime Info: Tue, 24 Jan 2023 03:27:03 GMT adb:command Send '001Ahost:transport:RZ8R31DFH4F'
[task 2023-01-24T03:27:30.017Z] 03:27:03     INFO -  raptor-browsertime Info: Tue, 24 Jan 2023 03:27:03 GMT adb:command Send '002Dshell:rm -- "/sdcard//sdcard/browsertime.mp4"'
[task 2023-01-24T03:27:30.017Z] 03:27:03     INFO -  raptor-browsertime Info: CPU / Power usage: 69544
[task 2023-01-24T03:27:30.017Z] 03:27:04     INFO -  raptor-browsertime Info: Browsertime pageload ended.
[task 2023-01-24T03:27:30.017Z] 03:27:07     INFO -  raptor-browsertime Info: Tue, 24 Jan 2023 03:27:07 GMT adb:command Send '001Ahost:transport:RZ8R31DFH4F'
[task 2023-01-24T03:27:30.017Z] 03:27:07     INFO -  raptor-browsertime Info: Tue, 24 Jan 2023 03:27:07 GMT adb:command Send '0040shell:dumpsys battery | grep temperature | grep -Eo '[0-9]{1,3}''
[task 2023-01-24T03:27:30.017Z] 03:27:07     INFO -  raptor-browsertime Info: Use the visual metrics portable script
[task 2023-01-24T03:27:30.017Z] 03:27:07     INFO -  raptor-browsertime Info: Get visual metrics from the video
[task 2023-01-24T03:27:30.017Z] 03:27:23     INFO -  raptor-browsertime Info: Converting video to 60 fps
[task 2023-01-24T03:27:30.017Z] 03:27:26 CRITICAL -  raptor-browsertime Critical: Browsertime process timed out after 3015 seconds
[task 2023-01-24T03:27:30.017Z] 03:27:26     INFO -  raptor-browsertime-android Info: removing reverse socket connections
[task 2023-01-24T03:27:30.017Z] 03:27:26     INFO -  adb command_output: adb -s RZ8R31DFH4F wait-for-device reverse --remove-all, timeout: None, timedout: None, exitcode: 0, output:
[task 2023-01-24T03:27:30.017Z] 03:27:26     INFO -  adb shell_bool: adb -s RZ8R31DFH4F wait-for-device shell su -c 'test -d /sdcard/Android/data/org.mozilla.geckoview_example/files/test_root/org.mozilla.geckoview_example-geckodriver-profile/minidumps', timeout: None, timedout: None, exitcode: 1, output:
[task 2023-01-24T03:27:30.017Z] 03:27:26     INFO -  raptor-mitmproxy Info: Mitmproxy stop!!
[task 2023-01-24T03:27:30.017Z] 03:27:26     INFO -  raptor-mitmproxy Info: Stopping mitmproxy playback, killing process 1152
[task 2023-01-24T03:27:30.017Z] 03:27:27    ERROR -  raptor-mitmproxy Error: Mitmproxy exited with error code -9
[task 2023-01-24T03:27:30.017Z] 03:27:27     INFO -  raptor-perftest Info: Removing temporary directory: /tmp/tmpdq9xoet4
[task 2023-01-24T03:27:30.017Z] 03:27:29    ERROR -  Traceback (most recent call last):
[task 2023-01-24T03:27:30.017Z] 03:27:29     INFO -    File "/builds/task_167452758585889/workspace/build/tests/raptor/raptor/raptor.py", line 204, in <module>
[task 2023-01-24T03:27:30.017Z] 03:27:29     INFO -      main()
[task 2023-01-24T03:27:30.017Z] 03:27:29     INFO -    File "/builds/task_167452758585889/workspace/build/tests/raptor/raptor/raptor.py", line 150, in main
[task 2023-01-24T03:27:30.017Z] 03:27:29     INFO -      success = raptor.run_tests(raptor_test_list, raptor_test_names)
[task 2023-01-24T03:27:30.017Z] 03:27:29     INFO -    File "/builds/task_167452758585889/workspace/build/tests/raptor/raptor/browsertime/android.py", line 251, in run_tests
[task 2023-01-24T03:27:30.017Z] 03:27:29     INFO -      return super(BrowsertimeAndroid, self).run_tests(tests, test_names)
[task 2023-01-24T03:27:30.017Z] 03:27:29     INFO -    File "/builds/task_167452758585889/workspace/build/tests/raptor/raptor/perftest.py", line 469, in run_tests
[task 2023-01-24T03:27:30.017Z] 03:27:29     INFO -      self.run_test(test, timeout=int(test.get("page_timeout")))
[task 2023-01-24T03:27:30.017Z] 03:27:29     INFO -    File "/builds/task_167452758585889/workspace/build/tests/raptor/raptor/browsertime/base.py", line 806, in run_test
[task 2023-01-24T03:27:30.017Z] 03:27:29     INFO -      raise Exception(
[task 2023-01-24T03:27:30.017Z] 03:27:29     INFO -  Exception: Browsertime process timed out after 3015 seconds
[task 2023-01-24T03:27:38.487Z] 03:27:29    ERROR - Return code: 1
[task 2023-01-24T03:27:38.487Z] 03:27:29  WARNING - setting return code to 1
[task 2023-01-24T03:27:38.487Z] 03:27:29     INFO - Killing logcat pid 659.
[task 2023-01-24T03:27:38.487Z] 03:27:29     INFO - Copying Raptor results to upload dir:
[task 2023-01-24T03:27:38.487Z] 03:27:29     INFO - /builds/task_167452758585889/workspace/build/blobber_upload_dir/perfherder-data.json
[task 2023-01-24T03:27:38.487Z] 03:27:29     INFO - Copying raptor results from /builds/task_167452758585889/workspace/build/raptor.json to /builds/task_167452758585889/workspace/build/blobber_upload_dir/perfherder-data.json
[task 2023-01-24T03:27:38.487Z] 03:27:29 CRITICAL - Error copying results /builds/task_167452758585889/workspace/build/raptor.json to upload dir /builds/task_167452758585889/workspace/build/blobber_upload_dir/perfherder-data.json
[task 2023-01-24T03:27:38.487Z] 03:27:29     INFO - [Errno 2] No such file or directory: '/builds/task_167452758585889/workspace/build/raptor.json'
[task 2023-01-24T03:27:38.487Z] 03:27:29     INFO - Running post-action listener: _package_coverage_data

Tier 1 failure here

Summary: Intermittent [tier 2] raptor-browsertime Critical: Browsertime process timed out after 3015 seconds → Intermittent raptor-browsertime Critical: Browsertime process timed out after 3015 seconds

It is getting frequent, with a failure rate of 4 out 7 on retriggers, please see range - and it currently holds 102 failures within the last 7 days.

Alex, can you help us assign this to someone so they can look into it?
Thank you.

Flags: needinfo?(afinder)
Whiteboard: [retriggered][stockwell needswork:owner]
Summary: Intermittent raptor-browsertime Critical: Browsertime process timed out after 3015 seconds → High frequency raptor-browsertime Critical: Browsertime process timed out after 3015 seconds
Assignee: nobody → afinder
Flags: needinfo?(afinder)

Alex, any updates on this frequent fail?

Flags: needinfo?(afinder)

Submitted a patch with changes suggested by sparky. Try push is currently in progress.

Flags: needinfo?(afinder)
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell needswork:owner]
Pushed by gmierz2@outlook.com:
https://hg.mozilla.org/integration/autoland/rev/0dbfd261facd
Disable 60fps video conversion to avoid timeout r=perftest-reviewers,sparky
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered]
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 111 Branch

(In reply to Pulsebot from comment #15)

Pushed by gmierz2@outlook.com:
https://hg.mozilla.org/integration/autoland/rev/0dbfd261facd
Disable 60fps video conversion to avoid timeout r=perftest-reviewers,sparky
== Change summary for alert #37009 (as of Wed, 08 Feb 2023 20:56:52 GMT) ==

Regressions:

Ratio Test Platform Options Absolute values (old vs new) Performance Profiles
217% reddit-billgates-post-2.hot LastVisualChange windows10-64-shippable-qr cold fission webrender 525.38 -> 1,667.96
172% reddit-billgates-post-1.posts LastVisualChange windows10-64-shippable-qr cold fission webrender 558.73 -> 1,519.25
113% reddit-billgates-post-1.billg LastVisualChange windows10-64-shippable-qr cold fission webrender 3,208.69 -> 6,836.00
112% reddit-billgates-post-2.billg LastVisualChange windows10-64-shippable-qr cold fission webrender 3,231.85 -> 6,852.58
87% reddit-billgates-ama.members LastVisualChange windows10-64-shippable-qr cold fission webrender 967.81 -> 1,805.46
12% reddit-billgates-post-1.posts SpeedIndex windows10-64-shippable-qr cold fission webrender 74.25 -> 83.50
8% reddit-billgates-ama.billg-ama SpeedIndex windows10-64-shippable-qr cold fission webrender 896.12 -> 965.12
8% reddit-billgates-post-1.billg SpeedIndex windows10-64-shippable-qr cold fission webrender 483.71 -> 520.92
5% youtube ContentfulSpeedIndex windows10-64-shippable-qr cold fission webrender 1,165.92 -> 1,219.75 Before/After
4% youtube ContentfulSpeedIndex windows10-64-shippable-qr fission warm webrender 1,075.77 -> 1,121.50 Before/After
... ... ... ... ... ...
4% bing-search SpeedIndex windows10-64-shippable-qr bytecode-cached fission warm webrender 141.08 -> 146.58 Before/After
3% bing-search SpeedIndex windows10-64-shippable-qr fission warm webrender 140.46 -> 145.33 Before/After
3% imgur loadtime windows10-64-shippable-qr fission warm webrender 546.65 -> 563.50 Before/After
3% imdb SpeedIndex windows10-64-shippable-qr fission warm webrender 307.23 -> 315.42 Before/After
3% imgur fcp windows10-64-shippable-qr fission warm webrender 479.31 -> 491.58 Before/After

Improvements:

Ratio Test Platform Options Absolute values (old vs new) Performance Profiles
52% cnn-nav.world FirstVisualChange windows10-64-shippable-qr cold fission webrender 371.62 -> 177.96
41% reddit-billgates-post-1.posts ContentfulSpeedIndex windows10-64-shippable-qr cold fission webrender 248.96 -> 146.88
32% reddit-billgates-post-2.hot FirstVisualChange windows10-64-shippable-qr cold fission webrender 97.46 -> 66.42
31% reddit-billgates-post-2.hot PerceptualSpeedIndex windows10-64-shippable-qr cold fission webrender 235.38 -> 161.83
24% reddit-billgates-post-2.top ContentfulSpeedIndex windows10-64-shippable-qr cold fission webrender 154.71 -> 118.29
... ... ... ... ... ...
2% twitter SpeedIndex windows10-64-shippable-qr fission warm webrender 767.29 -> 751.92 Before/After

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=37009

Attached video 6-original.mp4

Windows seems to be busted now with those changes, but all the other platforms are fine.

Not sure why the video is corrupted since I can view it locally. The issue is that the video is in black-and-white with vertical bars of alternating contrast which is not what we're expecting.

Locally I'm able to produce corrupted videos on Windows with those two flags. When I remove the two flags, then everything works. Given that we've only had this failure on Android, we can change where we set these flags to be android-specific.

Regressions: 1816917
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: