Open Bug 1921101 Opened 1 year ago Updated 23 hours ago

Frequent perftest-support-class Critical: TEST-UNEXPECTED-FAIL | Some visual metrics have an erroneous value of 0.

Categories

(Testing :: Raptor, defect, P5)

defect

Tracking

(Not tracked)

REOPENED

People

(Reporter: intermittent-bug-filer, Unassigned)

Details

(Keywords: intermittent-failure)

Filed by: abutkovits [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=475792480&repo=mozilla-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/PC4KNTDfT5efOMXZkeH5yQ/runs/0/artifacts/public/logs/live_backing.log


[task 2024-09-25T19:01:15.251Z] 19:01:12     INFO -  perftest-output Info: PERFHERDER_DATA: {"framework": {"name": "browsertime"}, "suites": [{"name": "youtube-watch", "type": "pageload", "extraOptions": ["cold", "webrender"], "tags": ["cold", "webrender"], "lowerIsBetter": true, "unit": "ms", "alertThreshold": 2.0, "subtests": [{"name": "ContentfulSpeedIndex", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": true, "replicates": [643, 491, 556, 562, 556, 535, 596, 492, 443, 541, 538, 497, 586, 532, 465], "value": 533.2}, {"name": "FirstVisualChange", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": true, "replicates": [489, 416, 443, 476, 410, 458, 459, 423, 361, 458, 408, 388, 442, 388, 375], "value": 424.6}, {"name": "LastVisualChange", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": true, "replicates": [1199, 1129, 1167, 1206, 1099, 1161, 1198, 1151, 1089, 1210, 1119, 1125, 1170, 1099, 1101], "value": 1147.5}, {"name": "PerceptualSpeedIndex", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": true, "replicates": [563, 475, 524, 532, 477, 514, 532, 490, 424, 522, 465, 459, 510, 453, 430], "value": 489.7}, {"name": "SpeedIndex", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": true, "replicates": [681, 575, 637, 615, 590, 600, 638, 587, 522, 604, 547, 572, 617, 567, 524], "value": 590.3}, {"name": "cpuTime", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [9758, 9629, 9490, 9922, 9636, 9758, 9821, 9823, 9075, 9587, 9325, 9609, 9597, 9674, 9435], "value": 9607.0}, {"name": "fcp", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": true, "replicates": [468, 384, 422, 459, 383, 436, 446, 406, 345, 435, 392, 372, 420, 369, 351], "value": 404.1}, {"name": "fnbpaint", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [487, 403, 440, 479, 402, 455, 465, 423, 359, 448, 408, 387, 435, 383, 366], "value": 420.9}, {"name": "largestContentfulPaint", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": true, "replicates": [650, 464, 611, 537, 553, 509, 540, 487, 425, 508, 470, 489, 500, 530, 406], "value": 508.3}, {"name": "loadtime", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": true, "replicates": [654, 572, 614, 665, 559, 600, 643, 587, 530, 656, 577, 580, 628, 554, 510], "value": 593.5}]}, {"name": "youtube-watch", "type": "pageload", "extraOptions": ["webrender", "warm"], "tags": ["webrender", "warm"], "lowerIsBetter": true, "unit": "ms", "alertThreshold": 2.0, "subtests": [{"name": "ContentfulSpeedIndex", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": true, "replicates": [162, 194, 162, 162, 192, 182, 163, 176, 173, 167, 171, 179, 167, 182], "value": 173.4}, {"name": "FirstVisualChange", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": true, "replicates": [146, 194, 145, 146, 175, 166, 146, 159, 157, 150, 154, 162, 151, 165], "value": 157.8}, {"name": "LastVisualChange", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": true, "replicates": [791, 829, 787, 781, 830, 810, 799, 831, 794, 823, 819, 799, 787, 820], "value": 807.0}, {"name": "PerceptualSpeedIndex", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": true, "replicates": [147, 195, 146, 147, 176, 167, 147, 160, 158, 151, 155, 163, 152, 166], "value": 158.8}, {"name": "SpeedIndex", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": true, "replicates": [170, 218, 169, 172, 204, 194, 176, 186, 182, 180, 180, 188, 178, 191], "value": 184.4}, {"name": "cpuTime", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [1906, 2050, 1764, 1546, 1911, 1827, 1814, 1903, 1971, 2093, 2250, 1885, 1969, 1700, 1980], "value": 1897.6}, {"name": "fcp", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": true, "replicates": [137, 145, 136, 132, 156, 166, 130, 148, 140, 142, 409, 128, 150, 140, 148], "value": 152.7}, {"name": "fnbpaint", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [145, 152, 140, 140, 160, 171, 131, 154, 146, 144, 423, 134, 151, 141, 151], "value": 157.5}, {"name": "largestContentfulPaint", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": true, "replicates": [140, 148, 136, 135, 155, 167, 126, 150, 141, 140, 417, 130, 147, 136, 146], "value": 152.9}, {"name": "loadtime", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": true, "replicates": [228, 238, 223, 224, 244, 262, 214, 238, 232, 227, 515, 219, 235, 224, 234], "value": 244.0}]}], "application": {"name": "geckoview"}}
[task 2024-09-25T19:01:15.251Z] 19:01:12     INFO -  perftest-output Info: results can also be found locally at: /builds/task_172728910473501/workspace/build/raptor.json
[task 2024-09-25T19:01:15.251Z] 19:01:12     INFO -  perftest-results-handler Info: Validating PERFHERDER_DATA against /builds/task_172728910473501/workspace/mozharness/external_tools/performance-artifact-schema.json
[task 2024-09-25T19:01:15.251Z] 19:01:12     INFO -  perftest-results-handler Info: Writing video jobs and application data {'jobs': [{'browsertime_json_path': 'browsertime-results/youtube-watch/cold-browsertime.json', 'test_name': 'youtube-watch', 'tags': [], 'extra_options': ['cold', 'webrender'], 'accept_zero_vismet': False}, {'browsertime_json_path': 'browsertime-results/youtube-watch/warm-browsertime.json', 'test_name': 'youtube-watch', 'tags': [], 'extra_options': ['webrender', 'warm'], 'accept_zero_vismet': False}], 'application': {'name': 'geckoview'}, 'extra_options': ['cold', 'webrender']} into /builds/task_172728910473501/workspace/build/blobber_upload_dir/browsertime-results/jobs.json
[task 2024-09-25T19:01:15.251Z] 19:01:12    ERROR -  perftest-support-class Critical: TEST-UNEXPECTED-FAIL | Some visual metrics have an erroneous value of 0.
[task 2024-09-25T19:01:15.251Z] 19:01:12     INFO -  perftest-support-class Info: Visual metric tests failed: ['FirstVisualChange', 'LastVisualChange', 'SpeedIndex', 'PerceptualSpeedIndex', 'ContentfulSpeedIndex']
[task 2024-09-25T19:01:15.251Z] 19:01:12     INFO -  raptor-perftest Info: Removing temporary directory: /tmp/tmp5s7unbi1
[task 2024-09-25T19:01:15.251Z] 19:01:12    ERROR -  raptor-main Critical: TEST-UNEXPECTED-FAIL: no raptor test results were found for youtube-watch
[task 2024-09-25T19:01:15.251Z] 19:01:12     INFO - Return code: 1
[task 2024-09-25T19:01:15.251Z] 19:01:12  WARNING - setting return code to 1
[task 2024-09-25T19:01:15.251Z] 19:01:12     INFO - Killing logcat pid 823.
[task 2024-09-25T19:01:15.251Z] 19:01:12     INFO - Copying Raptor results to upload dir:
[task 2024-09-25T19:01:15.251Z] 19:01:12     INFO - /builds/task_172728910473501/workspace/build/blobber_upload_dir/perfherder-data.json
[task 2024-09-25T19:01:15.251Z] 19:01:12     INFO - Copying raptor results from /builds/task_172728910473501/workspace/build/raptor.json to /builds/task_172728910473501/workspace/build/blobber_upload_dir/perfherder-data.json
[task 2024-09-25T19:01:15.251Z] 19:01:12     INFO - Running post-action listener: _package_coverage_data
[task 2024-09-25T19:01:15.251Z] 19:01:12     INFO - Running post-action listener: _resource_record_post_action
[task 2024-09-25T19:01:15.251Z] 19:01:12     INFO - Running post-action listener: process_java_coverage_data
[task 2024-09-25T19:01:15.251Z] 19:01:12     INFO - Running post-action listener: stop_device
[task 2024-09-25T19:01:15.251Z] 19:01:14     INFO - /data/tombstones/tombstone_00 deleted
[task 2024-09-25T19:01:15.251Z] 19:01:14     INFO - /data/tombstones/tombstone_00.pb deleted
[task 2024-09-25T19:01:15.251Z] 19:01:14     INFO - Killing logcat pid 823.
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → INCOMPLETE
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
Status: REOPENED → RESOLVED
Closed: 1 year ago11 months ago
Resolution: --- → INCOMPLETE
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---

Hi Alex, could you please take a look at this or assign this to someone? I could not find anything relevant on what could have caused this

Recent failure log -> https://treeherder.mozilla.org/logviewer?job_id=528535204&repo=autoland&task=PdoQMDH1SSagAZVucWFyRA.0

Flags: needinfo?(afinder)
Summary: Intermittent perftest-support-class Critical: TEST-UNEXPECTED-FAIL | Some visual metrics have an erroneous value of 0. → Frequent perftest-support-class Critical: TEST-UNEXPECTED-FAIL | Some visual metrics have an erroneous value of 0.

@aerickson, could you please look into the Android issue from the latest Orange Graph ? Something seems to interfere on device RZCY2011M7N with the IMDB test video recording.

Flags: needinfo?(afinder) → needinfo?(aerickson)

I've asked LT to inspect the device. Quarantined.

Flags: needinfo?(aerickson)

The device is doing pretty well (mostly successful jobs) and it should restart after each test run... so if we're regularly seeing this, could it be the test doing it?

LT has fixed RZCY2011M7N. Removing from quarantine.

Alex, can you look into this for windows? the android failures are not the majority here. These seemed to become a normal think around Sept 18th (2 weeks ago).

:jmoss, can you indicate if around Sept 18th any updates were made the the windows11 NUC's ?

Flags: needinfo?(jmoss)
Flags: needinfo?(afinder)

oh wow, good find :sparky. :jmoss ^ video is something that should be interesting. do we have all updates automatically disabled? it is from today.

(In reply to Joel Maher ( :jmaher ) (UTC -8) from comment #44)

Alex, can you look into this for windows? the android failures are not the majority here. These seemed to become a normal think around Sept 18th (2 weeks ago).

:jmoss, can you indicate if around Sept 18th any updates were made the the windows11 NUC's ?

Hi Joel! I already notified jmoss on Wednesday about the Windows failures via the Matrix Performance Tests channel (link to the thread). The initial failure was not the same as the more recent Windows update that Sparky pointed out, most of the failures were caused by the Windows System About window interfering with the test runs (here is an example of it showing up in the video recordings for nytimes).

The Windows System About window issue seems to be resolved, but we're now getting the Windows Update issue that was also mentioned in the original Matrix thread on Wednesday and by Sparky in his previous comment.

Latest update from jmoss on that thread mentioned an update on the -alpha pool, he added a try push link. I don't have any additional context about this atm, but maybe jmoss could clarify here.

Looking at the Orange Factor graph in the previous comment from the Intermittent Failure bot, which are all from today, I am still seeing the Windows update showing up in the video recordings.

Flags: needinfo?(afinder)

(In reply to Alex Finder from comment #48)

(In reply to Joel Maher ( :jmaher ) (UTC -8) from comment #44)

Alex, can you look into this for windows? the android failures are not the majority here. These seemed to become a normal think around Sept 18th (2 weeks ago).

:jmoss, can you indicate if around Sept 18th any updates were made the the windows11 NUC's ?

Hi Joel! I already notified jmoss on Wednesday about the Windows failures via the Matrix Performance Tests channel (link to the thread). The initial failure was not the same as the more recent Windows update that Sparky pointed out, most of the failures were caused by the Windows System About window interfering with the test runs (here is an example of it showing up in the video recordings for nytimes).

The Windows System About window issue seems to be resolved, but we're now getting the Windows Update issue that was also mentioned in the original Matrix thread on Wednesday and by Sparky in his previous comment.

Latest update from jmoss on that thread mentioned an update on the -alpha pool, he added a try push link. I don't have any additional context about this atm, but maybe jmoss could clarify here.

Looking at the Orange Factor graph in the previous comment from the Intermittent Failure bot, which are all from today, I am still seeing the Windows update showing up in the video recordings.

This appears to be fixed on gecko-t/win11-64-24h2-hw-alpha. I'll roll this change out to gecko-t/win11-64-24h2-hw next week. The change that seemed to cause this was code that we were using to disable unnecessary scheduled tasks that run on Windows. Whenever generic worker runs a task, it creates a new user. It seems that there is a dependent scheduled task (that was disabled) that caused the out of box experience screen to hang longer than anticipated.

Flags: needinfo?(jmoss)

(In reply to :jmoss from comment #49)

(In reply to Alex Finder from comment #48)

(In reply to Joel Maher ( :jmaher ) (UTC -8) from comment #44)

Alex, can you look into this for windows? the android failures are not the majority here. These seemed to become a normal think around Sept 18th (2 weeks ago).

:jmoss, can you indicate if around Sept 18th any updates were made the the windows11 NUC's ?

Hi Joel! I already notified jmoss on Wednesday about the Windows failures via the Matrix Performance Tests channel (link to the thread). The initial failure was not the same as the more recent Windows update that Sparky pointed out, most of the failures were caused by the Windows System About window interfering with the test runs (here is an example of it showing up in the video recordings for nytimes).

The Windows System About window issue seems to be resolved, but we're now getting the Windows Update issue that was also mentioned in the original Matrix thread on Wednesday and by Sparky in his previous comment.

Latest update from jmoss on that thread mentioned an update on the -alpha pool, he added a try push link. I don't have any additional context about this atm, but maybe jmoss could clarify here.

Looking at the Orange Factor graph in the previous comment from the Intermittent Failure bot, which are all from today, I am still seeing the Windows update showing up in the video recordings.

This appears to be fixed on gecko-t/win11-64-24h2-hw-alpha. I'll roll this change out to gecko-t/win11-64-24h2-hw next week. The change that seemed to cause this was code that we were using to disable unnecessary scheduled tasks that run on Windows. Whenever generic worker runs a task, it creates a new user. It seems that there is a dependent scheduled task (that was disabled) that caused the out of box experience screen to hang longer than anticipated.

Just an update on this intermittent, seems we're still getting a high failure rate. Has the fix been rolled out to all of those windows11 machines ?

Flags: needinfo?(jmoss)

(In reply to Alex Finder from comment #60)

(In reply to :jmoss from comment #49)

(In reply to Alex Finder from comment #48)

(In reply to Joel Maher ( :jmaher ) (UTC -8) from comment #44)

Alex, can you look into this for windows? the android failures are not the majority here. These seemed to become a normal think around Sept 18th (2 weeks ago).

:jmoss, can you indicate if around Sept 18th any updates were made the the windows11 NUC's ?

Hi Joel! I already notified jmoss on Wednesday about the Windows failures via the Matrix Performance Tests channel (link to the thread). The initial failure was not the same as the more recent Windows update that Sparky pointed out, most of the failures were caused by the Windows System About window interfering with the test runs (here is an example of it showing up in the video recordings for nytimes).

The Windows System About window issue seems to be resolved, but we're now getting the Windows Update issue that was also mentioned in the original Matrix thread on Wednesday and by Sparky in his previous comment.

Latest update from jmoss on that thread mentioned an update on the -alpha pool, he added a try push link. I don't have any additional context about this atm, but maybe jmoss could clarify here.

Looking at the Orange Factor graph in the previous comment from the Intermittent Failure bot, which are all from today, I am still seeing the Windows update showing up in the video recordings.

This appears to be fixed on gecko-t/win11-64-24h2-hw-alpha. I'll roll this change out to gecko-t/win11-64-24h2-hw next week. The change that seemed to cause this was code that we were using to disable unnecessary scheduled tasks that run on Windows. Whenever generic worker runs a task, it creates a new user. It seems that there is a dependent scheduled task (that was disabled) that caused the out of box experience screen to hang longer than anticipated.

Just an update on this intermittent, seems we're still getting a high failure rate. Has the fix been rolled out to all of those windows11 machines ?

Not yet unfortunately. It will more than likely be early next week. We are going to be rolling this back to a previous known configuration to resolve these issues.

Flags: needinfo?(jmoss)

Hi jmoss! There has been an increase in failure rate on this intermittent lately. Looking at the artifacts from the failing jobs, it looks like it's still the cmd prompt issue. Can you provide an update on the investigation please ?

Flags: needinfo?(jmoss)

Deploying an update to windows11 hw performance pool today (gecko-t/win11-64-24h2-hw) which should resolve this.

Flags: needinfo?(jmoss)
Whiteboard: [stockwell disable-recommended]
You need to log in before you can comment on or make changes to this bug.