Closed Bug 1506936 Opened 10 months ago Closed 2 months ago

Intermittent raptor-main TEST-UNEXPECTED-FAIL: test 'raptor-tp6-facebook-firefox' timed out loading test page: https://www.facebook.com

Categories

(Testing :: Raptor, defect, P5)

Version 3
defect

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: intermittent-bug-filer, Assigned: rwood)

References

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell disabled])

Attachments

(1 file)

Filed by: rgurzau [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=211488631&repo=mozilla-central

https://queue.taskcluster.net/v1/task/TCo3WizOR9qhG1a_QE8Zpg/runs/0/artifacts/public/logs/live_backing.log

18:04:25    ERROR -  PID 3940 | JavaScript error: resource://gre/modules/WebProgressChild.jsm, line 58: TypeError: this.mm.content is null; can't access its "document" property
18:04:25     INFO -  PID 3940 | [Child 5096, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
18:04:25     INFO -  PID 3940 | [Child 5096, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
18:04:42     INFO -  raptor-mitmproxy Stopping mitmproxy playback, klling process 5568
18:04:52     INFO -  raptor-mitmproxy Successfully killed the mitmproxy playback process
18:04:52     INFO -  raptor-mitmproxy Turning off the browser proxy
18:04:52     INFO -  raptor-main removing webext C:\Users\task_1542131003\build\tests\raptor\webext\raptor
18:04:52     INFO -  results-handler summarizing raptor test results
18:04:52     INFO -  raptor-output ignoring the first dcf value due to initial pageload noise
18:04:52     INFO -  raptor-output ignoring the first fnbpaint value due to initial pageload noise
18:04:52     INFO -  raptor-output ignoring the first hero:hero1 value due to initial pageload noise
18:04:52     INFO -  raptor-output ignoring the first ttfi value due to initial pageload noise
18:04:52     INFO -  raptor-output ignoring the first dcf value due to initial pageload noise
18:04:52     INFO -  raptor-output ignoring the first fnbpaint value due to initial pageload noise
18:04:52     INFO -  raptor-output ignoring the first hero:hero1 value due to initial pageload noise
18:04:52     INFO -  raptor-output ignoring the first ttfi value due to initial pageload noise
18:04:52     INFO -  raptor-output ignoring the first dcf value due to initial pageload noise
18:04:52     INFO -  raptor-output ignoring the first fnbpaint value due to initial pageload noise
18:04:52     INFO -  raptor-output ignoring the first hero:hero1 value due to initial pageload noise
18:04:52     INFO -  raptor-output PERFHERDER_DATA: {"framework": {"name": "raptor"}, "suites": [{"extraOptions": [], "name": "raptor-tp6-amazon-firefox", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 477.44, "subtests": [{"name": "raptor-tp6-amazon-firefox-dcf", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 334.0, "replicates": [774, 319, 351, 318, 323, 387, 396, 427, 337, 400, 320, 327, 397, 311, 317, 529, 416, 388, 404, 324, 410, 331, 326, 316, 323], "unit": "ms"}, {"name": "raptor-tp6-amazon-firefox-fnbpaint", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 262.0, "replicates": [649, 262, 291, 258, 265, 262, 258, 292, 263, 269, 263, 267, 259, 242, 255, 402, 264, 262, 261, 259, 262, 259, 262, 261, 266], "unit": "ms"}, {"name": "raptor-tp6-amazon-firefox-hero:hero1", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 871.0, "replicates": [1161, 871, 911, 967, 840, 848, 883, 862, 887, 853, 890, 839, 829, 844, 871, 984, 884, 880, 872, 888, 840, 830, 889, 865, 868], "unit": "ms"}, {"name": "raptor-tp6-amazon-firefox-ttfi", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 681.0, "replicates": [1514, 770, 637, 785, 725, 616, 730, 734, 620, 745, 619, 621, 623, 595, 737, 756, 633, 610, 726, 739, 628, 621, 618, 726, 737], "unit": "ms"}], "type": "pageload", "unit": "ms"}, {"extraOptions": [], "name": "raptor-tp6-google-firefox", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 145.13, "subtests": [{"name": "raptor-tp6-google-firefox-dcf", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 114.0, "replicates": [435, 103, 128, 107, 128, 113, 122, 115, 108, 107, 105, 144, 128, 112, 117, 131, 122, 118, 115, 108, 108, 103, 126, 105, 108], "unit": "ms"}, {"name": "raptor-tp6-google-firefox-fnbpaint", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 122.0, "replicates": [425, 122, 123, 121, 124, 108, 137, 110, 125, 122, 120, 141, 123, 127, 112, 127, 116, 114, 110, 123, 123, 122, 121, 122, 124], "unit": "ms"}, {"name": "raptor-tp6-google-firefox-hero:hero1", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 259.0, "replicates": [632, 239, 250, 245, 222, 268, 328, 215, 285, 289, 289, 359, 220, 226, 219, 289, 233, 219, 219, 291, 292, 274, 219, 273, 290], "unit": "ms"}, {"name": "raptor-tp6-google-firefox-ttfi", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 123.0, "replicates": [492, 115, 122, 120, 123, 124, 136, 125, 124, 121, 119, 140, 123, 126, 126, 126, 133, 129, 125, 122, 122, 117, 120, 119, 123], "unit": "ms"}], "type": "pageload", "unit": "ms"}, {"extraOptions": [], "name": "raptor-tp6-youtube-firefox", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 371.47, "subtests": [{"name": "raptor-tp6-youtube-firefox-dcf", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 579.0, "replicates": [951, 620, 640, 669, 514, 591, 603, 519, 495, 554, 590, 504, 575, 587, 600, 509, 546, 570, 534, 591, 583, 526, 604, 590, 551], "unit": "ms"}, {"name": "raptor-tp6-youtube-firefox-fnbpaint", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 102.0, "replicates": [164, 86, 85, 85, 95, 94, 132, 100, 93, 116, 100, 100, 100, 152, 130, 98, 115, 124, 115, 103, 129, 101, 137, 124, 118], "unit": "ms"}, {"name": "raptor-tp6-youtube-firefox-hero:hero1", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 864.0, "replicates": [1225, 882, 937, 870, 724, 858, 901, 719, 708, 1261, 799, 738, 806, 787, 961, 717, 938, 892, 752, 854, 894, 732, 875, 918, 901], "unit": "ms"}], "type": "pageload", "unit": "ms"}]}
18:04:52     INFO -  raptor-output results can also be found locally at: C:\Users\task_1542131003\build\raptor.json
18:04:52     INFO -  raptor-control-server shutting down control server
18:04:52     INFO -  raptor-main finished
18:04:52     INFO -  raptor-main TEST-UNEXPECTED-FAIL: test 'raptor-tp6-facebook-firefox' timed out loading test page: https://www.facebook.com
18:04:52     INFO -  P/1.1" 200 -
18:04:52     INFO -  127.0.0.1 - - [13/Nov/2018 18:04:25] "POST / HTTP/1.1" 200 -
WARNING | IO Completion Port failed to signal process shutdown
Parent process 4600 exited with children alive:
PIDS: 1984, 5100, 6220, 5480, 5876
Attempting to kill them, but no guarantee of success
18:07:58    ERROR - Return code: 1

I believe this is from bug 1522111. I tried adding the jobs on the merge where this bug landed, however they were (still are) in queue for the last ~5h.

https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&resultStatus=running%2Cpending%2Crunnable&revision=62a8571d3b92170966779d37f26e9e9215380a1e&group_state=expanded&selectedJob=230020192&searchStr=windows%2C10%2Cx64%2Chw%2Cux%2Copt%2Craptor%2Cperformance%2Ctests%2Con%2Cfirefox%2Cwith%2Ce10s%2Ctest-windows10-64-ux%2Fopt-raptor-tp6-1-firefox-e10s%2Crap-e10s%28tp6-1%29

Justin can you provide some input here?

There are 232 total failures since the 20th of February.

Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=230034734&repo=autoland&lineNumber=601

04:27:37 INFO - raptor-main starting firefox
04:27:37 INFO - Application command: /Users/cltbld/tasks/task_1550890182/build/application/Firefox Nightly.app/Contents/MacOS/firefox -foreground -profile /var/folders/d4/8__57x3x48l6x022y1wft1xm00000x/T/tmpW6FZS_.mozrunner
04:27:38 INFO - 127.0.0.1 - - [22/Feb/2019 20:27:38] "POST / HTTP/1.1" 200 -
04:27:38 INFO - raptor-control-server received webext_status: raptor runner.js is loaded!
04:27:38 INFO - 127.0.0.1 - - [22/Feb/2019 20:27:38] "GET /raptor-tp6-youtube-firefox.json HTTP/1.1" 200 -
04:27:38 INFO - raptor-control-server reading test settings from raptor-tp6-youtube-firefox.json
04:30:19 INFO - raptor-mitmproxy Turning off the browser proxy
04:30:19 INFO - raptor-mitmproxy writing: /Users/cltbld/tasks/task_1550890182/build/application/Firefox Nightly.app/Contents/Resources/distribution/policies.json
04:30:19 INFO - raptor-main removing webext /Users/cltbld/tasks/task_1550890182/build/tests/raptor/webext/raptor
04:30:19 INFO - results-handler summarizing raptor test results
04:30:19 INFO - raptor-output ignoring the first fcp value due to initial pageload noise
04:30:19 INFO - raptor-output turning on subtest alerting for measurement type: fcp
04:30:19 INFO - raptor-output ignoring the first fnbpaint value due to initial pageload noise
04:30:19 INFO - raptor-output ignoring the first hero:hero1 value due to initial pageload noise
04:30:19 INFO - raptor-output ignoring the first loadtime value due to initial pageload noise
04:30:19 INFO - raptor-output turning on subtest alerting for measurement type: loadtime
04:30:19 INFO - raptor-output ignoring the first dcf value due to initial pageload noise
04:30:19 INFO - raptor-output ignoring the first ttfi value due to initial pageload noise
04:30:19 INFO - raptor-output ignoring the first fcp value due to initial pageload noise
04:30:19 INFO - raptor-output turning on subtest alerting for measurement type: fcp
04:30:19 INFO - raptor-output ignoring the first fnbpaint value due to initial pageload noise
04:30:19 INFO - raptor-output ignoring the first hero:hero1 value due to initial pageload noise
04:30:19 INFO - raptor-output ignoring the first loadtime value due to initial pageload noise
04:30:19 INFO - raptor-output turning on subtest alerting for measurement type: loadtime
04:30:19 INFO - raptor-output ignoring the first dcf value due to initial pageload noise
04:30:19 INFO - raptor-output ignoring the first ttfi value due to initial pageload noise
04:30:19 INFO - raptor-output ignoring the first dcf value due to initial pageload noise
04:30:19 INFO - raptor-output ignoring the first fcp value due to initial pageload noise
04:30:19 INFO - raptor-output turning on subtest alerting for measurement type: fcp
04:30:19 INFO - raptor-output ignoring the first fnbpaint value due to initial pageload noise
04:30:19 INFO - raptor-output ignoring the first hero:hero1 value due to initial pageload noise
04:30:19 INFO - raptor-output ignoring the first loadtime value due to initial pageload noise
04:30:19 INFO - raptor-output turning on subtest alerting for measurement type: loadtime
04:30:19 INFO - raptor-output PERFHERDER_DATA: {"framework": {"name": "raptor"}, "suites": [{"extraOptions": [], "name": "raptor-tp6-amazon-firefox", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 863.6, "subtests": [{"name": "fcp", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 573.5, "shouldAlert": true, "replicates": [1232, 649, 545, 676, 544, 663, 548, 491, 545, 605, 540, 604, 478, 679, 556, 590, 508, 627, 610, 811, 553, 519, 697, 574, 573], "unit": "ms"}, {"name": "fnbpaint", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 540.5, "replicates": [1156, 591, 520, 644, 521, 622, 515, 464, 513, 561, 509, 575, 454, 647, 529, 568, 478, 602, 581, 773, 521, 492, 654, 541, 540], "unit": "ms"}, {"name": "hero:hero1", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 1062.5, "replicates": [1641, 973, 1035, 1313, 805, 1242, 1090, 732, 1210, 991, 1185, 1316, 1017, 1146, 983, 931, 728, 1183, 880, 1151, 1135, 930, 1369, 921, 1141], "unit": "ms"}, {"name": "loadtime", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 772.0, "shouldAlert": true, "replicates": [1419, 803, 719, 896, 693, 853, 774, 635, 932, 858, 891, 1008, 689, 897, 701, 770, 645, 761, 764, 1031, 740, 810, 971, 760, 702], "unit": "ms"}, {"name": "dcf", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 698.5, "replicates": [1379, 762, 684, 877, 660, 818, 653, 599, 661, 742, 625, 719, 630, 829, 669, 685, 612, 728, 729, 998, 721, 693, 853, 704, 670], "unit": "ms"}, {"name": "ttfi", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 2334.0, "replicates": [3629, 2509, 2289, 2617, 2065, 2399, 2242, 1947, 2138, 2452, 2319, 6588, 2204, 2349, 2313, 2568, 2231, 2197, 2446, 2262, 2412, 2635, 2614, 2433, 2166], "unit": "ms"}], "type": "pageload", "unit": "ms"}, {"extraOptions": [], "name": "raptor-tp6-google-firefox", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 292.91, "subtests": [{"name": "fcp", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 226.0, "shouldAlert": true, "replicates": [913, 242, 251, 225, 230, 219, 226, 227, 243, 210, 267, 233, 226, 212, 275, 218, 227, 238, 210, 215, 241, 215, 216, 226, 214], "unit": "ms"}, {"name": "fnbpaint", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 204.5, "replicates": [835, 212, 223, 200, 205, 205, 212, 200, 220, 198, 238, 205, 201, 199, 249, 204, 198, 225, 195, 202, 214, 201, 203, 211, 200], "unit": "ms"}, {"name": "hero:hero1", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 399.5, "replicates": [1039, 305, 379, 276, 317, 298, 446, 428, 288, 422, 540, 558, 425, 292, 608, 427, 462, 283, 420, 437, 512, 306, 320, 299, 285], "unit": "ms"}, {"name": "loadtime", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 249.0, "shouldAlert": true, "replicates": [916, 279, 330, 246, 270, 261, 259, 248, 247, 244, 327, 259, 246, 241, 323, 247, 245, 243, 237, 246, 266, 250, 271, 255, 243], "unit": "ms"}, {"name": "dcf", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 199.0, "replicates": [816, 199, 205, 204, 217, 191, 199, 205, 187, 184, 218, 210, 205, 203, 230, 191, 203, 186, 181, 189, 224, 180, 182, 196, 187], "unit": "ms"}, {"name": "ttfi", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 689.0, "replicates": [1578, 734, 763, 717, 829, 664, 612, 596, 713, 607, 743, 775, 704, 766, 929, 610, 690, 545, 612, 621, 701, 634, 664, 688, 591], "unit": "ms"}], "type": "pageload", "unit": "ms"}, {"extraOptions": [], "name": "raptor-tp6-youtube-firefox", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 985.55, "subtests": [{"name": "dcf", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 1301.5, "replicates": [2047, 1550, 1516, 1808, 1287, 1195, 1270, 1309, 1193, 1423, 1312, 999, 1280, 2304, 1244, 1246, 1294, 1360, 1336, 1200, 1336, 1336, 1270, 2525, 1277], "unit": "ms"}, {"name": "fcp", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 750.5, "shouldAlert": true, "replicates": [1359, 826, 825, 890, 796, 656, 636, 738, 694, 551, 870, 554, 715, 928, 758, 727, 740, 694, 793, 747, 784, 704, 754, 781, 793], "unit": "ms"}, {"name": "fnbpaint", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 283.0, "replicates": [237, 187, 231, 310, 281, 211, 250, 329, 318, 191, 360, 186, 224, 348, 281, 274, 285, 298, 266, 336, 288, 308, 260, 303, 305], "unit": "ms"}, {"name": "hero:hero1", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 1983.5, "replicates": [2714, 2165, 2074, 2326, 1987, 1650, 1953, 2057, 1673, 2006, 1797, 1632, 1741, 2857, 1894, 2014, 1980, 2022, 3477, 1732, 1912, 2095, 1725, 3040, 1812], "unit": "ms"}, {"name": "loadtime", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 1693.0, "shouldAlert": true, "replicates": [2429, 1932, 1844, 2281, 1691, 1537, 1673, 1857, 1442, 1514, 1613, 1460, 1522, 2614, 1610, 1834, 1864, 1704, 3384, 1335, 1695, 1795, 1615, 2892, 1583], "unit": "ms"}], "type": "pageload", "unit": "ms"}]}
04:30:19 INFO - raptor-output results can also be found locally at: /Users/cltbld/tasks/task_1550890182/build/raptor.json
04:30:19 INFO - raptor-control-server shutting down control server
04:30:19 INFO - raptor-main finished
04:30:19 INFO - raptor-main TEST-UNEXPECTED-FAIL: test 'raptor-tp6-facebook-firefox' timed out loading test page: https://www.facebook.com
04:30:19 ERROR - Return code: 1
04:30:19 WARNING - setting return code to 1
04:30:20 INFO - Validating PERFHERDER_DATA against /Users/cltbld/tasks/task_1550890182/mozharness/external_tools/performance-artifact-schema.json

Flags: needinfo?(bugspam.Callek)
Whiteboard: [stockwell needswork:owner]

this looks to be a perma fail- 156 failures in 1 day; we should turn this off- I see on mozilla-central this started end of day thursday/early friday:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&searchStr=raptor&fromchange=e92ff56d2be21676b447c6fbb87b4c4479539bc9

if we want to turn this off, we can edit:
https://searchfox.org/mozilla-central/source/taskcluster/ci/test/raptor.yml#44

and add a:
run-on-projects: ['try']

I assume rwood will know what is wrong and can fix it when he is online Monday.

Flags: needinfo?(fstrugariu)
Flags: needinfo?(dave.hunt)

I don't see the issue off-hand but I'll put up a patch to temporarily disable raptor-tp6-facebook, looks like this happens on Firefox only and not on the Raptor Chromium job.

Flags: needinfo?(rwood)
Pushed by rwood@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/05dc22d0131b
Temporarily disable raptor-tp6-facebook-firefox due to permafail; r=jmaher,davehunt
Pushed by csabou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2e2e65e57b8c
Temporarily disable raptor-tp6-facebook-firefox due to permafail. r=jmaher,davehunt

After looking a bit a the failures and having a green try with the modifications I made I decided to push this on tree: https://treeherder.mozilla.org/#/jobs?repo=try&resultStatus=success%2Cpending%2Crunning%2Ctestfailed%2Cbusted%2Cexception&classifiedState=unclassified&revision=47236ab7167401dc2d8794c173b823eaba8a6b69&selectedJob=230105894

Carring the r+ that Robert had. Please feel free to backout or modify if something isn't how it should be. Thanks.

Flags: needinfo?(rwood)
Status: NEW → RESOLVED
Closed: 7 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla67

Thank you :NarcisB and :nataliaCs for taking care of that - apologies I should have pushed my fix to try first. :(

I also filed Bug 1530350 to change the Raptor unit tests so disabling one won't cause the unit tests to fail.

Blocks: 1530353
Flags: needinfo?(bugspam.Callek)
Assignee: nobody → rwood

This was only disabled so reopening, restoring flags and putting the leave-open tag on.

Status: RESOLVED → REOPENED
Keywords: leave-open
Resolution: FIXED → ---
Whiteboard: [stockwell disable-recommended] → [stockwell disabled]
Target Milestone: mozilla67 → ---
Flags: needinfo?(fstrugariu)

This has been re-enabled via bug 1530353 and the failures have returned to intermittent.

Flags: needinfo?(dave.hunt)

There are 77 total failures in the last 7 days

Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=240684217&repo=autoland&lineNumber=1640

15:38:23 INFO - raptor-output PERFHERDER_DATA: {"framework": {"name": "raptor"}, "suites": [{"extraOptions": [], "name": "raptor-tp6-amazon-firefox", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 462.08, "subtests": [{"name": "dcf", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 393.0, "replicates": [972, 399, 385, 387, 402, 399, 394, 392, 376, 390, 392, 365, 422, 415, 406, 397, 374, 362, 400, 381, 388, 338, 410, 405, 404], "unit": "ms"}, {"name": "fcp", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 358.0, "shouldAlert": true, "replicates": [983, 368, 357, 358, 370, 367, 359, 355, 350, 355, 354, 334, 385, 375, 371, 372, 350, 341, 358, 357, 356, 340, 380, 382, 367], "unit": "ms"}, {"name": "fnbpaint", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 338.0, "replicates": [918, 339, 337, 332, 344, 337, 339, 328, 333, 329, 338, 313, 361, 355, 343, 351, 333, 313, 334, 339, 338, 310, 353, 357, 345], "unit": "ms"}, {"name": "loadtime", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 958.0, "shouldAlert": true, "replicates": [1817, 1026, 878, 999, 948, 955, 970, 935, 965, 922, 964, 908, 1017, 969, 941, 981, 947, 945, 942, 998, 916, 906, 987, 961, 977], "unit": "ms"}], "type": "pageload", "unit": "ms"}, {"extraOptions": [], "name": "raptor-tp6-google-firefox", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 264.83, "subtests": [{"name": "dcf", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 213.5, "replicates": [498, 217, 212, 222, 198, 212, 197, 219, 212, 214, 232, 218, 205, 230, 223, 240, 213, 212, 207, 206, 204, 228, 196, 235, 220], "unit": "ms"}, {"name": "fcp", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 232.0, "shouldAlert": true, "replicates": [553, 227, 224, 231, 231, 220, 232, 244, 222, 225, 243, 231, 236, 237, 231, 248, 243, 247, 218, 239, 239, 240, 229, 262, 232], "unit": "ms"}, {"name": "fnbpaint", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 215.5, "replicates": [491, 213, 208, 219, 215, 207, 215, 215, 207, 210, 228, 214, 220, 225, 219, 236, 209, 230, 203, 224, 222, 224, 213, 231, 216], "unit": "ms"}, {"name": "loadtime", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 460.5, "shouldAlert": true, "replicates": [1423, 493, 502, 444, 461, 288, 460, 438, 435, 415, 488, 458, 467, 450, 461, 500, 496, 477, 408, 292, 471, 454, 442, 467, 463], "unit": "ms"}], "type": "pageload", "unit": "ms"}, {"extraOptions": [], "name": "raptor-tp6-youtube-firefox", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 435.18, "subtests": [{"name": "dcf", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 572.5, "replicates": [1402, 627, 619, 665, 539, 572, 568, 567, 573, 583, 572, 560, 615, 591, 579, 562, 566, 593, 558, 590, 568, 647, 553, 593, 560], "unit": "ms"}, {"name": "fcp", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 386.5, "shouldAlert": true, "replicates": [1190, 428, 419, 442, 365, 386, 387, 380, 381, 393, 378, 361, 425, 405, 392, 374, 385, 404, 376, 402, 381, 465, 382, 408, 376], "unit": "ms"}, {"name": "fnbpaint", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 198.0, "replicates": [448, 182, 185, 190, 197, 201, 202, 183, 201, 197, 194, 177, 240, 220, 232, 190, 201, 209, 192, 218, 195, 267, 199, 223, 193], "unit": "ms"}, {"name": "loadtime", "lowerIsBetter": true, "alertThreshold": 2.0, "value": 817.5, "shouldAlert": true, "replicates": [2064, 830, 820, 869, 750, 882, 769, 1018, 775, 784, 868, 877, 815, 872, 912, 767, 886, 794, 869, 799, 769, 910, 785, 807, 767], "unit": "ms"}], "type": "pageload", "unit": "ms"}]}
15:38:23 INFO - raptor-output results can also be found locally at: C:\Users\task_1555427573\build\raptor.json
15:38:23 INFO - raptor-control-server shutting down control server
15:38:23 INFO - raptor-main finished
15:38:23 INFO - raptor-main TEST-UNEXPECTED-FAIL: test 'raptor-tp6-facebook-firefox' timed out loading test page: https://www.facebook.com pending metrics: fcp, fnb paint, dcf, load time
15:38:23 INFO - 0 -
15:38:23 INFO - 127.0.0.1 - - [16/Apr/2019 15:37:49] "POST / HTTP/1.1" 200 -
15:38:23 INFO - 127.0.0.1 - - [16/Apr/2019 15:37:49] "POST / HTTP/1.1" 200 -
15:38:23 INFO - 127.0.0.1 - - [16/Apr/2019 15:37:49] "POST / HTTP/1.1" 200 -

Robert are there updates here?

Flags: needinfo?(rwood)

:marauder, could you please have a look at this and see if you can reproduce it locally? Maybe that will give some clues. Thanks!

Flags: needinfo?(rwood) → needinfo?(marian.raiciof)

:rwood the test failed locally after 10 runs.

INFO - raptor-control-server received webext_raptor-page-timeout: [u'raptor-tp6-facebook-firefox', u'https://www.facebook.com', {u'fcp': True, u'hero': False, u'dcf': True, u'fnb paint': True, u'ttfi': False, u'load time': True}]

All metrics are pending when it fails (considering the fact that hero and ttfi have been removed).
I'll continue the tests after i'll finish the cold measurements for noise detection.

Thanks!

Flags: needinfo?(marian.raiciof)

Over the last 7 days there are 41 failures on this bug. These happen on: linux64-qr, linux64-shippable, linux64-shippable-qr, osx-10-10-shippable, windows10-64-qr, windows10-64-shippable, windows7-32-shippable

Here is the latest failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=242442942&repo=autoland&lineNumber=1241

Flags: needinfo?(rwood)

So having a look at the log it looks like that the very last page cycle causes this problem:

https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=242442942&repo=autoland&lineNumber=795-899

In this case 24 cycles were successful but the very last one fails. That's a bit strange.

From the code in runner.js I can see that a screenshot could be taken. I wonder why we don't do it in such a case.

Rob, is there a reason for? Such a screenshot would certainly help a lot.

Maybe in case of failures we should always screenshot and not make it dependent on the settings.screen_capture value.

I did two push to try with screen_capture enabled.
fuzzy - https://treeherder.mozilla.org/#/jobs?repo=try&revision=22b94428688371c96908097f7811fcbda5562086
full - https://treeherder.mozilla.org/#/jobs?repo=try&revision=ddf21b2e662db5a3535a29ca2210c339e006d946

I'll retrigger the job several times to get the screenshots when the test will fail.

On local environment, the page is loaded but the script fails to get the metrics - it fails on awaiting results:
awaiting results... runner.js:245:11

LOG: https://drive.google.com/file/d/1eg2Wtb4T2XooytVpAhTuYwECuXB0wleS/view?usp=sharing

begin pagecycle 15 runner.js:548:13
posting to control server runner.js:563:13
begin pagecycle 15 runner.js:564:13
now is 1556026606836, set raptor alarm raptor-page-timeout to expire at 1556026666836 runner.js:451:11
post success runner.js:555:15
post success runner.js:555:15

update tab: 1 runner.js:548:13
posting to control server runner.js:563:13
update tab: 1 runner.js:564:13

test tab updated: 1 runner.js:548:13
posting to control server runner.js:563:13
test tab updated: 1 runner.js:564:13
awaiting results... runner.js:245:11

post success 2 runner.js:555:15 ---> I don't know where is that "2" coming from

raptor-page-timeout on https://www.facebook.com runner.js:421:11
posting to control server runner.js:563:13
Array(3) [ "raptor-tp6-facebook-firefox", "https://www.facebook.com", {…} ]
runner.js:564:13
closed tab 1 runner.js:578:13
pageload test finished runner.js:589:11

__raptor_shutdownBrowser runner.js:548:13
posting to control server runner.js:563:13
__raptor_shutdownBrowser runner.js:564:13

Removed tab: 1 runner.js:548:13
posting to control server runner.js:563:13
Removed tab: 1 runner.js:564:13
post success 2 runner.js:555:15
post success runner.js:555:15

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #47)

Maybe in case of failures we should always screenshot and not make it dependent on the settings.screen_capture value.

So we had initially decided against enabling screenshots by default in production just because of the large number of jobs on all the devices/browsers/platforms, could potentially create a large number of screenshot artifacts (even when just taking a screenshot on timeouts). So we just relied on setting screen capture on and pushing to try to reproduce timeouts. Right now if you enable screen capture it will take a screen capture after every page cycle (not just on a timeout).

Maybe we should reconsider this. If you want to file a new bug to screen capture only on timeouts? Maybe a new test INI setting "screen_capture_timeouts"? Or just enable screen captures automatically for timeouts (regardless of the the screen capture INI setting)? Open to ideas, whatever you think, we might as well give it a try. Thanks!

Flags: needinfo?(rwood)

Other harnesses do always screenshot Firefox when a failure occurs. And that is pretty helpful. The number of failures is low as what I can see when checking on various repositories in Treeherder. And just a couple more kB aren't a problem in S3. So I would really suggest to always screenshot, and not bind it to another setting. I filed bug 1547069 for that.

Closing this as it's not happening anymore and we re-recorded the Facebook test

Status: REOPENED → RESOLVED
Closed: 7 months ago2 months ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.