Closed Bug 1548204 Opened 5 years ago Closed 5 years ago

Perma raptor-tp6-8 OSError: [Errno 2] No such file or directory: '/var/folders/27/t6svqqlx251b9v8k9j0hhls000000w/T/tmpC9Vke3'

Categories

(Testing :: Raptor, defect, P5)

Version 3
defect

Tracking

(firefox-esr60 unaffected, firefox67 unaffected, firefox68 unaffected, firefox69 fixed)

RESOLVED FIXED
mozilla69
Tracking Status
firefox-esr60 --- unaffected
firefox67 --- unaffected
firefox68 --- unaffected
firefox69 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: garbas)

References

(Regression)

Details

(Keywords: intermittent-failure, regression)

Attachments

(1 file)

Filed by: apavel [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=243835650&repo=mozilla-central
Full log: https://queue.taskcluster.net/v1/task/bXGTZjBEQ02pNEs3bgTXMA/runs/0/artifacts/public/logs/live_backing.log


05:31:53 INFO - raptor-main starting firefox
05:31:53 INFO - Application command: /Users/cltbld/tasks/task_1556686586/build/application/Firefox Nightly.app/Contents/MacOS/firefox -foreground -profile /var/folders/27/t6svqqlx251b9v8k9j0hhls000000w/T/tmp7GvRLR.mozrunner
05:31:53 INFO - raptor-main initializing gecko profiler
05:31:53 INFO - Clearing archive /Users/cltbld/tasks/task_1556686586/build/blobber_upload_dir/profile_raptor-tp6-linkedin-firefox.zip
05:31:53 INFO - Activating gecko profiling, temp profile dir: /var/folders/27/t6svqqlx251b9v8k9j0hhls000000w/T/tmpC9Vke3, interval: 1, entries: 2000000
05:31:56 INFO - raptor-control-server received webext_status: raptor runner.js is loaded!

05:34:00 INFO - raptor-main removing webext /Users/cltbld/tasks/task_1556686586/build/tests/raptor/raptor/../webext/raptor
05:34:05 INFO - Wed May 1 05:34:00 2019 MainThread Retrieving symbol zip from https://queue.taskcluster.net/v1/task/C_0cLfb-TZq1kmYwH38DxA/artifacts/public/build/target.crashreporter-symbols.zip...
05:34:05 INFO - raptor-control-server shutting down control server
05:34:05 INFO - raptor-main finished
05:34:05 ERROR - Traceback (most recent call last):
05:34:05 INFO - File "/Users/cltbld/tasks/task_1556686586/build/tests/raptor/raptor/raptor.py", line 964, in <module>
05:34:05 INFO - main()
05:34:05 INFO - File "/Users/cltbld/tasks/task_1556686586/build/tests/raptor/raptor/raptor.py", line 927, in main
05:34:05 INFO - raptor.run_test(next_test, timeout=int(next_test['page_timeout']))
05:34:05 INFO - File "/Users/cltbld/tasks/task_1556686586/build/tests/raptor/raptor/raptor.py", line 457, in run_test
05:34:05 INFO - self.run_test_teardown()
05:34:05 INFO - File "/Users/cltbld/tasks/task_1556686586/build/tests/raptor/raptor/raptor.py", line 470, in run_test_teardown
05:34:05 INFO - super(RaptorDesktop, self).run_test_teardown()
05:34:05 INFO - File "/Users/cltbld/tasks/task_1556686586/build/tests/raptor/raptor/raptor.py", line 189, in run_test_teardown
05:34:05 INFO - self.gecko_profiler.symbolicate()
05:34:05 INFO - File "/Users/cltbld/tasks/task_1556686586/build/tests/raptor/raptor/gecko_profile.py", line 171, in symbolicate
05:34:05 INFO - for profile_filename in os.listdir(self.gecko_profile_dir):
05:34:05 INFO - OSError: [Errno 2] No such file or directory: '/var/folders/27/t6svqqlx251b9v8k9j0hhls000000w/T/tmpC9Vke3'
05:34:05 ERROR - Return code: 1
05:34:05 WARNING - setting return code to 1
05:34:05 INFO - copying raptor results to upload dir:
05:34:05 INFO - /Users/cltbld/tasks/task_1556686586/build/blobber_upload_dir/perfherder-data.json
05:34:05 INFO - copying raptor results from /Users/cltbld/tasks/task_1556686586/build/raptor.json to /Users/cltbld/tasks/task_1556686586/build/blobber_upload_dir/perfherder-data.json
05:34:05 CRITICAL - Error copying results /Users/cltbld/tasks/task_1556686586/build/raptor.json to upload dir /Users/cltbld/tasks/task_1556686586/build/blobber_upload_dir/perfherder-data.json
05:34:05 INFO - [Errno 2] No such file or directory: u'/Users/cltbld/tasks/task_1556686586/build/raptor.json'
05:34:05 INFO - Running post-action listener: _package_coverage_data
05:34:05 INFO - Running post-action listener: _resource_record_post_action
05:34:05 INFO - Running post-action listener: process_java_coverage_data
05:34:05 INFO - Running post-action listener: stop_device
05:34:05 INFO - [mozharness: 2019-05-01 05:34:05.760067Z] Finished run-tests step (success)
05:34:05 INFO - Running post-run listener: _resource_record_post_run
05:34:05 INFO - Total resource usage - Wall time: 257s; CPU: 50.0%; Read bytes: 331445760; Write bytes: 1661545984; Read time: 1513; Write time: 3701
05:34:05 INFO - TinderboxPrint: CPU usage<br/>50.2%
05:34:05 INFO - TinderboxPrint: I/O read bytes / time<br/>331,445,760 / 1,513
05:34:05 INFO - TinderboxPrint: I/O write bytes / time<br/>1,661,545,984 / 3,701
05:34:05 INFO - TinderboxPrint: CPU idle<br/>511.8 (49.8%)
05:34:05 INFO - TinderboxPrint: CPU system<br/>241.1 (23.4%)
05:34:05 INFO - TinderboxPrint: CPU user<br/>275.6 (26.8%)
05:34:05 INFO - TinderboxPrint: Swap in / out<br/>616,218,624 / 0
05:34:05 INFO - install - Wall time: 25s; CPU: 54.0%; Read bytes: 259876864; Write bytes: 260581376; Read time: 20835; Write time: 560
05:34:05 INFO - run-tests - Wall time: 232s; CPU: 50.0%; Read bytes: 297096704; Write bytes: 1367524864; Read time: 1306; Write time: 3059
05:34:05 WARNING - returning nonzero exit status 1
[taskcluster 2019-05-01T05:34:06.010Z] Exit Code: 1

Bob, if you have time, can you take a look at this?

Flags: needinfo?(bob)

this is very similar to Bug 1548008 in that the test fails due to

05:34:00     INFO -  mozproxy Aborting: Mitmproxy process did not startup

The code fails during the teardown since self.gecko_profile_dir has not been populated.

There are two issues here:

  1. the mitmproxy failure

  2. raptor's failure to gracefully terminate when the test fails to start properly and raptor has not fully initialized.

whimboo, rwood: can you take it from here?

Flags: needinfo?(rwood)
Flags: needinfo?(hskupin)
Flags: needinfo?(bob)

There is also https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=243839389&repo=mozilla-central&lineNumber=809 that seems to be the same issue here?

06:22:20 INFO - raptor-main starting firefox
06:22:20 INFO - Application command: C:\Users\task_1556689150\build\application\firefox\firefox.exe --wait-for-browser -profile c:\users\task_1556689150\appdata\local\temp\tmpmwmfc1.mozrunner
06:22:20 INFO - raptor-main initializing gecko profiler
06:22:20 INFO - Clearing archive C:\Users\task_1556689150\build\blobber_upload_dir\profile_raptor-tp6-linkedin-firefox.zip
06:22:20 INFO - Activating gecko profiling, temp profile dir: c:\users\task_1556689150\appdata\local\temp\tmpckztyn, interval: 1, entries: 2000000

06:23:35 INFO - raptor-main test uses playback tool: mitmproxy
06:23:35 INFO - mozproxy mozproxy_dir used for mitmproxy downloads and exe files: C:\Users\task_1556689150\testing\mozproxy
06:23:35 INFO - mozproxy downloading mitmproxy binary
06:23:35 INFO - mozproxy INFO - Cache matches, avoiding extracting in 'mitmdump-win-2.0.2'
06:23:35 INFO - mozproxy downloading mitmproxy pageset
06:23:35 INFO - mozproxy INFO - File mitmproxy-tp6-wikipedia.zip retrieved from local cache c:\tooltool-cache
06:23:35 INFO - mozproxy INFO - unzipping "mitmproxy-tp6-wikipedia.zip"
06:23:36 INFO - mozproxy mitmdump path: C:\Users\task_1556689150\testing\mozproxy\mitmdump
06:23:36 INFO - mozproxy browser path: C:\Users\task_1556689150\build\application\firefox\firefox.exe
06:23:36 INFO - mozproxy Starting mitmproxy playback using env path: C:\Users\task_1556689150\build\application\firefox;C:\Users\task_1556689150\build\venv\Scripts;C:\windows\system32;C:\windows;C:\windows\System32\Wbem;C:\windows\System32\WindowsPowerShell\v1.0;C:\Program Files\Mellanox\MLNX_VPI\IB\Tools;C:\Program Files\Mellanox\MLNX_CIMProvider\lib\mft;C:\windows\system32\config\systemprofile\AppData\Local\Microsoft\WindowsApps;C:\Program Files (x86)\Windows Kits\8.1\Windows Performance Toolkit;C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit;;C:\Program Files\Mercurial;C:\mozilla-build\bin;C:\mozilla-build\kdiff3;C:\mozilla-build\moztools-x64\bin;C:\mozilla-build\mozmake;C:\mozilla-build\msys\bin;C:\mozilla-build\msys\local\bin;C:\mozilla-build\nsis-3.01;C:\mozilla-build\python;C:\mozilla-build\python\Scripts;C:\mozilla-build\python3;C:\Users\task_1556689150\AppData\Local\Microsoft\WindowsApps
06:23:36 INFO - mozproxy Starting mitmproxy playback using command: C:\Users\task_1556689150\testing\mozproxy\mitmdump --replay-kill-extra --script ""C:\\Users\\task_1556689150\\build\\tests\\raptor\\raptor\\playback\\alternate-server-replay-2.0.2.py C:\\Users\\task_1556689150\\testing\\mozproxy\\wikipedia.mp""
06:24:06 INFO - mozproxy Aborting: Mitmproxy process did not startup
06:24:06 INFO - mozproxy Stopping mitmproxy playback, killing process 6240
06:24:06 INFO - mozproxy Mitmproxy exited with error code 1
06:24:06 INFO - raptor-main removing webext C:\Users\task_1556689150\build\tests\raptor\raptor..\webext\raptor
06:24:07 INFO - Wed May 1 06:24:06 2019 MainThread Retrieving symbol zip from https://queue.taskcluster.net/v1/task/KLUhHwJGRNWzb3f6L_VWNw/artifacts/public/build/target.crashreporter-symbols.zip...
06:24:07 INFO - raptor-control-server shutting down control server
06:24:07 INFO - raptor-main finished
06:24:07 ERROR - Traceback (most recent call last):
06:24:07 INFO - File "C:\Users\task_1556689150\build\tests\raptor\raptor\raptor.py", line 964, in <module>
06:24:07 INFO - main()
06:24:07 INFO - File "C:\Users\task_1556689150\build\tests\raptor\raptor\raptor.py", line 927, in main
06:24:07 INFO - raptor.run_test(next_test, timeout=int(next_test['page_timeout']))
06:24:07 INFO - File "C:\Users\task_1556689150\build\tests\raptor\raptor\raptor.py", line 457, in run_test
06:24:07 INFO - self.run_test_teardown()
06:24:07 INFO - File "C:\Users\task_1556689150\build\tests\raptor\raptor\raptor.py", line 470, in run_test_teardown
06:24:07 INFO - super(RaptorDesktop, self).run_test_teardown()
06:24:07 INFO - File "C:\Users\task_1556689150\build\tests\raptor\raptor\raptor.py", line 189, in run_test_teardown
06:24:07 INFO - self.gecko_profiler.symbolicate()
06:24:07 INFO - File "C:\Users\task_1556689150\build\tests\raptor\raptor\gecko_profile.py", line 171, in symbolicate
06:24:07 INFO - for profile_filename in os.listdir(self.gecko_profile_dir):
06:24:07 INFO - WindowsError: [Error 3] The system cannot find the path specified: 'c:\users\task_1556689150\appdata\local\temp\tmpckztyn\.'
06:24:07 ERROR - Return code: 1
06:24:07 WARNING - setting return code to 1
06:24:07 INFO - copying raptor results to upload dir:
06:24:07 INFO - C:\Users\task_1556689150\build\blobber_upload_dir\perfherder-data.json
06:24:07 INFO - copying raptor results from C:\Users\task_1556689150\build\raptor.json to C:\Users\task_1556689150\build\blobber_upload_dir\perfherder-data.json
06:24:07 CRITICAL - Error copying results C:\Users\task_1556689150\build\raptor.json to upload dir C:\Users\task_1556689150\build\blobber_upload_dir\perfherder-data.json
06:24:07 INFO - [Errno 2] No such file or directory: u'C:\Users\task_1556689150\build\raptor.json'
06:24:07 INFO - Running post-action listener: _package_coverage_data
06:24:07 INFO - Running post-action listener: _resource_record_post_action
06:24:07 INFO - Running post-action listener: process_java_coverage_data
06:24:07 INFO - Running post-action listener: stop_device
06:24:07 INFO - [mozharness: 2019-05-01 06:24:07.564000Z] Finished run-tests step (success)

Also filed bug 1548218 that might be a duplicate of this one.

(In reply to Andreea Pavel [:apavel] from comment #3)

There is also https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=243839389&repo=mozilla-central&lineNumber=809 that seems to be the same issue here?

looks like it. the difference is just the difference in paths on linux and windows.

Also filed bug 1548218 that might be a duplicate of this one.

Yes, I believe it is.

:tarek, has something changed in mozproxy that you are aware of, that would cause this to start? Thanks!

Flags: needinfo?(rwood) → needinfo?(tarek)

Why did no-one from the sheriffs run a back-fill on mozilla-central? For a perma failure this should happen by default. So I did so now:

https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&searchStr=rap-prof%2Ctp6-8&fromchange=e8766f96041a&tochange=90234f4c094d

I believe this is not related to Raptor or mitmproxy at all, but the gecko profiler changes as listed in this merge:

https://hg.mozilla.org/mozilla-central/pushloghtml?changeset=ee0dd3b092d03d28da65f5ee686db942b9ce8ec0

Flags: needinfo?(hskupin)

Oh, those particular builds were actually broken, so that no Raptor job could run anyway. Maybe someone from the sheriffs could do the backfill on autoland instead?

Flags: needinfo?(tarek) → needinfo?(aryx.bugmail)
Flags: needinfo?(aryx.bugmail) → needinfo?(rgarbas)
Regressed by: 1533330
Summary: Perma raptor OSError: [Errno 2] No such file or directory: '/var/folders/27/t6svqqlx251b9v8k9j0hhls000000w/T/tmpC9Vke3' → Perma raptor-tp6-8 OSError: [Errno 2] No such file or directory: '/var/folders/27/t6svqqlx251b9v8k9j0hhls000000w/T/tmpC9Vke3'

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #7)

Why did no-one from the sheriffs run a back-fill on mozilla-central? For a perma failure this should happen by default. So I did so now:

Actually, I did run backfill on this job, however it stopped to the merge where the jobs got busted, thus I had no conclusion as to where this started specifically.

https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&searchStr=rap-prof%2Ctp6-8&fromchange=e8766f96041a&tochange=90234f4c094d

Filtering just for this test will now show you the busted pushes, thus it might create issues in seeing why the job does not run on them.

I believe this is not related to Raptor or mitmproxy at all, but the gecko profiler changes as listed in this merge:

https://hg.mozilla.org/mozilla-central/pushloghtml?changeset=ee0dd3b092d03d28da65f5ee686db942b9ce8ec0

Also, you can ask on #sherrifs what has been done or not on a bug.

Answers are inline.

Maybe Rok is currently not around. Rail, you reviewed the patch, maybe you have a chance to look at this issue?

Flags: needinfo?(rail)

:whimboo hey, just returned from the short PTO break. I'm looking into this issue as I write this.

Flags: needinfo?(rgarbas)
Flags: needinfo?(rail)

(In reply to Rok Garbas [:garbas] from comment #13)

:whimboo hey, just returned from the short PTO break. I'm looking into this issue as I write this.

Hey Rok, do you have an update on that issue?

Flags: needinfo?(rgarbas)

:whimboo sorry, it has just been slow iteration since change to tooltool.py takes some time (4-5h) to re-run everything. i think i now found the source of the problem, but let me first explain what is happening.

  • there are 2 versions of tooltool.py in tree and there is the github repository where tooltool.py should be copied from
  • to make everything even more "fun" all three tooltool.py scripts were a bit different.
  • with my patch in Bug 1533330 i added possibility to use taskcluster client credentials, but also tried to bring all three tooltool.py scripts on the same page.
  • last 2 days i have been doing different try runs with different combinations of changes and found out that Bug 1540540 was the things that caused it.
  • yesterday evening and today morning i tried to bullet proof the patch from Bug 1540540, but i still don't understand why raptor test fails

my proposal to go forward:

  • i will prepare a patch for tooltool.py (later today) that excludes Bug 1540540 change and send it to try.
  • i will then notify author of Bug 1540540 change that the patch was reverted and will try to work with them to include it
  • reverting Bug 1540540 should not be that problematic revert since it is only optimization (introduces caching when unpacking)
  • there is another bug opened (Bug 1436612) which would address the root of the problem for this issues. maybe it would be good to include it in someones okr so that this failures don't happen in the future.
Flags: needinfo?(rgarbas)

I suspect this may be a duplicate of bug 1548574.

Pushed by rgarbas@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/eab176d7b755
fixing test for raptor-tp6-8 r=rail
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla69
Assignee: nobody → rgarbas
Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: