Intermittent-infra rm: cannot remove `mozharness/mozharness/lib/__init__.py': Permission denied

RESOLVED DUPLICATE of bug 1311442

Status

Taskcluster
General
2 years ago
2 years ago

People

(Reporter: Treeherder Bug Filer, Assigned: gbrown)

Tracking

({intermittent-failure})

Details

Comment 1

2 years ago
7 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* autoland: 7

Platform breakdown:
* android-4-3-armv7-api15: 4
* linux64: 3

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1304930&startday=2016-09-19&endday=2016-09-25&tree=all

Comment 2

2 years ago
26 automation job failures were associated with this bug yesterday.

Repository breakdown:
* autoland: 26

Platform breakdown:
* android-4-3-armv7-api15: 25
* linux64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1304930&startday=2016-09-28&endday=2016-09-28&tree=all

Comment 3

2 years ago
32 automation job failures were associated with this bug yesterday.

Repository breakdown:
* autoland: 29
* mozilla-inbound: 3

Platform breakdown:
* android-4-3-armv7-api15: 28
* linux64: 3
* android-api-15-gradle: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1304930&startday=2016-09-29&endday=2016-09-29&tree=all

Comment 4

2 years ago
128 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* autoland: 125
* mozilla-inbound: 3

Platform breakdown:
* android-4-3-armv7-api15: 120
* linux64: 5
* android-api-15-gradle: 2
* android-4-2-x86: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1304930&startday=2016-09-26&endday=2016-10-02&tree=all

Comment 5

2 years ago
34 automation job failures were associated with this bug yesterday.

Repository breakdown:
* autoland: 34

Platform breakdown:
* android-4-3-armv7-api15: 30
* android-4-2-x86: 2
* linux64: 1
* android-api-15-gradle: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1304930&startday=2016-10-08&endday=2016-10-08&tree=all

Comment 6

2 years ago
48 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* autoland: 48

Platform breakdown:
* android-4-3-armv7-api15: 42
* android-api-15-gradle: 3
* android-4-2-x86: 2
* linux64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1304930&startday=2016-10-03&endday=2016-10-09&tree=all
Duplicate of this bug: 1310343

Comment 8

2 years ago
28 automation job failures were associated with this bug yesterday.

Repository breakdown:
* autoland: 28

Platform breakdown:
* android-4-3-armv7-api15: 25
* linux64: 2
* android-api-15-gradle: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1304930&startday=2016-10-14&endday=2016-10-14&tree=all

Comment 9

2 years ago
28 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* autoland: 28

Platform breakdown:
* android-4-3-armv7-api15: 25
* linux64: 2
* android-api-15-gradle: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1304930&startday=2016-10-10&endday=2016-10-16&tree=all

Comment 10

2 years ago
16 automation job failures were associated with this bug yesterday.

Repository breakdown:
* autoland: 16

Platform breakdown:
* android-4-3-armv7-api15: 12
* android-api-15-gradle: 2
* android-4-2-x86: 2

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1304930&startday=2016-10-17&endday=2016-10-17&tree=all
[task 2016-09-23T02:08:28.548058Z] + '[' https://queue.taskcluster.net/v1/task/VRy74JC7QhW8FGw0aJ5J1A/artifacts/public/build/mozharness.zip ']'
[task 2016-09-23T02:08:28.548152Z] + curl --fail -o mozharness.zip --retry 10 -L https://queue.taskcluster.net/v1/task/VRy74JC7QhW8FGw0aJ5J1A/artifacts/public/build/mozharness.zip
[task 2016-09-23T02:08:28.636540Z]   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
[task 2016-09-23T02:08:28.636623Z]                                  Dload  Upload   Total   Spent    Left  Speed
[task 2016-09-23T02:08:28.636967Z] 
[task 2016-09-23T02:08:29.129488Z]   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
[task 2016-09-23T02:08:29.129558Z] 100    29  100    29    0     0     50      0 --:--:-- --:--:-- --:--:--    58
[task 2016-09-23T02:08:29.129634Z] 100    29  100    29    0     0     50      0 --:--:-- --:--:-- --:--:--    58
[task 2016-09-23T02:08:29.462327Z] 
[task 2016-09-23T02:08:29.462389Z] 100   241  100   241    0     0    265      0 --:--:-- --:--:-- --:--:--   265
[task 2016-09-23T02:08:29.556805Z] 
[task 2016-09-23T02:08:29.556915Z] 100  646k  100  646k    0     0   644k      0  0:00:01  0:00:01 --:--:--  644k
[task 2016-09-23T02:08:29.558326Z] + rm -rf mozharness
[task 2016-09-23T02:08:29.560132Z] rm: cannot remove `mozharness/mozharness/lib/__init__.py': Permission denied
[task 2016-09-23T02:08:29.560276Z] rm: cannot remove `mozharness/mozharness/lib/python/__init__.py': Permission denied
[task 2016-09-23T02:08:29.560356Z] rm: cannot remove `mozharness/mozharness/lib/python/authentication.pyc': Permission denied
[task 2016-09-23T02:08:29.560457Z] rm: cannot remove `mozharness/mozharness/lib/python/authentication.py': Permission denied
[task 2016-09-23T02:08:29.560520Z] rm: cannot remove `mozharness/mozharness/lib/python/__init__.pyc': Permission denied
[task 2016-09-23T02:08:29.560612Z] rm: cannot remove `mozharness/mozharness/lib/__init__.pyc': Permission denied
[task 2016-09-23T02:08:29.560711Z] rm: cannot remove `mozharness/mozharness/__init__.py': Permission denied
[task 2016-09-23T02:08:29.560773Z] rm: cannot remove `mozharness/mozharness/base/__init__.py': Permission denied
[task 2016-09-23T02:08:29.560887Z] rm: cannot remove `mozharness/mozharness/base/log.py': Permission denied
[task 2016-09-23T02:08:29.560949Z] rm: cannot remove `mozharness/mozharness/base/signing.py': Permission denied
[task 2016-09-23T02:08:29.561004Z] rm: cannot remove `mozharness/mozharness/base/transfer.py': Permission denied
[task 2016-09-23T02:08:29.561133Z] rm: cannot remove `mozharness/mozharness/base/log.pyc': Permission denied
...
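For triage, here is a minimal Python sketch of how one might check which directories under the checkout would block the `rm -rf`. On POSIX, deleting a file needs write and search permission on its parent directory, not on the file itself, so the directories are the thing to inspect. The helper name `suspect_dirs` and the owner-bits heuristic are assumptions made for illustration; nothing here comes from the task image or from mozharness, and the heuristic ignores group/other bits, ACLs, and the case where the task runs as root.

```python
import os
import stat


def suspect_dirs(root):
    """Return (path, owner_uid, mode_string) for directories that may block deletion.

    Heuristic (an assumption, not the kernel's full permission check): a
    directory is suspect if it is owned by someone other than the current
    effective user, or if its owner write or search bit is cleared.
    """
    me = os.geteuid()
    suspects = []
    for dirpath, _dirnames, _filenames in os.walk(root):
        st = os.stat(dirpath)
        owner_can_modify = (st.st_mode & stat.S_IWUSR) and (st.st_mode & stat.S_IXUSR)
        if st.st_uid != me or not owner_can_modify:
            suspects.append((dirpath, st.st_uid, stat.filemode(st.st_mode)))
    return suspects
```

Calling `suspect_dirs("mozharness")` from the task's working directory just before the `rm -rf mozharness` step would show whether ownership or mode bits are what differs on the affected workers. The log above does not say which of the two it is.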
See Also: → bug 1303543

Comment 12

2 years ago
26 automation job failures were associated with this bug yesterday.

Repository breakdown:
* autoland: 26

Platform breakdown:
* android-4-3-armv7-api15: 25
* linux64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1304930&startday=2016-10-18&endday=2016-10-18&tree=all
Why is this a problem primarily on Android rather than Linux? Is it possible that when a previous run ends in an error state, it leaves file or directory permission issues behind? That theory works well for our old environments, but we are now running inside Docker. I believe we start a fresh Docker instance for each job, so nothing should carry over between jobs?

Looking at a few data points from https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1304930&startday=2016-10-04&endday=2016-10-18&tree=all, I see:
mysql> select date,result,slave,buildtype,testtype,platform,revision from testjobs where slave='i-058b7b4977d0ea163';
+---------------------+------------+---------------------+-----------+----------------------------------+-------------------------+--------------+
| date                | result     | slave               | buildtype | testtype                         | platform                | revision     |
+---------------------+------------+---------------------+-----------+----------------------------------+-------------------------+--------------+
| 2016-10-17 21:43:11 | success    | i-058b7b4977d0ea163 | opt       | marionette-harness               | linux64                 | c6aba573441e |
| 2016-10-17 22:01:51 | testfailed | i-058b7b4977d0ea163 | debug     | web-platform-tests-e10s-8        | linux64                 | 2c81db21bb49 |
| 2016-10-17 22:08:50 | testfailed | i-058b7b4977d0ea163 | debug     | web-platform-tests-e10s-1        | linux64                 | 9fa614d8310d |
| 2016-10-17 22:13:20 | testfailed | i-058b7b4977d0ea163 | opt       | gtest                            | linux64                 | 7cd613179479 |
| 2016-10-17 22:39:36 | success    | i-058b7b4977d0ea163 | opt       | xpcshell-1                       | android-4-2-x86         | 47be3ae8a710 |
| 2016-10-17 22:11:44 | success    | i-058b7b4977d0ea163 | debug     | jsreftest-14                     | android-4-3-armv7-api15 | c22f0df11dd8 |
| 2016-10-17 18:08:45 | success    | i-058b7b4977d0ea163 | debug     | web-platform-tests-reftests-e10s | linux64                 | e71ee6ee03c3 |
| 2016-10-17 21:18:25 | success    | i-058b7b4977d0ea163 | opt       | mochitest-19                     | android-4-3-armv7-api15 | 6c3893afbb49 |
+---------------------+------------+---------------------+-----------+----------------------------------+-------------------------+--------------+
8 rows in set (4 min 41.81 sec)

and another one:
mysql> select date,result,slave,buildtype,testtype,platform,revision from testjobs where slave='i-076cf9bae1963aa54';
+---------------------+------------+---------------------+-----------+-----------------------------+-------------------------+--------------+
| date                | result     | slave               | buildtype | testtype                    | platform                | revision     |
+---------------------+------------+---------------------+-----------+-----------------------------+-------------------------+--------------+
| 2016-10-18 23:08:17 | testfailed | i-076cf9bae1963aa54 | debug     | web-platform-tests-reftests | linux64                 | d74a9133f9a5 |
| 2016-10-18 20:32:44 | testfailed | i-076cf9bae1963aa54 | opt       | mochitest-9                 | android-4-3-armv7-api15 | b8fcce612cfb |
| 2016-10-18 20:12:30 | success    | i-076cf9bae1963aa54 | opt       | marionette-harness          | linux64                 | debc3dfbc36a |
| 2016-10-18 23:04:50 | testfailed | i-076cf9bae1963aa54 | debug     | reftest-7                   | android-4-3-armv7-api15 | 5cd0ba0180c3 |
| 2016-10-18 23:33:00 | testfailed | i-076cf9bae1963aa54 | asan      | gtest                       | linux64                 | 41d7a4864f33 |
| 2016-10-18 19:46:26 | testfailed | i-076cf9bae1963aa54 | debug     | reftest-46                  | android-4-3-armv7-api15 | ef7b401149f2 |
| 2016-10-18 17:27:49 | success    | i-076cf9bae1963aa54 | debug     | reftest-26                  | android-4-3-armv7-api15 | 25e25969288c |
| 2016-10-18 19:46:26 | testfailed | i-076cf9bae1963aa54 | debug     | crashtest-8                 | android-4-3-armv7-api15 | ef7b401149f2 |
| 2016-10-18 18:20:41 | success    | i-076cf9bae1963aa54 | opt       | reftest-7                   | android-4-3-armv7-api15 | 545b89141c83 |
| 2016-10-18 21:14:56 | success    | i-076cf9bae1963aa54 | debug     | mochitest-2                 | android-4-3-armv7-api15 | 4abaf0bda1fe |
| 2016-10-18 19:26:51 | testfailed | i-076cf9bae1963aa54 | pgo       | web-platform-tests-5        | linux64                 | f62a24bd53b5 |
| 2016-10-18 20:21:47 | testfailed | i-076cf9bae1963aa54 | debug     | web-platform-tests-e10s-3   | linux64                 | 3ce14ff1acd5 |
+---------------------+------------+---------------------+-----------+-----------------------------+-------------------------+--------------+

I don't think this proves we have common failures, but it does hint that our failures may be related. Possibly there is an exit condition that causes a job to exit uncleanly, then the next job hits the error condition outlined in this bug, and hitting it cleans things up properly.
This seems to be autoland-specific: when we run a job on Android on autoland, it fails with the same pattern, but the same machine running an inbound job passes right away:
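To make the per-machine reading less eyeball-driven, here is a small Python sketch that tallies results by platform from mysql-style table output like the tables above. The function name `tally_results` is made up for illustration, and `SAMPLE` holds only the first four rows of the second table, so the counts it gives are for that excerpt and not for the whole worker history.

```python
from collections import Counter

# First four rows of the i-076cf9bae1963aa54 table, copied from the thread.
SAMPLE = """
| date                | result     | slave               | buildtype | testtype                    | platform                | revision     |
+---------------------+------------+---------------------+-----------+-----------------------------+-------------------------+--------------+
| 2016-10-18 23:08:17 | testfailed | i-076cf9bae1963aa54 | debug     | web-platform-tests-reftests | linux64                 | d74a9133f9a5 |
| 2016-10-18 20:32:44 | testfailed | i-076cf9bae1963aa54 | opt       | mochitest-9                 | android-4-3-armv7-api15 | b8fcce612cfb |
| 2016-10-18 20:12:30 | success    | i-076cf9bae1963aa54 | opt       | marionette-harness          | linux64                 | debc3dfbc36a |
| 2016-10-18 23:04:50 | testfailed | i-076cf9bae1963aa54 | debug     | reftest-7                   | android-4-3-armv7-api15 | 5cd0ba0180c3 |
"""


def tally_results(table_text):
    """Count (platform, result) pairs in pipe-delimited mysql client output."""
    counts = Counter()
    header = None
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip blank lines and +---+ separator rows
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first pipe-delimited row names the columns
            continue
        row = dict(zip(header, cells))
        counts[(row["platform"], row["result"])] += 1
    return counts
```

Sorting the rows by date before tallying would also show whether a failure tends to follow another failure on the same worker, which is the part of the theory the raw counts cannot confirm.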
https://pastebin.mozilla.org/8920140
I'm not surprised that the same machine didn't have a problem on a different branch because the caches are separated between branches so autoland caches are not used for inbound.  What is surprising is how much it's skewed towards autoland, specifically android on autoland.
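The skew can be put into numbers from the weekly OrangeFactor summaries earlier in this bug. This is a rough Python sketch, not an official metric: the name `platform_share` and the choice to lump every platform starting with "android" together are assumptions for illustration. The figures are copied from the "last 7 days" comments.

```python
# Weekly platform breakdowns, copied from the "last 7 days" comments above.
WEEKLY = {
    "2016-09-19..2016-09-25": {"android-4-3-armv7-api15": 4, "linux64": 3},
    "2016-09-26..2016-10-02": {
        "android-4-3-armv7-api15": 120,
        "linux64": 5,
        "android-api-15-gradle": 2,
        "android-4-2-x86": 1,
    },
    "2016-10-03..2016-10-09": {
        "android-4-3-armv7-api15": 42,
        "android-api-15-gradle": 3,
        "android-4-2-x86": 2,
        "linux64": 1,
    },
    "2016-10-10..2016-10-16": {
        "android-4-3-armv7-api15": 25,
        "linux64": 2,
        "android-api-15-gradle": 1,
    },
}


def platform_share(breakdown, prefix):
    """Fraction of failures in one breakdown whose platform name starts with prefix."""
    total = sum(breakdown.values())
    matching = sum(n for name, n in breakdown.items() if name.startswith(prefix))
    return matching / total


if __name__ == "__main__":
    for week, breakdown in WEEKLY.items():
        print(week, f"android share: {platform_share(breakdown, 'android'):.0%}")
```

These shares say nothing about how many Android versus Linux jobs ran in total on autoland, so they describe where the failures landed and not a per-job failure rate.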
Assignee: nobody → gbrown
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1311442

Comment 17

2 years ago
70 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* autoland: 70

Platform breakdown:
* android-4-3-armv7-api15: 61
* android-api-15-gradle: 4
* linux64: 3
* android-4-2-x86: 2

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1304930&startday=2016-10-17&endday=2016-10-23&tree=all