Closed Bug 896718 Opened 9 years ago Closed 9 years ago

Mac 10.6 & 10.7 test-runners have multiple instances of "unable to execute llvm-gcc-4.2: No such file or directory" / "error: command 'llvm-gcc-4.2' failed with exit status 1"

Categories

(Release Engineering :: General, defect)

All
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dholbert, Assigned: emorley)

Attachments

(1 file)

Mac debug test logs (e.g. mochitest and JS reftest logs, at least) have multiple instances of...
> 14:21:35     INFO -  unable to execute llvm-gcc-4.2: No such file or directory
> 14:21:35     INFO -  error: command 'llvm-gcc-4.2' failed with exit status 1
...which don't seem to turn the run orange, but which do get highlighted by the log-highlighter (and hence confuse things when the log is orange for other reasons).

In any case, we probably shouldn't be invoking executables that don't exist.

I'm seeing this in opt & debug logs, for OS X 10.6 and 10.7, on at least mochitest-[1-5] runs. (Doesn't seem to affect OS X 10.8, though.)

Sample logs:
 https://tbpl.mozilla.org/php/getParsedLog.php?id=25576506&tree=Mozilla-Central
 https://tbpl.mozilla.org/php/getParsedLog.php?id=25575411&tree=Mozilla-Central
 https://tbpl.mozilla.org/php/getParsedLog.php?id=25575637&tree=Mozilla-Central
OS: Linux → Mac OS X
Hardware: x86_64 → All
Summary: Mac test-runners have multiple instances of "unable to execute llvm-gcc-4.2: No such file or directory" / "error: command 'llvm-gcc-4.2' failed with exit status 1" → Mac 10.6 & 10.7 test-runners have multiple instances of "unable to execute llvm-gcc-4.2: No such file or directory" / "error: command 'llvm-gcc-4.2' failed with exit status 1"
I don't think that we have any compiler installed in the test machines...
We don't, nor should we.

This is the issue of psutil failing to install/build... cc: aki and gps due to that
found in triage.

(In reply to Justin Wood (:Callek) from comment #2)
> We don't, nor should we.
> 
> This is the issue of psutil failing to install/build... cc: aki and gps due
> to that

aki: thanks.

gps: how to fix this so developers are not impacted? In case this is related to psutil, note that we explicitly+intentionally do not have build/compiler tools on our test machines.
Afaik, short term we can make mozharness eat the errors (not log them or output them to the terminal), or turn off resource monitoring globally. Longer term we can either get specific configs per-platform per-jobtype that turn on/off resource monitoring as appropriate, and/or get psutil installing successfully everywhere.
When resource monitoring landed, I explicitly asked a bunch of people (RelEng + Sheriffs) if the non-tbpl-run-turning errors in the logs were acceptable until psutil is installed globally (bug 894950 and bug 893254 track that) and the consensus was "yes." So, resource monitoring landed despite it introducing errors in the logs.

I concede the errors are annoying. I would like to see them go away.

I would prefer they go away by installing psutil everywhere.

I don't want us to back out resource monitoring because we're actively using data it is providing. For example, bug 877054 is attempting to find the optimal parallel execution count of xpcshell tests for minimal wall execution time. Bug 895225 was filed to investigate why xpcshell tests are performing a lot more write I/O than most of us expected and appears to indicate xpcshell tests are I/O and not CPU bound, a complete surprise to many!

I don't want to create a fire drill for RelEng to install psutil everywhere. That being said, bug updates make it sound like work is being done on this front. If that's the case and it will be ready soon, then these errors will magically go away and this bug can be marked as a dupe. If the ETA isn't soon enough, then I suppose we can have mozharness swallow the error.

If we need to modify mozharness, I can do this since it was me who introduced the issue. But, I'd like to make sure I'm not implementing a workaround that will only be needed for a few days first.
Depends on: 859573
Flags: needinfo?(gps)
Depends on: 893254
Ok, luckily it seems like tbpl is only picking up the 

    14:11:04    ERROR - Return code: 1

line as the error we want to ignore.

We can use the run_command() success_codes to specify which exit codes we deem a success:
http://hg.mozilla.org/build/mozharness/file/a5daac81696b/mozharness/base/script.py#l536
http://hg.mozilla.org/build/mozharness/file/a5daac81696b/mozharness/base/script.py#l645

Since we already have an 'optional' bool, we can pass a success_codes of [0, 1] if optional here:
http://hg.mozilla.org/build/mozharness/file/a5daac81696b/mozharness/base/python.py#l240

That should turn the above log line into

    14:11:04    INFO - Return code: 1
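
For illustration, the idea above can be sketched as standalone Python. This is not the mozharness code: `run_command` and `install_module` here are simplified stand-ins written for this example, and only the `success_codes` / `optional` names are borrowed from the discussion. The sketch assumes the return code is logged at INFO when it is in `success_codes` and at ERROR otherwise.

```python
import logging
import subprocess
import sys

log = logging.getLogger("success_codes_sketch")


def run_command(command, success_codes=None):
    """Hypothetical stand-in: run a command and log its return code.

    Returns (returncode, level), where level is the logging level
    that was used for the "Return code" line.
    """
    if success_codes is None:
        success_codes = [0]
    returncode = subprocess.call(command)
    # A code listed in success_codes is reported at INFO, anything else at ERROR.
    level = logging.INFO if returncode in success_codes else logging.ERROR
    log.log(level, "Return code: %d", returncode)
    return returncode, level


def install_module(command, optional=False):
    """Hypothetical stand-in: tolerate exit status 1 for optional packages."""
    success_codes = [0, 1] if optional else [0]
    return run_command(command, success_codes=success_codes)


# A command that always exits with status 1, standing in for a failed build.
FAILING_COMMAND = [sys.executable, "-c", "import sys; sys.exit(1)"]

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    install_module(FAILING_COMMAND, optional=True)   # logged at INFO
    install_module(FAILING_COMMAND, optional=False)  # logged at ERROR
```

With `optional=True`, the failing command's exit status 1 is treated as a success, so a log parser that keys on the ERROR level would no longer flag the line.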


(If we had been picking up any other errors, we might have had to create a special error_list with a substr or regex to IGNORE, like this:
http://hg.mozilla.org/build/mozharness/file/a5daac81696b/mozharness/base/signing.py#l28
Luckily, we don't have to here.)
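
A rough sketch of what such an ignore rule could look like, for reference. The `IGNORE` sentinel, the rule dictionaries, and `classify_line` are all made up for this example and only loosely modelled on the substr/regex idea mentioned above; they are not copied from mozharness.

```python
import logging
import re

# Hypothetical sentinel meaning "do not flag this line at all".
IGNORE = "ignore"

# Rules are checked in order; the first match decides the level.
ERROR_LIST = [
    {"substr": "unable to execute llvm-gcc-4.2", "level": IGNORE},
    {"regex": re.compile(r"command '.+' failed with exit status \d+"),
     "level": IGNORE},
    {"regex": re.compile(r"\bERROR\b"), "level": logging.ERROR},
]


def classify_line(line, error_list=ERROR_LIST):
    """Return the level of the first matching rule, or INFO if none match."""
    for rule in error_list:
        if "substr" in rule and rule["substr"] in line:
            return rule["level"]
        if "regex" in rule and rule["regex"].search(line):
            return rule["level"]
    return logging.INFO


if __name__ == "__main__":
    for sample in (
        "unable to execute llvm-gcc-4.2: No such file or directory",
        "error: command 'llvm-gcc-4.2' failed with exit status 1",
        "14:11:04    ERROR - Return code: 1",
    ):
        print(classify_line(sample), "<-", sample)
```

Because the ignore rules come first, the two llvm-gcc-4.2 lines are skipped even if a later, broader rule would also have matched them.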
Attached patch: Patch v1
Allow a return code of 1 when installing optional packages to prevent false positives in the log.
Attachment #780324 - Flags: review?(aki)
Assignee: nobody → emorley
Status: NEW → ASSIGNED
Attachment #780324 - Flags: review?(aki) → review+
Merged to production :-)
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering