Closed Bug 1669862 Opened 4 years ago Closed 4 years ago

Build errors for pushing patches when forgetting to `hg add` or `git add` new files are mystifying

Categories

(Firefox Build System :: General, defect, P3)

defect

Tracking

(firefox84 fixed)

RESOLVED FIXED
84 Branch
Tracking Status
firefox84 --- fixed

People

(Reporter: Gijs, Assigned: mhentges)

Details

(Keywords: in-triage)

Attachments

(1 file)

package-tests> Traceback (most recent call last):

package-tests> make[2]: *** [Makefile:119: install-test-files] Error 1

make[1]: *** [/builds/worker/checkouts/gecko/build/moz-automation.mk:110: automation/package-tests] Error 2

make[1]: *** Waiting for unfinished jobs....

make: *** [client.mk:129: build] Error 2

Return code: 2

'mach build -v' did not run successfully. Please check log for err

This (from https://treeherder.mozilla.org/#/jobs?repo=try&selectedTaskRun=PvJ6H6YWQcOWKurrVOwLAw.0&revision=16067f157aed40bb27aef0e40c96010045d6ed3f ) is kinda useless.

The thing you actually want is not highlighted in the log viewer, and not extracted into the Failure Summary on treeherder

[task 2020-10-07T22:26:02.666Z] 22:26:02 INFO - package-tests> mozpack.errors.ErrorMessage: Symlink target path does not exist: /builds/worker/checkouts/gecko/toolkit/components/pictureinpicture/tests/test-media-stream.html

Is this a bug/feature request for Treeherder/Taskcluster/whatever? I'm not aware of anything we can do on the build system side to directly affect whether any log output gets highlighted in the log viewer or the failure summary on Treeherder.

(In reply to Ricky Stewart from comment #1)

Is this a bug/feature request for Treeherder/Taskcluster/whatever? I'm not aware of anything we can do on the build system side to directly affect whether any log output gets highlighted in the log viewer or the failure summary on Treeherder.

It used to be that treeherder / log processing basically cued off magic strings ("TEST-UNEXPECTED-FAIL" or similar) to decide what got displayed, so fixing this would be as easy as making sure we use those and/or the right log level for the relevant error.

I don't know what the current system is like and which actors have which level of control here. Hopefully :ahal knows more.

I think treeherder is important here, though even for local build system logging it'd be nice if this error was actually at/near the bottom of log output.

Flags: needinfo?(ahal)

OK, so having looked at the log, the issue is that the FIRST line of the exception output is written at a different log level than the rest of the trace:

[task 2020-10-07T22:26:49.198Z] 22:26:49    ERROR -  package-tests> Traceback (most recent call last):
[task 2020-10-07T22:26:49.198Z] 22:26:49     INFO -  package-tests>   File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
[task 2020-10-07T22:26:49.198Z] 22:26:49     INFO -  package-tests>     "__main__", mod_spec)
[task 2020-10-07T22:26:49.198Z] 22:26:49     INFO -  package-tests>   File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
[task 2020-10-07T22:26:49.198Z] 22:26:49     INFO -  package-tests>     exec(code, run_globals)
[task 2020-10-07T22:26:49.198Z] 22:26:49     INFO -  package-tests>   File "/builds/worker/checkouts/gecko/python/mozbuild/mozbuild/action/process_install_manifest.py", line 115, in <module>
[task 2020-10-07T22:26:49.199Z] 22:26:49     INFO -  package-tests>     log_build_task(main, sys.argv[1:])
[task 2020-10-07T22:26:49.199Z] 22:26:49     INFO -  package-tests>   File "/builds/worker/checkouts/gecko/python/mozbuild/mozbuild/action/util.py", line 18, in log_build_task
[task 2020-10-07T22:26:49.199Z] 22:26:49     INFO -  package-tests>     return f(*args, **kwargs)
[task 2020-10-07T22:26:49.199Z] 22:26:49     INFO -  package-tests>   File "/builds/worker/checkouts/gecko/python/mozbuild/mozbuild/action/process_install_manifest.py", line 101, in main
[task 2020-10-07T22:26:49.199Z] 22:26:49     INFO -  package-tests>     defines=args.defines)
[task 2020-10-07T22:26:49.199Z] 22:26:49     INFO -  package-tests>   File "/builds/worker/checkouts/gecko/python/mozbuild/mozbuild/action/process_install_manifest.py", line 70, in process_manifest
[task 2020-10-07T22:26:49.200Z] 22:26:49     INFO -  package-tests>     remove_empty_directories=remove_empty_directories)
[task 2020-10-07T22:26:49.200Z] 22:26:49     INFO -  package-tests>   File "/builds/worker/checkouts/gecko/python/mozbuild/mozpack/copier.py", line 424, in copy
[task 2020-10-07T22:26:49.200Z] 22:26:49     INFO -  package-tests>     copy_results.append((destfile, f.copy(destfile, skip_if_older)))
[task 2020-10-07T22:26:49.200Z] 22:26:49     INFO -  package-tests>   File "/builds/worker/checkouts/gecko/python/mozbuild/mozpack/files.py", line 391, in copy
[task 2020-10-07T22:26:49.200Z] 22:26:49     INFO -  package-tests>     raise ErrorMessage('Symlink target path does not exist: %s' % self.path)
[task 2020-10-07T22:26:49.200Z] 22:26:49     INFO -  package-tests> mozpack.errors.ErrorMessage: Symlink target path does not exist: /builds/worker/checkouts/gecko/toolkit/components/pictureinpicture/tests/test-media-stream.html
[task 2020-10-07T22:26:49.200Z] 22:26:49     INFO -  package-tests> Makefile:119: recipe for target 'install-test-files' failed

... so the Traceback (most recent call last): line gets captured in the failure summary on Treeherder, but unfortunately it's the least informative/helpful line. :( I'm not sure whether this is within the purview of the build system, but we can double-check.

I think treeherder is important here, though even for local build system logging it'd be nice if this error was actually at/near the bottom of log output.

make is in control of this, so it's easier said than done. We'd have to parse the make output (similar to what Treeherder does!) in the build system code and then re-print the "relevant" output again after the process ends.

Severity: -- → S3
Keywords: in-triage
Priority: -- → P3
Assignee: nobody → mhentges
Status: NEW → ASSIGNED

Thanks for the report :gijs, I'm going to see if it's possible for me to repro this locally so I can confirm how/why the logging severity level gets changed. Ricky hinted that there could be some mozharness trickery involved making this more difficult to see locally, but I'm going to roll those dice. I might bug you for some tips :)

Add mozpack.errors.ErrorMessage as something for testing/mozharness/mozharness/base/errors.py to catch. We may want to hide the traceback entirely there too.

I'll trust your judgement and "hail mary" an addition to errors.py (try push) :)
It's not clear to me yet if errors.py has a way to configure whether or not a traceback is shown - :glandium, is there an example exception that already has its traceback hidden?

The traceback would have to be hidden in the package script itself. IOW, process_install_manifest.py should catch the exception and print it nicelier. I'd suggest doing that for mozpack.errors.ErrorMessage exceptions only.
An alternative/extension might be to use mozpack.errors.accumulate., like action.zip does. That changes how the errors are output, but I /think/ errors.py catches those (but don't take my word for it).

Sounds like glandium and mhentges have a handle on this, thanks guys!

Flags: needinfo?(ahal)

I've currently got some other build issues buzzing around today, but this is definitely close to the front of my queue :)

I've got two adjustments sent to try here:

  1. errors.accumulate() without any base/errors.py doctoring: https://treeherder.mozilla.org/#/jobs?repo=try&revision=7adcad834a6485730649736f4b75082a7250fe16
  2. Explicitly catch ErrorMessage, print the message in the error format (without traceback), then exit 1: https://treeherder.mozilla.org/#/jobs?repo=try&revision=84c6d98edf2c7bb948a196baf5dae4076dfcc165

An alternative/extension might be to use mozpack.errors.accumulate., like action.zip does. That changes how the errors are output, but I /think/ errors.py catches those (but don't take my word for it).

Good idea, though I've discovered that mozpack directly does raise ErrorMessage('message'), not errors.error('message'). Since errors.accumulate() works by catching calls to errors.[fatal|error|warn](), it won't quite fit here unless more changes are made, and I don't know who all is using mozpack, so I want to avoid breaking changes.

The traceback would have to be hidden in the package script itself. IOW, process_install_manifest.py should catch the exception and print it nicelier. I'd suggest doing that for mozpack.errors.ErrorMessage exceptions only.

I'm going to roll with this technique, catching ErrorMessage, printing a specially-formed message and having errors.py consider that message an ERROR.

Thanks for the tips in this ticket! Without that pointer to errors.py, this would've been a lot harder to solve :)

EDIT: here's a successful try push with the error shown helpfully.

process_install_manifest now only prints the message (not the stack
trace) of ErrorMessage exceptions, and the error config has been updated
to consider such messages to have the "ERROR" severity.

(In reply to Mitchell Hentges [:mhentges] 🦀 from comment #11)

An alternative/extension might be to use mozpack.errors.accumulate., like action.zip does. That changes how the errors are output, but I /think/ errors.py catches those (but don't take my word for it).

Good idea, though I've discovered that mozpack directly does raise ErrorMessage('message'), not errors.error('message'). Since errors.accumulate() works by catching calls to errors.[fatal|error|warn](), it won't quite fit here unless more changes are made, and I don't know who all is using mozpack, so I want to avoid breaking changes.

Actually, those raise ErrorMessage shouldn't be there, they should be errors.error, but that wasn't caught on review.

Or errors.fatal, rather. As for breaking things, without errors.accumulate, errors.error or errors.fatal will just do a raise ErrorMessage of their own.

As for breaking things, without errors.accumulate, errors.error or errors.fatal will just do a raise ErrorMessage of their own.

Depending on the setting of errors._level for errors.error(), yeah.
Ok, I'll adjust from raise ErrorMessage to errors.fatal() 👍

:glandium, base/errors.py isn't recognizing errors.fatal(...) messages as having the ERROR severity, as is shown on this log line.
I can add a severity filter for the string "Error: ", but I'm worried that that will be too general? What do you think?

Flags: needinfo?(mh+mozilla)

(In reply to Mitchell Hentges [:mhentges] 🦀 from comment #16)

:glandium, base/errors.py isn't recognizing errors.fatal(...) messages as having the ERROR severity, as is shown on this log line.
I can add a severity filter for the string "Error: ", but I'm worried that that will be too general? What do you think?

Match on package.*> Error: ?

Flags: needinfo?(mh+mozilla)
Pushed by mhentges@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6487d7225949
Errors in process_install_manifest show nicely on TH r=firefox-build-system-reviewers,rstewart
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 84 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: