Open Bug 1027665 Opened 5 years ago Updated Last year

Fix interaction between mach and structured logging

Categories

(Firefox Build System :: Mach Core, enhancement)

x86_64
Linux
enhancement
Not set

Tracking

(Not tracked)

People

(Reporter: jgraham, Unassigned)

References

Details

Attachments

(1 file, 2 obsolete files)

At the moment mach and structured logging don't play together terribly well. Mach has the concept of a Logging Manager which seems to basically fill the role of ensuring that logging to stdout is set up for all loggers being used. It has some notion of a "structured logger" but I think that's based on the older design.

With the new design of structured logging, one feature is that you can specify the type of logging you want from the command line. This doesn't interact well with mach enforcing its own handler; anything specified on the command line will be output in addition to the mach-specified default.

The right way to do this seems to be to look for command line logging options and, if there aren't any, use some sane default options. Mach could provide this, or each consumer could set its own defaults (which is actually already part of the structuredlogging module). Possibly mach should enforce that the stdlib adapter is set up, but it really doesn't seem like it needs to do that much else.
Blocks: 94522
Blocks: 945222
No longer blocks: 94522
Blocks: 1036869
Two nice to have features for this mentioned today were to have mach automatically log the raw json to a certain location that could be accessed later by "mach format-html-log" or similar commands to see a test run's report in a different format after the fact.
See Also: → 1050855
Attached patch mach_structured_log.diff (obsolete) — Splinter Review
This seems like a reasonable bug to use for work to convert mach to use structured logging. ahal is interested in picking that up, so attaching the preliminary work I did.
Nice, after a quick look, seems like there is at least the following left to do:

1. Add mozlog commandline integration to global mach arguments
2. Allow individual commands to specify which formatter to use by default (e.g |mach build| has it's own special format)
3. For commands that do support structured logging already (mochitest, xpcshell, etc), make sure we pass in the existing mach logger rather than create a new one.
4. Do some profiling to make sure mozlog isn't making things any slower than logging

After this is landed, there is a lot of cool follow-up work we can do:

1. Save raw log on the side for a "format-log" like command (comment 1)
2. Generalize TerminalHandler/Formatter and move it into mozlog
3. Improve summaries when running multiple test suites at a time via |mach test|
4. Replace all the print statements in mach_commands.py with proper log calls (ok, we could do this now too..)
(In reply to Andrew Halberstadt [:ahal] from comment #3)
> After this is landed, there is a lot of cool follow-up work we can do:
> 
> 1. Save raw log on the side for a "format-log" like command (comment 1)

Saving a raw log is already done by mach, so this would actually need to be done by *this* bug, so that the currently existing feature doesn't break. See bug 985857. At least, the minimum should be done so that bug 985857 doesn't regress.
Depends on: 1186599
Depends on: 1186888
Assignee: nobody → ahalberstadt
Status: NEW → ASSIGNED
Attached patch wip (obsolete) — Splinter Review
Here's a WIP that works with |mach build| and is mostly complete. There's still some cleanup to do:

1. Figure out if those old build actions I converted into 'log' actions actually need to be preserved or not.
2. Fix up |mach show-log|
3. Add valid_formatters to more mach commands (especially testing commands)

I'm currently running full clobber builds with and without this patch to see if mozlog has any performance impact.
Attachment #8631638 - Attachment is obsolete: true
Full clobber without mozlog: 18:17.67
Full clobber with mozlog: 18:43.14

Hard to tell if that's statistically significant with sample size of 1 :p.. I'll do some testing on try later, but for now I'm satisfied that mozlog doesn't make everything horribly slow.
FWIW, on my machine, full clobber build times are within a couple seconds. 26 seconds would indicate a regression.
(In reply to Mike Hommey [:glandium] from comment #7)
> FWIW, on my machine, full clobber build times are within a couple seconds.
> 26 seconds would indicate a regression.

Good to know. I'll definitely do some profiling and more trials on try, I suspect the difference on windows will be bigger too.
Hey gps, so mozlog has some fundamental differences in its approach to structured logging compared to the mach logger. Namely:

1. Formatting is distinct from the log call, consumers don't pass format strings into the logger.
2. In order to have inter-changeable formatters, the actions are 'well defined'. Each action is a property on the structured logger object, and trying to log an 'arbitrary action' like mach does, is an error.

So I'm wondering what to do with the 'action' strings on some of the invokers of mach's logger. E.g 'artifact' in:
https://dxr.mozilla.org/mozilla-central/source/python/mozbuild/mozbuild/artifacts.py#127

I did some dxr'ing and it seems like nothing is consuming any of those strings yet, and removing them doesn't seem to break anything, but they may still have value. I see three solutions:

1. Convert all of those calls into mozlog 'log' actions. This would lose the action string (though we could potentially create an 'extra' field on the 'log' action and put it in there).

2. Create a special 'build_step' action that takes the action string as the first parameter. This would mean a requirement for creating new special 'build' formatters (or updating existing formatters to handle 'build_step').

3. Create a new action for any of the important steps. E.g, logger.artifact(params). I'm not a huge fan of this approach, but it's more similar to how the test actions work.

Any thoughts? FWIW, I've already implement #1 in the patch above, and everything seems to work.
Flags: needinfo?(gps)
AFAICT all consumers of mach's structured logger log via mach.mixin.logging.LoggingMixin.log(). Furthermore, I think you are correct in that nobody actually consumes any of the rich data it is generating. Although, I'm pretty sure we rely on the passed formatting strings to format e.g. build output properly. See the Structured*Formatter classes in mach.logging.

It seems to me that mozlog's logging is completely independent of Python "logging" package. Is that correct? If so, I think the easy solution here is to add a method on mach.logging.LoggingManager that replaces all log handlers with one that writes to mozlog. Then, logging to mach/logging will get proxied to mozlog. Over time, we can replace mach's log producers with whatever the mozlog equivalent is. That being said, mach is used outside of mozilla-central. Ripping out "logging" integration completely is rather invasive and might create a lot of churn downstream. I'm not too concerned about introducing a mozlog dependency, however.

If that doesn't make sense, we may want to chat over Vidyo or something. mach's logging is overly-complicated and is engineered for things that I never had time to implement (such as rich build system log analysis). mozlog's structured logger was invented in the time since mach was made, so it's possible mozlog fits all the requirements.
Flags: needinfo?(gps)
Bug 1027665 - Basic work to make mach/mozbuild use a mozlog structured logger
Posted to mozreview to make looking at the diff a little nicer. It's still very much WIP and I plan to split it out into several different commits later.

(In reply to Gregory Szorc [:gps] from comment #10)
> AFAICT all consumers of mach's structured logger log via
> mach.mixin.logging.LoggingMixin.log(). Furthermore, I think you are correct
> in that nobody actually consumes any of the rich data it is generating.
> Although, I'm pretty sure we rely on the passed formatting strings to format
> e.g. build output properly. See the Structured*Formatter classes in
> mach.logging.

This is correct, at least for in-tree consumers. I "fixed" the formatting strings by just doing the format substitution at the source of the log call and passing in a string. Alternatively, I can do this in the LoggingMixin.log() function to keep the API the same.

> It seems to me that mozlog's logging is completely independent of Python
> "logging" package. Is that correct? If so, I think the easy solution here is
> to add a method on mach.logging.LoggingManager that replaces all log
> handlers with one that writes to mozlog. Then, logging to mach/logging will
> get proxied to mozlog. Over time, we can replace mach's log producers with
> whatever the mozlog equivalent is. That being said, mach is used outside of
> mozilla-central. Ripping out "logging" integration completely is rather
> invasive and might create a lot of churn downstream. I'm not too concerned
> about introducing a mozlog dependency, however.

Yes, it does not use logging at all. The patch above basically already does what you suggest, and yes it does rip out logging completely. As far as out of tree consumers go, I could keep the LoggingMixin.log API identical, so as long as out of tree consumers use that method, we can keep backwards compatibility. I'm not interested in supporting two separate logging systems however.

> If that doesn't make sense, we may want to chat over Vidyo or something.
> mach's logging is overly-complicated and is engineered for things that I
> never had time to implement (such as rich build system log analysis).
> mozlog's structured logger was invented in the time since mach was made, so
> it's possible mozlog fits all the requirements.

With the changeset above, mach uses mozlog exclusively (no logging module) and things like |mach build| output exactly what they used to. I was mostly wondering how important that rich data is for the time being. Mozlog certainly has the capability to log rich data which can be used for build log analysis. It'll just require adding some kind of new 'build_step' action. If you're interested in learning more about mozlog, I recommend perusing:
http://mozbase.readthedocs.org/en/latest/mozlog.html

If nothing is consuming that rich data, the easiest thing for me would be to drop it for now and file a followup to re-implement it via mozlog when we have a use for it. But I can also re-implement it as part of this patch if it's important to keep. So my question is, should I deal with the rich data now? Or file a bug and deal with it later?

If you have questions I'd be happy to vidyo chat, otherwise I don't feel a need as long as you can answer my question above.
Attachment #8638648 - Attachment is obsolete: true
Since there are no consumers of the rich mach logging data today, let's drop support and re-add it later, if necessary. This includes support for a "build_step" action or something. No need to file a follow-up. If someone needs it, they'll file a bug then.

We should keep mach's logging API for backwards compatibility so we don't break downstream consumers. But let's remove uses of LoggingMixin outside of python/mach.
(In reply to Gregory Szorc [:gps] from comment #13)
> Since there are no consumers of the rich mach logging data today

There is (kinda). mach show-log.
AFAICT mach show-log doesn't really consume the semantics of the rich data, it just uses the fact that it is stored as json objects, which will still be true (albeit in a slightly different format). So I agree |mach show-log| needs work, but I'm not sure it's an argument in either direction for adding build-specific semantics to the log files.
Here's a working try run (for |mach build| at least):
https://treeherder.mozilla.org/#/jobs?repo=try&revision=cf36beb4270b

It's difficult to compare runtimes on try though. Still seems a couple dozen seconds slower with the patch on my machine though. Hopefully I can figure out some ways to speed mozlog up.
Here's where I'm at perf wise.

First, there's too many factors and too much variance to get meaningful numbers out of try. Second, builds on my machine (Fedora 22 64bit) are not within a few seconds like comment 7 suggests, I've seen +/- 30s on the same build. My steps between builds are:

  $ mach clobber
  $ ccache -C
  $ mach build

Should I be doing something else? That being said, with this patch applied, builds did seem to be running slower on average by around 30s. So I did some profiling with pprofile [1] and kcachegrind [2]. Turns out there wasn't much in the way of obvious low hanging fruit, just a cumulative slowdown caused by the fact that mozlog is doing more stuff than logging ever did.

Aside from some minor improvements here and there, there were two places I could shave a good chunk of time off. The first was by using ujson [3] instead of stdlib json. This saved about 7s on an 18 minute build. The second was by refactoring the function that draws the TIER footer (this isn't mozlog related, but I found we were sometimes spending up to a minute in there, and got it down to about 30s). With these changes, the mozlog builds now appear to be slightly faster than before (by about 15s). I'll file blocking bugs to land these improvements separately, as they aren't really related to the work in here.

As an aside, I wonder how hard it would be to get mozprocess to use processes instead of threads for handling output. It's kind of a shame the build is impacted by the python logging framework at all.

[1] https://github.com/vpelletier/pprofile
[2] http://kcachegrind.sourceforge.net/html/Home.html
[3] https://github.com/esnme/ultrajson
Depends on: 1193522, 1194415
See Also: → 1338240
Product: Core → Firefox Build System
Assignee: ahal → nobody
Status: ASSIGNED → NEW
You need to log in before you can comment on or make changes to this bug.