Open Bug 1417003 Opened 2 years ago Updated 1 month ago

Mach breaks colors in cargo output on Windows

Categories

(Firefox Build System :: General, defect)

defect
Not set

Tracking

(Not tracked)

People

(Reporter: glandium, Unassigned)

References

Details

Attachments

(3 files)

$ ./mach build toolkit/library/rust/target
 0:01.44 c:\mozilla-build\mozmake\mozmake.EXE -C c:/Users/Administrator/gecko/obj-i686-pc-mingw32 -j16 -s backend
 0:01.88 c:\mozilla-build\mozmake\mozmake.EXE -C toolkit/library/rust -j16 -s target
 0:01.91 force-cargo-library-build
 0:02.46 [m[m[36m[1m    Blocking[m waiting for file lock on build directory

$ mozmake -s -C obj-i686-pc-mingw32/toolkit/library/rust/
force-cargo-library-build
    Blocking waiting for file lock on build directory

(with proper coloring on "Blocking" ; don't worry about the blocking message, that's expected, I had another build running at the same time, which is where I noticed the problem)

FWIW, this is on a Windows 2016 EC2 instance freshly installed with mozilla-build 3.0 (because 3.1 seems to have problems) and MSVC 2017. The terminal is whatever start-server.bat ends up starting, which looks like it's cmd's but 2016 presumably has some advanced one, like Windows 10.
As a matter of fact, I just figured it also happens on my old Windows 2012 VM, which has an older cmd.exe terminal.
(In reply to Mike Hommey [:glandium] from comment #1)
> As a matter of fact, I just figured it also happens on my old Windows 2012
> VM, which has an older cmd.exe terminal.

I think this might be caused by a change in cargo. We run cargo with `--color=always` when mach's output is a terminal, and it looks like cargo has started outputting ANSI escape codes on Windows now. This seems to have changed in the Rust 1.20 release:

$ cargo +1.19.0 build --color=always 2>&1 | od -c
0000000               C   o   m   p   i   l   i   n   g       i   t   o
0000020   a       v   0   .   3   .   4  \n               C   o   m   p
0000040   i   l   i   n   g       s   n   i   p   p   e   t       v   0
0000060   .   1   .   0       (   f   i   l   e   :   /   /   /   C   :
0000100   /   b   u   i   l   d   /   s   n   i   p   p   e   t   )  \n
0000120                   F   i   n   i   s   h   e   d       d   e   v
0000140       [   u   n   o   p   t   i   m   i   z   e   d       +
0000160   d   e   b   u   g   i   n   f   o   ]       t   a   r   g   e
0000200   t   (   s   )       i   n       0   .   5   2       s   e   c
0000220   s  \n
0000222

$ cargo +1.20.0 build --color=always 2>&1 | od -c
0000000 033   [   m 033   [   m 033   [   3   2   m 033   [   1   m
0000020           C   o   m   p   i   l   i   n   g 033   [   m       i
0000040   t   o   a       v   0   .   3   .   4  \n 033   [   m 033   [
0000060   m 033   [   3   2   m 033   [   1   m               C   o   m
0000100   p   i   l   i   n   g 033   [   m       s   n   i   p   p   e
0000120   t       v   0   .   1   .   0       (   f   i   l   e   :   /
0000140   /   /   C   :   /   b   u   i   l   d   /   s   n   i   p   p
0000160   e   t   )  \n 033   [   m 033   [   m 033   [   3   2   m 033
0000200   [   1   m                   F   i   n   i   s   h   e   d 033
0000220   [   m       d   e   v       [   u   n   o   p   t   i   m   i
0000240   z   e   d       +       d   e   b   u   g   i   n   f   o   ]
0000260       t   a   r   g   e   t   (   s   )       i   n       0   .
0000300   4   5       s   e   c   s  \n
0000310

If I pipe those through `cat` I get color in 1.20 and no color in 1.19. In any event, mach is somehow mangling the escape sequences.
I can reproduce this with this little bit of Python:
>>> import subprocess
>>> o,e = subprocess.Popen(['cargo', '+1.20.0', 'build', '--color=always'], stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()
>>> o
''
>>> e
'\x1b[m\x1b[m\x1b[32m\x1b[1m    Finished\x1b[m dev [unoptimized + debuginfo] target(s) in 0.0 secs\n'
>>> import sys
>>> sys.stdout.write(e)
[m[m[32m[1m    Finished[m dev [unoptimized + debuginfo] target(s) in 0.0 secs

A bit of googling revealed this WSL issue comment:
https://github.com/Microsoft/WSL/issues/1356#issuecomment-260704549

Trying that out I can make it work by calling SetConsoleMode:
```
import ctypes
import sys

ENABLE_PROCESSED_OUTPUT = 0x0001
ENABLE_VIRTUAL_TERMINAL_PROCESSING = 0x0004
STD_OUTPUT_HANDLE = -11
stdout = ctypes.windll.kernel32.GetStdHandle(STD_OUTPUT_HANDLE)
ctypes.windll.kernel32.SetConsoleMode(stdout, ENABLE_PROCESSED_OUTPUT | ENABLE_VIRTUAL_TERMINAL_PROCESSING)

s = '\x1b[m\x1b[m\x1b[32m\x1b[1m    Finished\x1b[m dev [unoptimized + debuginfo] target(s) in 0.0 secs\n'
sys.stdout.write(s)
```

Running this program on my Windows 10 machine prints the colors properly.
I'm pretty sure this cargo change is what changed the behavior:
https://github.com/rust-lang/cargo/commit/f8fb0a022853ab3f423889d442c695511ece3692
"Migrate from the `term` crate to `termcolor` "

GitHub shows that that landed in cargo 0.21, which shipped with Rust 1.20, and testing with ripgrep (which is the source of the termcolor crate) shows the same behavior. I guess it's not unreasonable, we are explicitly asking for colors, so it has to give them to us as escape sequences. We should be able to use some variant of the code I pasted above to enable this so it works properly.

Running cargo standalone doesn't have this problem because if its stdout/stderr are an actual console it can use the Win32 Console APIs for color instead of escape sequences.
I tried the obvious thing, using the Python code I wrote above, and it didn't fix the issue with cargo. Putting a test print of the string-with-escapes from that same sample code showed that that works, so there must be something else in mach or elsewhere breaking things here. It's possible some layer of our output processing is mucking things up.
IIRC the behavior of the `termcolor` crate correctly if you specify `--color=always` and it doesn't detect a terminal it'll still avoid colorization. It will listen to environment variables, though, so maybe the TERM env var is being set by something and Cargo is interpreting that as "oh now I'm a unix-like terminal"?
The msys terminal does have `TERM=cygwin` on my system, but I don't think that helps much in that we actually *do* want color output to work in this situation!
Component: mach → Build Config
Product: Core → Firefox Build System
See Also: → 1447096
Experimenting from comment 3, it seems that mozmake changes the console mode somehow. When mozmake returns, the console mode is reverted to 3L (rather than 5L being set before).

Interesting that, if mozmake runs cargo, it seems the console mode would be set to 7L if the initial state is 3L (maybe cargo changes the console mode as well). I have no idea why, and why this doesn't work in the same way in mach.
So, cargo, or specifically, termcolor, will change the console mode and make it able to output color.

If I invoke "mozmake -j8 -s -C objdir binaries" via a simply invoking Popen, on a build where only rust part is dirty, it indeed has the expected colored output. Also, if I forcefully make pass_thru=True in ProcessExecutionMixin.run_process, there is also colored output.

So it seems clear that, one source of the problem is mozprocess, which somehow invalidate everything. The code in mozprocess is rather complicated and I failed to follow how the output is handled at all.

On the other hand, if the backend is dirty as well (i.e. some configure files are changed, as I do during the experiment), and the backend regeneration happens before cargo build, the color disappears as well, even if Popen is used. So there is at least another source of problem in the regeneration process somehow.

The second part feels rather mysterious to me, since in that case termcolor should have been able to correctly set the console mode and have output colored, but it fails somehow?
I don't think it's mozprocess mucking things up here. I wrote a simple test program in C:
```
#include <stdio.h>
#include <Windows.h>

int main(int argc, char** argv)
{
  const char s[] = "\x1b[m\x1b[m\x1b[32m\x1b[1m    Finished\x1b[m dev [unoptimized + debuginfo] target(s) in 0.0 secs\n";
  HANDLE hStdout = GetStdHandle(STD_OUTPUT_HANDLE);
  DWORD mode;
  DWORD written;
  GetConsoleMode(hStdout, &mode);
  SetConsoleMode(hStdout, mode | ENABLE_VIRTUAL_TERMINAL_PROCESSING);
  (void)fwrite(s, sizeof(s), 1, stdout);
  return 0;
}
```

Building and running that locally produces colored output. I wrote a simple Python wrapper to invoke it using mozprocess:
```
#!/usr/bin/env python

from __future__ import print_function, unicode_literals

import ctypes
import ctypes.wintypes
import sys

ENABLE_VIRTUAL_TERMINAL_PROCESSING = 0x0004
STD_OUTPUT_HANDLE = -11
stdout = ctypes.windll.kernel32.GetStdHandle(STD_OUTPUT_HANDLE)
mode = ctypes.wintypes.DWORD()
ctypes.windll.kernel32.GetConsoleMode(stdout, ctypes.byref(mode))
ctypes.windll.kernel32.SetConsoleMode(stdout, mode.value | ENABLE_VIRTUAL_TERMINAL_PROCESSING)

import sys
from mozprocess.processhandler import ProcessHandlerMixin

def handleLine(line):
    # Converts str to unicode on Python 2 and bytes to str on Python 3.
    if isinstance(line, bytes):
        line = line.decode(sys.stdout.encoding or 'utf-8', 'replace')
    print(line)

args = sys.argv[1:]
p = ProcessHandlerMixin(args,
                        processOutputLine=[handleLine],
                        universal_newlines=True)
p.run()
p.processOutput()
p.wait()
```

...and that still works:
https://imgur.com/a/WjW9rpX

The handleLine function is taken almost verbatim from the code that mach uses for invoking processes via mozprocess:
https://dxr.mozilla.org/mozilla-central/rev/4e56a2f51ad739ca52046723448f3129a58f1666/python/mach/mach/mixin/process.py#81


When `mach build` starts the build it invokes make and passes as its line handler `BuildOutputManager.on_line`:
https://searchfox.org/mozilla-central/rev/0f4d50ff5211e8dd85e119ef683d04b63062ed90/python/mozbuild/mozbuild/controller/building.py#1144

That method passes the line to `self.log`:
https://searchfox.org/mozilla-central/rev/0f4d50ff5211e8dd85e119ef683d04b63062ed90/python/mozbuild/mozbuild/controller/building.py#681

which goes through the base class `OutputManager` somewhere:
https://searchfox.org/mozilla-central/rev/0f4d50ff5211e8dd85e119ef683d04b63062ed90/python/mozbuild/mozbuild/controller/building.py#610

I suspect something in the logging hierarchy is breaking things, but it gets a little too twisty for me to follow.
If you invoke the build system via `mach`, isatty() always fails since the stdio file descriptors are pipes from mach and not the original stdio file descriptors. And since most programs use isatty() to check for enabling color, color gets disabled.

We've hacked around this in various places by forcing commands invoked by the build system to enable color - but only if mach itself is attached to a TTY. Search for instances of MACH_STDOUT_ISATTY in the repo.

We also have the pass_thru=True option for invoked processes from mach/mozprocess/Python. If set, it sends the stdio file descriptors to the invoked process instead of using pipes, thus allowing native terminal detection to "just work."
This time, I suspect the reason is on env.

I use the following script:
```python
import ctypes
import ctypes.wintypes
import sys
import subprocess

stdout = ctypes.windll.kernel32.GetStdHandle(-11)
ctypes.windll.kernel32.SetConsoleMode(stdout, 7)

def print_mode():
    mode = ctypes.wintypes.DWORD()
    ctypes.windll.kernel32.GetConsoleMode(stdout, ctypes.byref(mode))
    print(mode)

args = sys.argv[1:]
print_mode()
p = subprocess.Popen(args, universal_newlines=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
for line in p.stdout:
    print(line.rstrip())
print_mode()
```

and program:
```c
#include <stdio.h>
int main(int argc, char** argv)
{
  printf("\x1b[m\x1b[m\x1b[32m\x1b[1m    Finished\x1b[m dev [unoptimized + debuginfo] target(s) in 0.0 secs\n");
  return 0;
}
```

If I run
> python test.py program.exe
then the output is correctly colored. However, if I do
> python test.py env program.exe
the output doesn't have color, and it can be seen that the console mode is also switched from 7L at the beginning to 3L after the running.

And indeed env seems to be a problem here, if I run
> python test.py env
directly, the console mode is also switched, unlike what can be seen from running program.exe alone.

I also tested the env.exe from msys2 which has a much higher version then ours, and the problem persists.

That probably means we may want to replace env with some tools we can control and we can ensure the console mode is set correctly there.

I had a brief look at source code of env, but cannot see anything interesting immediately. Probably the problem hides inside some initialization code or so. It's also interesting that, when the stdout is redirected, `GetStdHandle(STD_OUTPUT_HANDLE)` doesn't return the true console handle, and and I have no idea how they can get the correct handle to change the console mode...
Attached file SetConsoleMode.patch
I've tracked it down to fhandler_console::output_tcsetattr in MSYS [1], which is called from within fhandler_console::init. For some reason it always inits the console to the default ENABLE_PROCESSED_OUTPUT | ENABLE_WRAP_AT_EOL_OUTPUT. Patching out the call to SetConsoleMode allows Ted's patch to work. I don't see any straightforward way around this. I've attached a bsdiff patch, the only change is 0x90 5 times starting at 0x17475.

[1] https://github.com/msysgit/msys/blob/96f67347bc237ca689ad6f13e634d49f9de38cab/winsup/cygwin/fhandler_console.cc#L736
Attachment #9001467 - Attachment is patch: true
Attachment #9001467 - Attachment mime type: application/octet-stream → text/plain
Attachment #9001467 - Attachment is patch: false
Attachment #9001467 - Attachment mime type: text/plain → application/octet-stream
Attached file Patched msys-1.0.dll
Here's the actual patched DLL for those who don't want to dig for bspatch.
Wow, nice detective work!
I did a little bit of testing to see if we could just remove `env` from that commandline, but it turns out that that we still have the same problem, presumably because make winds up executing the whole `cargo` commandline via the shell, which presumably does the same thing. I tested with xidorn's test.py to verify that:
```
$ python test.py /bin/sh -c vt.exe
c_ulong(7L)
[m[m[32m[1m    Finished[m dev [unoptimized + debuginfo] target(s) in 0.0 secs

c_ulong(3L)
```
Does that mean we should redo bug 1399882 (for extra win beyond working in WSL)?
Just wanted to note that I can see colors (instead of control characters) when using ConEmu as my cmd.exe terminal. So people may want to try that as a workaround; plus ConEmu adds lots of nice features anyway: https://conemu.github.io/

I do see the control characters when building from vscode, so I'm still looking forward to a solution! Thanks to everybody working on this.
(In reply to Nathan Froyd [:froydnj] from comment #17)
> Does that mean we should redo bug 1399882 (for extra win beyond working in
> WSL)?

I think that doing so would let us work around this issue, certainly.

(In reply to Gerald Squelart [:gerald] from comment #18)
> I do see the control characters when building from vscode, so I'm still
> looking forward to a solution! Thanks to everybody working on this.

This is related but separate--presumably we're forcing cargo to output color and the VSCode output pane doesn't support VT escapes. We have code in mach that tries to detect if stdout is a console and if so, forces color output for build tools (since it's running all build output through a pipe):
https://dxr.mozilla.org/mozilla-central/rev/ac9f1219d11bf1a56ec1ace8e3ba9ff113b5cacb/python/mach/mach/main.py#352
Duplicate of this bug: 1535831
You need to log in before you can comment on or make changes to this bug.