Closed Bug 1414060 Opened 7 years ago Closed 6 years ago

Build fails with UnicodeDecodeError on some non-English locales of Windows

Categories

(Firefox Build System :: General, defect)

defect
Not set
normal

Tracking

(firefox64 fixed)

RESOLVED FIXED
mozilla64
Tracking Status
firefox64 --- fixed

People

(Reporter: emk, Assigned: emk)

References

Details

Attachments

(2 files, 2 obsolete files)

1:23.24 js\src> Creating config.status
 1:23.24 js\src> Traceback (most recent call last):
 1:23.24 js\src>   File "e:/m/mozilla-unified/build\..\configure.py", line 127, in <module>
 1:23.24 js\src>     sys.exit(main(sys.argv))
 1:23.24 js\src>   File "e:/m/mozilla-unified/build\..\configure.py", line 34, in main
 1:23.24 js\src>     return config_status(config)
 1:23.25 js\src>   File "e:/m/mozilla-unified/build\..\configure.py", line 94, in config_status
 1:23.25 js\src>     partial_config.write_vars(sanitized_config)
 1:23.25 js\src>   File "e:\m\mozilla-unified\python\mozbuild\mozbuild\backend\configenvironment.py", line 362, in write_vars
 1:23.25 js\src>     self.substs._fill_group(substs)
 1:23.25 js\src>   File "e:\m\mozilla-unified\python\mozbuild\mozbuild\backend\configenvironment.py", line 259, in _fill_group
 1:23.25 js\src>     new_files.add(self._write_file(k, v))
 1:23.26 js\src>   File "e:\m\mozilla-unified\python\mozbuild\mozbuild\backend\configenvironment.py", line 244, in _write_file
 1:23.27 js\src>     json.dump(value, fh, indent=4)
 1:23.28 js\src>   File "d:\mozilla-build\python\Lib\json\__init__.py", line 189, in dump
 1:23.30 js\src>     for chunk in iterable:
 1:23.30 js\src>   File "d:\mozilla-build\python\Lib\json\encoder.py", line 419, in _iterencode
 1:23.31 js\src>     yield _encoder(o)
 1:23.31 js\src> UnicodeDecodeError: 'utf8' codec can't decode byte 0x83 in position 0: invalid start byte
Bug 1384557 + 1402012 caused the issue:

2017-11-03T20:57:41: INFO : Narrowed inbound regression window from [8f4db4cf, cb773c66] (4 builds) to [776ded75, cb773c66] (2 builds) (~1 steps left)
2017-11-03T20:57:41: DEBUG : Starting merge handling...
2017-11-03T20:57:41: DEBUG : Using url: https://hg.mozilla.org/integration/autoland/json-pushes?changeset=cb773c661e0ca0bf297e977343076bef34411523&full=1
2017-11-03T20:57:42: DEBUG : Found commit message:
Bug 1402012 - Update buildconfig.py to use PartialConfigEnvironment; r=glandium

By using the PartialConfigEnvironment, the clients of buildconfig will
depend on config.statusd/ files instead of config.status directly.
Clients can access substs and defines using buildconfig.substs['FOO'] or
buildconfig.defines['BAR'], and then collect file-level dependencies for
make using buildconfig.get_dependencies(). All GENERATED_FILES rules
already make use of this because file_generate.py automatically includes
these dependencies (along with all python modules loaded).

As a result of this commit, re-running configure will no longer cause
the world to be rebuilt. Although config.status is updated, no build
steps use config.status directly and instead depend on values in
config.statusd/, which are written with FileAvoidWrite. Since those
files are not official targets according to the make backend, make won't
try to continually rebuild the backend when those files are out of date.
And since they are FileAvoidWrite, make will only re-run dependent steps
if the actual configure value has changed.

As a result of using JSON to load data from the config.statusd
directory, substs can be unicode (instead of a bare string type).
generate_certdata.py converts the subst manually to a string so the
value can be exported to the environment without issue on Windows.

Additionally, patching the buildconfig.substs dict no longer works, so
the unit-symbolstore.py test was modified to patch the underlying
buildconfig.substs._dict instead.

The other files that needed to be modified make use of all the defines
for the preprocessor. Those that are used during 'mach build' now use
buildconfig.defines['ALLDEFINES'], which maps to a special
FileAvoidWrite file generated for the PartialConfigEnvironment.

MozReview-Commit-ID: 2pJ4s3TVeS8

2017-11-03T20:57:42: INFO : The bisection is done.
2017-11-03T20:57:42: INFO : Stopped
Blocks: 1402012
Flags: needinfo?(ted)
Flags: needinfo?(mshal)
Flags: needinfo?(mh+mozilla)
Flags: needinfo?(ted)
Flags: needinfo?(mshal)
Flags: needinfo?(mh+mozilla)
Assignee: nobody → VYV03354
Status: NEW → ASSIGNED
Hello,

I got exactly this error today while trying to build FF nightly on Windows 10 (in french) and MSVC 2017.
MSVC 2017 was also configured to use french (therefore involving unicode chars in compiler messages).

After some time, i have switched MSVC 2017 to use English only and was able to run configure / build without this error.

Thanks,
I'll backout bug 1384557 tonight unless someone responds.
Yes, please back this out. I can no longer build either.

For me the problematic environment variable is CL_INCLUDES_PREFIX which, on a Japanese system, has the string "メモ: インクルード ファイル" encoded as Shift-JIS, i.e. 0x83 0x81 0x83 0x82...
I tried changing the language in Visual Studio 2017 but it only offered Japanese or "Same as Windows" (i.e. Japanese), despite having installed both the Japanese and English language packs. I uninstalled the Japanese language pack and got further but now I hit:

 1:19.93 Traceback (most recent call last):
 1:19.93   File "c:\mozilla-build\python\Lib\runpy.py", line 174, in _run_module_as_main
 1:19.93     "__main__", fname, loader, pkg_name)
 1:19.93   File "c:\mozilla-build\python\Lib\runpy.py", line 72, in _run_code
 1:19.93     exec code in run_globals
 1:19.93   File "c:\moz\src1\python\mozbuild\mozbuild\action\file_generate.py", line 112, in <module>
 1:19.93     sys.exit(main(sys.argv[1:]))
 1:19.93   File "c:\moz\src1\python\mozbuild\mozbuild\action\file_generate.py", line 63, in main
 1:19.93     ret = module.__dict__[method](output, *args.additional_arguments)
 1:19.93   File "c:/moz/src1/python/mozbuild/mozbuild/action/process_define_files.py", line 56, in process_define_file
 1:19.93     for name, val in config.defines['ALLDEFINES'].iteritems()))
 1:19.94   File "c:\moz\src1\python\mozbuild\mozbuild\backend\configenvironment.py", line 290, in __getitem__
 1:19.94     raise KeyError("'%s'" % key)
 1:19.94 KeyError: u"'ALLDEFINES'"
Sorry, that last error appears to just be the result of the previous failed build. Clobbering fixed it.
Fixed by a backout, but I'll leave this open until I confirm that these patches are merged.

Please ask me or *actually* stand-up a msvc-ja taskcluster job before removing NONASCII.
Attachment #8925216 - Flags: review?(core-build-config-reviews)
Comment on attachment 8925217 [details]
Bug 1414060 - Give json.dump the correct encoding.

https://reviewboard.mozilla.org/r/196438/#review202504

::: python/mozbuild/mozbuild/backend/configenvironment.py:245
(Diff revision 1)
>  
>      def _write_file(self, key, value):
> +        encoding = 'mbcs' if sys.platform == 'win32' else 'utf-8'
>          filename = mozpath.join(self._datadir, key)
>          with FileAvoidWrite(filename) as fh:
> -            json.dump(value, fh, indent=4)
> +            json.dump(value, fh, indent=4, encoding=encoding)

Do these values then load properly from the json.load? It should be a quick test like this:

$ cat test.py
import buildconfig
print("NONASCII: %s" % buildconfig.substs['NONASCII'])

$ ./mach python test.py

That will go through the json.load() part of the code.
Summary: Build fails with UnicodeDecodeError on Japanese Windows → Build fails with UnicodeDecodeError on some non-English locales of Windows
Attachment #8925217 - Flags: review?(core-build-config-reviews)
Comment on attachment 8925217 [details]
Bug 1414060 - Give json.dump the correct encoding.

https://reviewboard.mozilla.org/r/196438/#review202504

> Do these values then load properly from the json.load? It should be a quick test like this:
> 
> $ cat test.py
> import buildconfig
> print("NONASCII: %s" % buildconfig.substs['NONASCII'])
> 
> $ ./mach python test.py
> 
> That will go through the json.load() part of the code.

Yes. (The actual output will depend on the locale.)

$ ./mach python test.py
NONASCII: 。。
Comment on attachment 8925216 [details]
Bug 1414060 - Re-add NONASCII.

https://reviewboard.mozilla.org/r/196436/#review202540

::: build/moz.configure/windows.configure:479
(Diff revision 1)
>  set_config('CL_INCLUDES_PREFIX', msvc_showincludes_prefix)
> +
> +# Make sure that the build system can handle non-ASCII characters
> +# in environment variables to prevent it from breaking silently on
> +# non-English systems.
> +set_config('NONASCII', b'\241\241')

Actually, is there an easy way to make this non-Windows specific? Or is this the best way to test that your actual use-case isn't broken?
(In reply to Masatoshi Kimura [:emk] from comment #11)
> Fixed by a backout, but I'll leave this open until I confirm that these
> patches are merged.
> 
> Please ask me or *actually* stand-up a msvc-ja taskcluster job before
> removing NONASCII.

Sorry, I removed that on glandium's advice in bug 1384557. I tested that patch after rebasing with a local build, but I suppose we would have broken this either way since we don't have a CI build testing it.

I did try to get a msvc-ja build working in a VM locally but I could not figure out how to do it without installing a full ja-JP Windows, which would make it much harder to stand up in Taskcluster. If you can figure out some steps to get MSVC to output Japanese error messages in an en-US Windows install such that we could run it in our existing VM images that would be extremely helpful!
Comment on attachment 8925216 [details]
Bug 1414060 - Re-add NONASCII.

https://reviewboard.mozilla.org/r/196436/#review202540

> Actually, is there an easy way to make this non-Windows specific? Or is this the best way to test that your actual use-case isn't broken?

> Actually, is there an easy way to make this non-Windows specific?

Did you mean "Windows specific"? I thought windows.configure is Windows-specific. No?

Or "non-English specific"? The purpose of NONASCII is breaking builds intentionally on English Windows whenever it would have been broken on non-English Windows. So it does not make sense to make it non-English specific.
Comment on attachment 8925216 [details]
Bug 1414060 - Re-add NONASCII.

https://reviewboard.mozilla.org/r/196436/#review202540

> > Actually, is there an easy way to make this non-Windows specific?
> 
> Did you mean "Windows specific"? I thought windows.configure is Windows-specific. No?
> 
> Or "non-English specific"? The purpose of NONASCII is breaking builds intentionally on English Windows whenever it would have been broken on non-English Windows. So it does not make sense to make it non-English specific.

So I was wondering if it's possible to have the NONASCII setting break the build on other platforms (eg: English builds on Linux or OSX) if it is breaking non-English builds. Or is it just specific to the Windows encoding? If so, what you have is fine.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #17)
> Sorry, I removed that on glandium's advice in bug 1384557. I tested that
> patch after rebasing with a local build, but I suppose we would have broken
> this either way since we don't have a CI build testing it.

NONASCII actually breaks English CI. See comment #5. Please do not remove it just because it does not catch all bugs in the universe.
Comment on attachment 8925216 [details]
Bug 1414060 - Re-add NONASCII.

https://reviewboard.mozilla.org/r/196436/#review202540

> So I was wondering if it's possible to have the NONASCII setting break the build on other platforms (eg: English builds on Linux or OSX) if it is breaking non-English builds. Or is it just specific to the Windows encoding? If so, what you have is fine.

Yes, this is Windows-specific because other platforms uses UTF-8. Rather, NONASCII broke non-Windows English builds when it did *not* break non-English builds in the past, so I had to make it Windows-specific.
Comment on attachment 8925217 [details]
Bug 1414060 - Give json.dump the correct encoding.

https://reviewboard.mozilla.org/r/196438/#review202580
Attachment #8925217 - Flags: review+
(In reply to Masatoshi Kimura [:emk] from comment #20)
> NONASCII actually breaks English CI. See comment #5. Please do not remove it
> just because it does not catch all bugs in the universe.

I think glandium's recommendation was because it was not going to do anything useful at the time I wrote the patch, but clearly the changes in between have shown that to not be true. Again, I'm very sorry we broke your configuration and I would like us to find a way to do better here.
OK. So if we land these patches then I should be able to safely re-land bug 1384557?
These patches should be to land on top of bug 1384557, otherwise they will conflict.
I recommend you to merge these patches with patches in bug 1384557 to prevent bisect from hitting a broken changeset.

Also, please land patches early. I can't assure that patches will not break after 2 months.
Product: Core → Firefox Build System
I would like to get bug 1384557 relanded, and I think the patches in this bug
can legitmately be landed separately.  I am going to carry over mshal's r+ on
both of them.  Kimura-san, would you please test these to ensure that these
patches work correctly in a non-English environment?
Attachment #9007903 - Flags: review+
Attachment #9007903 - Flags: feedback?(VYV03354)
Attachment #8925216 - Attachment is obsolete: true
Kimura-san, would you please test this as well?  Thank you!
Attachment #9007904 - Flags: review+
Attachment #8925217 - Attachment is obsolete: true
Comment on attachment 9007903 [details] [diff] [review]
move NONASCII to moz.configure

Thank you. This (and attacment 9007904) is working correctly.
Attachment #9007903 - Flags: feedback?(VYV03354) → feedback+
Pushed by nfroyd@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5ba9ea66286d
move NONASCII to moz.configure; r=mshal
https://hg.mozilla.org/integration/mozilla-inbound/rev/be0107db18f7
give json.dump the correct encoding; r=mshal
https://hg.mozilla.org/mozilla-central/rev/5ba9ea66286d
https://hg.mozilla.org/mozilla-central/rev/be0107db18f7
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
You need to log in before you can comment on or make changes to this bug.