Handle Unicode in config.status
Categories
(Firefox Build System :: General, defect)
Tracking
(firefox70 fixed)
Tracking | Status | |
---|---|---|
firefox70 | --- | fixed |
People
(Reporter: gps, Assigned: glandium)
References
Details
Attachments
(4 files)
As described in bug 784841 comment #265 and later, our Unicode story for config.status is lacking. Notably, Python 3 compatibility is going to be very challenging since we have a lot of code that works in Python 3 mode where strings are Unicode and a lot of (mostly older) code that uses str/bytes. We should decide whether we want config.status dealing with bytes or Unicode then go from there. There is also a hybrid solution where we could add/modify autoconf macros to state the value is explicitly bytes or Unicode. From there, we could normalize to UTF-8 or something. If we decide config.status is to be bytes only, we should change config.status.m4 to emit b'' strings so Python 3 works as expected. This is kind of a wavy hands bug. I don't expect movement on it until we get serious about Python 3 compatibility.
Comment 1•10 years ago
|
||
I would love to transition configure to a Python script at some point, which would let us be more explicit about things. We could probably do it piecemeal by replacing just some autoconf macros with Python invocations.
Assignee | ||
Comment 2•10 years ago
|
||
I actually started working on a configure replacement in python, and I'm already thinking about the possibility of an incremental transition. I think we're likely to switch away from autoconf before we ever switch to python 3, so maybe WONTFIX?
Updated•5 years ago
|
Comment 3•4 years ago
|
||
Didn't know this bug existed, but the problem is getting fixed as part of bug 1473498.
(In reply to Mike Hommey [:glandium] from comment #2)
I think we're likely to switch away from autoconf before we ever switch to
python 3, so maybe WONTFIX?
:)
Assignee | ||
Comment 4•4 years ago
|
||
Actually, this is not a dupe. This was filed before python configure, and was about config.status before it was python. Which it is not anymore.
Assignee | ||
Comment 5•4 years ago
|
||
Re-actually, there's a reference to this bug in configenvironment.py, so let's make this about killing the manual decoding in there.
Assignee | ||
Comment 6•4 years ago
|
||
The configure sandbox has wrapped subprocess methods to do it since
bug 1520394.
OTOH, and while we're here, none was actually using the right encoding
for this conversion, so fix the configure sandbox to use the right one,
and make it stop using encode(), which does deep recursion that is not
necessary here, and that I'm trying to remove entirely.
Also while here, remove unused import of encode().
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Assignee | ||
Comment 7•4 years ago
|
||
With bug 1575135, we ensure nothing gets out of the configure sandbox
as a bytes string. We can thus now avoid the encode() pass in
configure.py. We still need, however, to normalize the configuration
so that it doesn't contain unexpected types, and conformning to what
indented_repr does to the configuration in config.status.
While here, convert some obj.iteritems() to six.iteritems(obj).
And remove the now unused encode function.
Assignee | ||
Comment 8•4 years ago
|
||
Now that the configuration comes in without bytes strings, there is no
need to convert it anymore.
Assignee | ||
Comment 9•4 years ago
|
||
Now that there is no bytes strings in it, we don't need to store
config.status in the system encoding to keep those valid. Moreover, the
system encoding is lossy, which utf-8 is not.
Comment 10•4 years ago
|
||
Pushed by mh@glandium.org: https://hg.mozilla.org/integration/autoland/rev/f47e966507eb Don't encode environment in get_cmd_output and old_configure. r=nalexander https://hg.mozilla.org/integration/autoland/rev/5d316c7942ee Don't encode the configuration that configure passes to config_status. r=nalexander https://hg.mozilla.org/integration/autoland/rev/a07ecf63a897 Don't create a separate unicode version of the build config. r=nalexander https://hg.mozilla.org/integration/autoland/rev/8beaed1849e8 Always encode config.status as utf-8. r=nalexander
Comment 11•4 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/f47e966507eb
https://hg.mozilla.org/mozilla-central/rev/5d316c7942ee
https://hg.mozilla.org/mozilla-central/rev/a07ecf63a897
https://hg.mozilla.org/mozilla-central/rev/8beaed1849e8
Description
•