Closed Bug 944551 Opened 9 years ago Closed 9 years ago

Build fails on Japanese or Russian Windows

Categories

(Firefox Build System :: General, defect)

x86_64
Windows 8.1
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla29

People

(Reporter: emk, Assigned: gps)

References

Details

Attachments

(2 files)

0:00.72 C:/mozilla-build/msys/bin/sh.exe -c c:/mozilla-build/python/python.exe
h:/m/mozilla-central/build/pymake/make.py -f client.mk -s
 0:02.72 Adding client.mk options from h:/m/mozilla-central/.mozconfig:
 0:02.73     AUTOCLOBBER=1
 0:02.73     FOUND_MOZCONFIG := h:/m/mozilla-central/.mozconfig
 0:03.42 TEST-PASS | check-sync-dirs.py | h:\m\mozilla-central\js\src\config <=
h:\m\mozilla-central\config
 0:07.28 Build configuration changed. Regenerating backend.
 0:07.44 Traceback (most recent call last):
 0:07.44   File "config.status", line 877, in <module>
 0:07.44     config_status(**args)
 0:07.44   File "h:\m\mozilla-central\build\ConfigStatus.py", line 74, in config
_status
 0:07.44     non_global_defines=non_global_defines, substs=substs)
 0:07.44   File "h:\m\mozilla-central\python\mozbuild\mozbuild\backend\configenv
ironment.py", line 126, in __init__
 0:07.44     serialize(self.substs[name])) for name in self.substs if self.subst
s[name]]))
 0:07.44 UnicodeDecodeError: 'ascii' codec can't decode byte 0x83 in position 21
: ordinal not in range(128)
 0:07.44 h:\m\mozilla-central\obj-i686-pc-mingw32\Makefile:82:0: command 'h:/m/m
ozilla-central/obj-i686-pc-mingw32/_virtualenv/Scripts/python.exe config.status'
 failed, return code 1
 0:07.44 <Makefile>: Found error
 0:07.44 Error remaking makefiles (ignored)
 0:07.44 h:\m\mozilla-central\client.mk:400:0: command 'c:/mozilla-build/python/
python.exe h:/m/mozilla-central/build/pymake/pymake/../make.py -j4 -C obj-i686-p
c-mingw32' failed, return code 2
 0:07.44 h:\m\mozilla-central\client.mk:185:0: command 'c:/mozilla-build/python/
python.exe h:/m/mozilla-central/build/pymake/pymake/../make.py -f h:/m/mozilla-c
entral/client.mk realbuild' failed, return code 2
 0:07.65 964 compiler warnings present.
2
Reverting 9af1bfc53b44 fixed the issue.
Blocks: 944265
I can kind of reproduce, but the error is not on the same trace:
Traceback (most recent call last):
  File "./config.status", line 939, in <module>
    config_status(**args)
  File "/home/mh/mozilla-central/build/ConfigStatus.py", line 74, in config_status
    non_global_defines=non_global_defines, substs=substs)
  File "/home/mh/mozilla-central/python/mozbuild/mozbuild/backend/configenvironment.py", line 118, in __init__
    shell_quote(self.defines[name]).replace('$', '$$')) for name in global_defines])
  File "/home/mh/mozilla-central/python/mozbuild/mozbuild/util.py", line 507, in shell_quote
    return "'%s'" % s.replace("'", "'\\''")

If I remove "from __future__ import unicode_literals" from the top of python/mozbuild/mozbuild/util.py, it works.
This is probably not the right thing to do, but this works for me:

diff --git a/python/mozbuild/mozbuild/util.py b/python/mozbuild/mozbuild/util.py
--- a/python/mozbuild/mozbuild/util.py
+++ b/python/mozbuild/mozbuild/util.py
@@ -495,13 +495,13 @@ class PushbackIter(object):
 def shell_quote(s):
     '''Given a string, returns a version enclosed with single quotes for use
     in a shell command line.
 
     As a special case, if given an int, returns a string containing the int,
     not enclosed in quotes.
     '''
     if type(s) == int:
-        return '%d' % s
+        return b'%d' % s
     # Single quoted strings can contain any characters unescaped except the
     # single quote itself, which can't even be escaped, so the string needs to
     # be closed, an escaped single quote added, and reopened.
-    return "'%s'" % s.replace("'", "'\\''")
+    return b"'%s'" % s.replace(b"'", b"'\\''")
Flags: needinfo?(gps)
(I think that having half our python modules use unicode_literals and half not makes things more complicated than they need to be)
(In reply to Mike Hommey [:glandium] from comment #2)
> If I remove "from __future__ import unicode_literals" from the top of
> python/mozbuild/mozbuild/util.py, it works.

Yeah, it worked for me, too.
We have to deal with Unicode in Python sooner or later. If we defer, we're simply digging a hole until we strive for Python 3 compatibility.

Currently, config.status is b'' not u''. We do produce Unicode strings for substs for sandbox reading since moz.build files are executed with unicode_literals. Bug 844509 tracks moving everything to Unicode.

In the mean time, I think updating shell_quote to return the text type that was originally passed in is the best course of action.
Flags: needinfo?(gps)
Can Python 3 read Unicode strings from environment, pass Unicode arguments to commands, output Unicode strings to console, etc. on Windows?
Something like this?
Attachment #8340616 - Flags: review?(gps)
Assignee: nobody → mh+mozilla
Status: NEW → ASSIGNED
Comment on attachment 8340616 [details] [diff] [review]
Handle unicode in mozbuild.util.shell_quote

Review of attachment 8340616 [details] [diff] [review]:
-----------------------------------------------------------------

::: python/mozbuild/mozbuild/util.py
@@ +504,5 @@
>      # Single quoted strings can contain any characters unescaped except the
>      # single quote itself, which can't even be escaped, so the string needs to
>      # be closed, an escaped single quote added, and reopened.
> +    t = type(s)
> +    return t("'%s'") % s.replace(t("'"), t("'\\''"))

It's not that simple.

The arguments to t() - to str() or unicode() - are string literals. Since this .py file has unicode_literals, you are feeding in unicode strings. You will have implicit type conversion and that is bad because Python assumes the encoding is ASCII and the type conversion will fail on strings having non-ASCII characters - that's what this bug is about. FWIW, Python 3 doesn't have this quite silly implicit type conversion: if you attempt to coerce a unicode into binary, an exception is raised.

You need to do something like:

if isinstance(s, unicode):
   return "'%s'" % s.replace("'", "'\\''")
else:
   return b"'%s'" % s.replace(b"'", b"'\\''")

Don't forget about the "return '%d' %s" bit above.
Attachment #8340616 - Flags: review?(gps) → review-
(In reply to Gregory Szorc [:gps] from comment #9)
> Comment on attachment 8340616 [details] [diff] [review]
> Handle unicode in mozbuild.util.shell_quote
> 
> Review of attachment 8340616 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: python/mozbuild/mozbuild/util.py
> @@ +504,5 @@
> >      # Single quoted strings can contain any characters unescaped except the
> >      # single quote itself, which can't even be escaped, so the string needs to
> >      # be closed, an escaped single quote added, and reopened.
> > +    t = type(s)
> > +    return t("'%s'") % s.replace(t("'"), t("'\\''"))
> 
> It's not that simple.
> 
> The arguments to t() - to str() or unicode() - are string literals. Since
> this .py file has unicode_literals, you are feeding in unicode strings. You
> will have implicit type conversion and that is bad because Python assumes
> the encoding is ASCII and the type conversion will fail on strings having
> non-ASCII characters - that's what this bug is about. FWIW, Python 3 doesn't
> have this quite silly implicit type conversion: if you attempt to coerce a
> unicode into binary, an exception is raised.

The interesting thing is that running config.status with a japanese string in defines just worked with this patch.

> You need to do something like:
> 
> if isinstance(s, unicode):
>    return "'%s'" % s.replace("'", "'\\''")
> else:
>    return b"'%s'" % s.replace(b"'", b"'\\''")
> 
> Don't forget about the "return '%d' %s" bit above.

What should it do, since the passed value is an int, then, and doesn't have a string type to use as a hint.
Any progress?
Until when do I have to apply the workaround to wait for the bikeshedding?
Flags: needinfo?(mh+mozilla)
Comment on attachment 8340616 [details] [diff] [review]
Handle unicode in mozbuild.util.shell_quote

(In reply to Gregory Szorc [:gps] from comment #9)
> > +    t = type(s)
> > +    return t("'%s'") % s.replace(t("'"), t("'\\''"))
> 
> It's not that simple.

Actually, it *is* that simple.
 
> The arguments to t() - to str() or unicode() - are string literals. Since
> this .py file has unicode_literals, you are feeding in unicode strings. You
> will have implicit type conversion and that is bad because Python assumes
> the encoding is ASCII and the type conversion will fail on strings having
> non-ASCII characters - that's what this bug is about.

The arguments to t() are string literals, sure. The first is literally «'%s'», the second, literally «'» and the third, literally «'\''». They're not going to change, they don't depend on what's given to shell_quote as an argument. If t is str and it's passed those unicode literals, it's not going to barf, because they *are* ascii. If t is unicode, then there's no problem.

This actually ensures all string handling in this function is done with the same type as what's given as argument. So in fact, it's strictly equivalent to your alternative.

As for the %d case, it just means that when we're passed an int, we return an unicode containing [0-9]. That's not going to cause any conversion problem.
Attachment #8340616 - Flags: review- → review?(gps)
Flags: needinfo?(mh+mozilla)
Slightly more robust version of patch.

emk: please confirm this fixes your issues.
Attachment #8345068 - Flags: review?(mh+mozilla)
Attachment #8345068 - Flags: feedback?(VYV03354)
Assignee: mh+mozilla → gps
Attachment #8340616 - Attachment is obsolete: true
Attachment #8340616 - Flags: review?(gps)
Comment on attachment 8345068 [details] [diff] [review]
Fix Python Unicode conversions related to shell escaping

Worked for me.
Attachment #8345068 - Flags: feedback?(VYV03354) → feedback+
Blocks: 948405
Comment on attachment 8345068 [details] [diff] [review]
Fix Python Unicode conversions related to shell escaping

Review of attachment 8345068 [details] [diff] [review]:
-----------------------------------------------------------------

I still prefer my approach. You concern about python3 is only half relevant: by the time we get in a situation where this code can run in python3, the issue of unicode vs. str in defines will have had to be resolved anyways.
Attachment #8345068 - Flags: review?(mh+mozilla)
Is it impossible land something first to fix the critical error (at least for me), then talk all you like to improve the patch?
Duplicate of this bug: 950066
Summary: Build fails on Japanese Windows → Build fails on Japanese or Russian Windows
Comment on attachment 8345068 [details] [diff] [review]
Fix Python Unicode conversions related to shell escaping

Ted: glandium and I have a difference of opinion on this. Would you care to weigh in? You won't hurt any feelings - I'm prepared to accept glandium's simpler patch if you feel it is appropriate.
Attachment #8345068 - Flags: review?(ted)
Comment on attachment 8345068 [details] [diff] [review]
Fix Python Unicode conversions related to shell escaping

Review of attachment 8345068 [details] [diff] [review]:
-----------------------------------------------------------------

I like glandium's simpler solution. I think it'll handle the cases we care about now, and I agree that if we move to Python 3 we're likely to have to overhaul this more significantly anyway.
Attachment #8345068 - Flags: review?(ted) → review-
Comment on attachment 8340616 [details] [diff] [review]
Handle unicode in mozbuild.util.shell_quote

Land it.
Attachment #8340616 - Attachment is obsolete: false
Attachment #8340616 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/57225ef2a7b7
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.