Closed Bug 831381 Opened 11 years ago Closed 11 years ago

ASCII vs unicode error in mozconfig.py _parse_loader_output()

Categories

(Firefox Build System :: General, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla21

People

(Reporter: k0scist, Assigned: k0scist)

References

Details

Attachments

(1 file, 1 obsolete file)

From `make check`:

TEST-UNEXPECTED-FAIL | /home/jhammel/mozilla/src/mozilla-central/python/mozbuild/mozbuild/test/test_base.py | line 38, test_objdir_config_guess: 
ERROR: test_objdir_config_guess (__main__.TestMozbuildObject)
Traceback (most recent call last):
  File "/home/jhammel/mozilla/src/mozilla-central/python/mozbuild/mozbuild/test/test_base.py", line 38, in test_objdir_config_guess
    self.assertIsNotNone(base.topobjdir)
  File "/home/jhammel/mozilla/src/mozilla-central/python/mozbuild/mozbuild/base.py", line 56, in topobjdir
    if self.mozconfig['topobjdir'] is None:
  File "/home/jhammel/mozilla/src/mozilla-central/python/mozbuild/mozbuild/base.py", line 71, in mozconfig
    self._mozconfig = loader.read_mozconfig()
  File "/home/jhammel/mozilla/src/mozilla-central/python/mozbuild/mozbuild/mozconfig.py", line 200, in read_mozconfig
    parsed = self._parse_loader_output(output)
  File "/home/jhammel/mozilla/src/mozilla-central/python/mozbuild/mozbuild/mozconfig.py", line 273, in _parse_loader_output
    if line.startswith('------BEGIN_'):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 5: ordinal not in range(128)

Looking more closely:

> /home/jhammel/mozilla/src/mozilla-central/python/mozbuild/mozbuild/mozconfig.py(278)_parse_loader_output()
-> if line.startswith('------BEGIN_'):
(Pdb) line
"PS1='\xe2\x94\x82'"
(Pdb) print line
PS1='│'
Attached patch simple fix (obsolete) — Splinter Review
Attachment #702910 - Flags: review?(gps)
Comment on attachment 702910 [details] [diff] [review]
simple fix

Review of attachment 702910 [details] [diff] [review]:
-----------------------------------------------------------------

Let's think about this some more.

First, let's come up with a unit test that emits your non-ascii byte sequence.

Next, let's consider our options.

If we assume the output is UTF-8, that may break mozconfigs that have non UTF-8 variable values. e.g. you set the application name to something funky. We could simply declare that things defined in mozconfigs must be UTF-8. Would that be acceptable? I'm not sure.

But, enforcing UTF-8 is only valid on things defined in the mozconfigs themselves. Your repro case has non UTF-8 in an environment variable not coming from mozconfigs. What should we do about these? It's tempting to leave environment variables as raw bytes...

The underlying problem you are seeing is due to Python 2's behavior of implicitly converting between str and unicode. I'm pretty sure this code will flat out fail on Python 3 because this implicit conversion is not allowed! I think part of our solution should be to change the string literals to b'' to prevent this implicit conversion. If we have values we want to convert to unicode, we should selectively .encode('utf-8') as appropriate.

Ugh. Re-reading this review it came out as convoluted. Unicode and Python makes my head hurt.

::: python/mozbuild/mozbuild/mozconfig.py
@@ +272,5 @@
>  
> +            try:
> +                line = line.decode('utf8')
> +            except UnicodeDecodeError:
> +                pass

If we wanted to solve it this way, I'd just:

    line = line.decode('utf-8', 'replace')

or

    line = line.decode('utf-8', 'ignore')
Attachment #702910 - Flags: review?(gps)
I'm fine with either of these alternatives and don't particularly know any wonderful way of doing this across the board.  Indeed, I could have set my PS1 to some iso-9610 or whatever pageset and that would cause problems here.  I mostly want to be able to run make check without failures right now.
Attached patch hacky workaroundSplinter Review
Yep, its a horrible hack.  That said, it allows me to run `make check`
Attachment #702910 - Attachment is obsolete: true
Attachment #705994 - Flags: review?(gps)
Comment on attachment 705994 [details] [diff] [review]
hacky workaround

Please add a comment along the lines of "This is an ugly hack. Data may be lost from things like environment variable values."

Also, since this changes line from str to unicode, it will cause implicit type conversion in the code below. This will break Python 3 compatibility. Please file a follow-up to deal with this.
Attachment #705994 - Flags: review?(gps) → review+
https://hg.mozilla.org/mozilla-central/rev/8a3e1130be71
Assignee: nobody → jhammel
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: