bugzilla.mozilla.org will be intermittently unavailable on Saturday, March 24th, from 16:00 until 20:00 UTC.

ASCII vs unicode error in mozconfig.py _parse_loader_output()

RESOLVED FIXED in mozilla21


Firefox Build System
5 years ago
20 days ago


(Reporter: Jeff Hammel, Assigned: Jeff Hammel)



Firefox Tracking Flags

(Not tracked)



(1 attachment, 1 obsolete attachment)



5 years ago
From `make check`:

TEST-UNEXPECTED-FAIL | /home/jhammel/mozilla/src/mozilla-central/python/mozbuild/mozbuild/test/test_base.py | line 38, test_objdir_config_guess: 
ERROR: test_objdir_config_guess (__main__.TestMozbuildObject)
Traceback (most recent call last):
  File "/home/jhammel/mozilla/src/mozilla-central/python/mozbuild/mozbuild/test/test_base.py", line 38, in test_objdir_config_guess
  File "/home/jhammel/mozilla/src/mozilla-central/python/mozbuild/mozbuild/base.py", line 56, in topobjdir
    if self.mozconfig['topobjdir'] is None:
  File "/home/jhammel/mozilla/src/mozilla-central/python/mozbuild/mozbuild/base.py", line 71, in mozconfig
    self._mozconfig = loader.read_mozconfig()
  File "/home/jhammel/mozilla/src/mozilla-central/python/mozbuild/mozbuild/mozconfig.py", line 200, in read_mozconfig
    parsed = self._parse_loader_output(output)
  File "/home/jhammel/mozilla/src/mozilla-central/python/mozbuild/mozbuild/mozconfig.py", line 273, in _parse_loader_output
    if line.startswith('------BEGIN_'):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 5: ordinal not in range(128)

Looking more closely:

> /home/jhammel/mozilla/src/mozilla-central/python/mozbuild/mozbuild/mozconfig.py(278)_parse_loader_output()
-> if line.startswith('------BEGIN_'):
(Pdb) line
(Pdb) print line

Comment 1

5 years ago
Created attachment 702910 [details] [diff] [review]
simple fix
Attachment #702910 - Flags: review?(gps)

Comment 2

5 years ago
Comment on attachment 702910 [details] [diff] [review]
simple fix

Review of attachment 702910 [details] [diff] [review]:

Let's think about this some more.

First, let's come up with a unit test that emits your non-ascii byte sequence.

Next, let's consider our options.

If we assume the output is UTF-8, that may break mozconfigs that have non UTF-8 variable values. e.g. you set the application name to something funky. We could simply declare that things defined in mozconfigs must be UTF-8. Would that be acceptable? I'm not sure.

But, enforcing UTF-8 is only valid on things defined in the mozconfigs themselves. Your repro case has non UTF-8 in an environment variable not coming from mozconfigs. What should we do about these? It's tempting to leave environment variables as raw bytes...

The underlying problem you are seeing is due to Python 2's behavior of implicitly converting between str and unicode. I'm pretty sure this code will flat out fail on Python 3 because this implicit conversion is not allowed! I think part of our solution should be to change the string literals to b'' to prevent this implicit conversion. If we have values we want to convert to unicode, we should selectively .encode('utf-8') as appropriate.

Ugh. Re-reading this review it came out as convoluted. Unicode and Python makes my head hurt.

::: python/mozbuild/mozbuild/mozconfig.py
@@ +272,5 @@
> +            try:
> +                line = line.decode('utf8')
> +            except UnicodeDecodeError:
> +                pass

If we wanted to solve it this way, I'd just:

    line = line.decode('utf-8', 'replace')


    line = line.decode('utf-8', 'ignore')
Attachment #702910 - Flags: review?(gps)

Comment 3

5 years ago
I'm fine with either of these alternatives and don't particularly know any wonderful way of doing this across the board.  Indeed, I could have set my PS1 to some iso-9610 or whatever pageset and that would cause problems here.  I mostly want to be able to run make check without failures right now.

Comment 4

5 years ago
Created attachment 705994 [details] [diff] [review]
hacky workaround

Yep, its a horrible hack.  That said, it allows me to run `make check`
Attachment #702910 - Attachment is obsolete: true
Attachment #705994 - Flags: review?(gps)

Comment 5

5 years ago
Comment on attachment 705994 [details] [diff] [review]
hacky workaround

Please add a comment along the lines of "This is an ugly hack. Data may be lost from things like environment variable values."

Also, since this changes line from str to unicode, it will cause implicit type conversion in the code below. This will break Python 3 compatibility. Please file a follow-up to deal with this.
Attachment #705994 - Flags: review?(gps) → review+
Assignee: nobody → jhammel
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21


20 days ago
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.