Closed Bug 1296503 Opened 8 years ago Closed 8 years ago

Dump config.status with unicode or byte strings, matching what is in the config we get out of python configure

Categories

(Firefox Build System :: General, defect)

defect
Not set
normal

Tracking

(firefox51 fixed)

RESOLVED FIXED
mozilla51
Tracking Status
firefox51 --- fixed

People

(Reporter: glandium, Assigned: glandium)

References

(Blocks 2 open bugs)

Details

Attachments

(3 files)

We're currently using json.dump, which dumps everything as unicode strings, but we encode them, and when config.status is read, they're all byte strings. We should change things such that what we get out of running config.status (or including it from other things) gets us the same data types for strings as what we got out of python configure in the first place.
Blocks: 1296504
Blocks: 1296508
Summary: Dump config.status with unicode or byte strings, fitting what is in the config we get out of python configure → Dump config.status with unicode or byte strings, matching what is in the config we get out of python configure
Comment on attachment 8782768 [details]
Bug 1296503 - Add an indented_repr function to mozbuild.util.

https://reviewboard.mozilla.org/r/72822/#review70704

::: python/mozbuild/mozbuild/util.py:1221
(Diff revision 2)
> +                for d in recurse_indented_repr(v, level + 1):
> +                    yield d
> +                yield ',\n'
> +            yield one_indent * level
> +            yield '}'
> +        elif isinstance(o, str):

Just FYI, the version I posted in that gist was tested on Python 2 and Python 3, and there are definitely some differences that would break things. I know we'd need a lot more work to be Python 3 compatible, but it might be worthwhile to make this code future-proof.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #6)
> Comment on attachment 8782768 [details]
> Bug 1296503 - Add an indented_repr function to mozbuild.util.
> 
> https://reviewboard.mozilla.org/r/72822/#review70704
> 
> ::: python/mozbuild/mozbuild/util.py:1221
> (Diff revision 2)
> > +                for d in recurse_indented_repr(v, level + 1):
> > +                    yield d
> > +                yield ',\n'
> > +            yield one_indent * level
> > +            yield '}'
> > +        elif isinstance(o, str):
> 
> Just FYI, the version I posted in that gist was tested on Python 2 and
> Python 3, and there are definitely some differences that would break things.
> I know we'd need a lot more work to be Python 3 compatible, but it might be
> worthwhile to make this code future-proof.

We have a *lot* of work to be Python 3 compatible, most related to... strings. That said, if by future-proof, you only mean "replace str with bytes", that's fine, but I won't go further.
There were basically just two things from the gist:
https://gist.github.com/luser/a9bab41bd9102cb4055429c5c79a2a11

```
utype = type(u'')
str_is_bytes = str is bytes
```

There's no `unicode` type in Python 3, it's just `str`, so the former is used as `isinstance(s, utype)`. The latter is to determine whether when something `isinstance(s, bytes)` if that means it needs a `b` literal, although that might not matter for your use.
OTOH, the same file already uses unicode, so that's something that would need fixing separately anyways. (also, I think it would be simpler to settle on bytes and unicode, where unicode would be defined in python 3 where it doesn't exist (and we can set it on __builtins__ in the virtualenv site.py or equivalent))
Comment on attachment 8782816 [details]
Bug 1296503 - Bonus: remove work around json.dump() mis-serialization of OptionValues.

https://reviewboard.mozilla.org/r/72860/#review71488
Attachment #8782816 - Flags: review?(ted) → review+
Comment on attachment 8782768 [details]
Bug 1296503 - Add an indented_repr function to mozbuild.util.

https://reviewboard.mozilla.org/r/72822/#review71494

::: python/mozbuild/mozbuild/test/test_util.py:913
(Diff revision 3)
> +                1,
> +                b'2',
> +                '3',
> +            ],
> +            'pile_of_bytes': b'\xf0\x9f\x92\xa9',
> +            'pile_of_poo': '
Comment on attachment 8782768 [details]
Bug 1296503 - Add an indented_repr function to mozbuild.util.

https://reviewboard.mozilla.org/r/72822/#review71500
Attachment #8782768 - Flags: review?(ted) → review+
Comment on attachment 8782769 [details]
Bug 1296503 - Switch config.status to unicode literals.

https://reviewboard.mozilla.org/r/72824/#review71514

::: configure.py:68
(Diff revision 3)
>      encoding = 'mbcs' if sys.platform == 'win32' else 'utf-8'
>      with codecs.open('config.status', 'w', encoding) as fh:
>          fh.write('#!%s\n' % config['PYTHON'])
>          fh.write('# coding=%s\n' % encoding)
> -        # Because we're serializing as JSON but reading as python, the values
> -        # for True, False and None are true, false and null, which don't exist.
> +        fh.write('from __future__ import unicode_literals\n')
> +        fh.write('from mozbuild.util import encode\n')

Any reason not to write this all one in one triple-quoted string?

::: configure.py:70
(Diff revision 3)
>          fh.write('#!%s\n' % config['PYTHON'])
>          fh.write('# coding=%s\n' % encoding)
> -        # Because we're serializing as JSON but reading as python, the values
> -        # for True, False and None are true, false and null, which don't exist.
> -        # Define them.
> -        fh.write('true, false, null = True, False, None\n')
> +        fh.write('from __future__ import unicode_literals\n')
> +        fh.write('from mozbuild.util import encode\n')
> +        # A lot of the build backend code is currently expecting byte
> +        # strings and breaks in subtle ways with unicode strings.

Is there a followup bug filed on this? If not, could you file one? Either way, it'd be good to have the bug number here.

::: configure.py:99
(Diff revision 3)
> +        # break the build, as well as making it inconsistent with re-running
> +        # config.status. Fortunately, EnumString derives from unicode, so it's
> +        # covered by converting unicode strings.
> +
> +        # A lot of the build backend code is currently expecting byte strings
> +        # and breaks in subtle ways with unicode strings.

It'd be great to get to a place where we are using nice types through the whole build system until we get to spitting out Makefiles.
Attachment #8782769 - Flags: review?(ted) → review+
Pushed by mh@glandium.org:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2dd83ba65714
Add an indented_repr function to mozbuild.util. r=ted
https://hg.mozilla.org/integration/mozilla-inbound/rev/ef5d7142aed0
Switch config.status to unicode literals. r=ted
https://hg.mozilla.org/integration/mozilla-inbound/rev/95b9f4f55e5b
Bonus: remove work around json.dump() mis-serialization of OptionValues. r=ted
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: