"UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 194: ordinal not in range(128)" when running mochitests

RESOLVED FIXED in Firefox 33

Status

Testing
Mochitest Chrome
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: fitzgen, Assigned: chmanchester)

Tracking

(Blocks: 1 bug)

unspecified
mozilla34
x86
Mac OS X
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(firefox33 fixed, firefox34 fixed)

Details

Attachments

(1 attachment, 3 obsolete attachments)

In the last week or so, something changed in mochitest and I can't run the debugger's tests anymore. They seem to be fine on TBPL, but I can't successfully run the suite without the following error being printed and then the process hanging. This is with no patches applied or anything.

Needless to say, this is very disruptive to my workflow! :(

$ ./mach mochitest-devtools browser/devtools/debugger/test > ~/scratch/test.log 
Exception in thread Thread-12:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 808, in __bootstrap_inner
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 761, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/Users/fitzgen/src/spidermonkey/testing/mozbase/mozprocess/mozprocess/processhandler.py", line 774, in _processOutput
    self.processOutputLine(line.rstrip())
  File "/Users/fitzgen/src/spidermonkey/testing/mozbase/mozprocess/mozprocess/processhandler.py", line 719, in processOutputLine
    handler(line)
  File "/Users/fitzgen/src/spidermonkey/obj.noindex/_tests/testing/mochitest/runtests.py", line 1809, in processOutputLine
    self.harness.message_logger.process_message(message)
  File "/Users/fitzgen/src/spidermonkey/obj.noindex/_tests/testing/mochitest/runtests.py", line 146, in process_message
    self.logger.log_raw(message)
  File "/Users/fitzgen/src/spidermonkey/testing/mozbase/mozlog/mozlog/structured/structuredlog.py", line 122, in log_raw
    self._handle_log(data)
  File "/Users/fitzgen/src/spidermonkey/testing/mozbase/mozlog/mozlog/structured/structuredlog.py", line 134, in _handle_log
    handler(data)
  File "/Users/fitzgen/src/spidermonkey/testing/mozbase/mozlog/mozlog/structured/handlers/__init__.py", line 66, in __call__
    self.stream.write(formatted.encode("utf8", "replace"))
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 351, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 194: ordinal not in range(128)
Similar to bug 1040966, but not the same. This is erroring actually writing the structured log. This is a regression from bug 886570.
Blocks: 886570
This is the same underlying bug I think, but wasn't fixed by your patch because it only fixed the input side.

The problem here is that we aren't clearly enforcing the rule that all string data that gets passed to the formatters is unicode, or that all data returned from them is unicode. If we pass in a byte string, and the formatter doesn't explicitly coerce to unicode we will get a bytestring back and when that contains non-unicode we can't encode it to utf8 because encode on a string first tries to decode as ascii and then reencode as whatever the target encoding is.

I have a plan to fix this in the structured logger by explicitly coercing all data to the right types, but I can't fix it until the morning (BST).
I made a start at fixing this in [1]. The idea is that all mozlog parameters are explicitly given types and we coerce to that type or fail. That means the data in the output object is all of known type (except in the case of deliberately unregulated "extras" data).

There are a few known problems:

 * Tests with tuple ids may not convert properly
 * dicts, notably run_info are not recursively typechecked
 * log_raw blows a huge hole in the safety as no typechecking / conversion is done.

All of these are solvable; the first should be simple. The second probably wants some syntax like
Dict("run_info", {"platform": Unicode, "bits": Int, [...]}). The third might require some refactoring of the code, but it might work if log_raw isn't really so raw but given a message like {"action":"test_start", "test":"test_id", "thread":"something"} we can pass the relevant arguments to the wrapped we create for StructuredLogger.test_start() then update the result with any other arguments. Since the wrappers are objects, we can just create a registry mapping function names to wrappers and generalise them a bit so they can either call the wrapped function or just return the data that they would have called it with.

[1] https://github.com/jgraham/gecko-dev/blob/ed26c2aba1db16f8b12d0a647dc3db91e948a704/testing/mozbase/mozlog/mozlog/structured/structuredlog.py
Pushed some changes to the branch to deal with tuples and log_raw. Dictionaries still need to be handled.

Chris or Ahmed: do you have time to work on this, add some tests and push it through? I need to spend some time on other things.
(Assignee)

Comment 5

3 years ago
(In reply to James Graham [:jgraham] from comment #4)
> Pushed some changes to the branch to deal with tuples and log_raw.
> Dictionaries still need to be handled.
> 
> Chris or Ahmed: do you have time to work on this, add some tests and push it
> through? I need to spend some time on other things.

I've been looking at this, and with only a few changes this looks like it will fix this regression avoid this issue for other harnesses I've been working on. The approach seems a pretty heavyweight, but I'm ok with it.

I'll ask for review once try looks ok and I can confirm locally this subsumes the fix in bug 1040966.
One problem with this aporoach is that it might be relatively slow. It would be good to get some benchmarks to see if it makes a big difference in a situation where we are also e.g. writing formatted and raw output to streams. If it is problematically slower we may need to climb down the abstraction ladder a little (but obviously benchmarking makes sense).
Blocks: 947044
(Assignee)

Comment 7

3 years ago
https://tbpl.mozilla.org/?tree=Try&rev=e56fe19112d3
(Assignee)

Comment 8

3 years ago
(In reply to James Graham [:jgraham] from comment #6)
> One problem with this aporoach is that it might be relatively slow. It would
> be good to get some benchmarks to see if it makes a big difference in a
> situation where we are also e.g. writing formatted and raw output to
> streams. If it is problematically slower we may need to climb down the
> abstraction ladder a little (but obviously benchmarking makes sense).

Comparing central to try runs like these (the mochitest jobs among them):

https://tbpl.mozilla.org/?tree=Try&rev=31b50c19350c
https://tbpl.mozilla.org/?tree=Try&rev=a340175e39c7

any performance penalty due to this patch is within usual fluctuations in run times for automation jobs.

I ran a script that just logs a bunch of test messages under cProfile, and the difference between runtime with and without the patch was pretty inconclusive. I can post the script if anyone is interested, but I'm pretty convinced this patch doesn't introduce a bottleneck.
(Assignee)

Comment 9

3 years ago
Created attachment 8464033 [details] [diff] [review]
Explicitly type-convert all input to mozlog.structured

This is jgraham's branch slightly rebased.
Attachment #8464033 - Flags: review?(ahalberstadt)
(Assignee)

Updated

3 years ago
Assignee: nobody → cmanchester
Status: NEW → ASSIGNED
(Assignee)

Comment 10

3 years ago
Created attachment 8464036 [details] [diff] [review]
Additional test and minor modification to mochitest's javascript logger

These are some additional tests and some small modifications this needed.

If we end up thinking this approach is overkill (I'm on the fence), I think the conversion routine for Unicode is the only thing in here that is solving an urgent problem we're having, so we could pull that out and use it in a simpler fix.
Attachment #8464036 - Flags: review?(ahalberstadt)
Attachment #8464036 - Flags: feedback?(james)
Comment on attachment 8464033 [details] [diff] [review]
Explicitly type-convert all input to mozlog.structured

Review of attachment 8464033 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good, r+ with the review comments addressed.

Before landing this I think it would be good to get some benchmarks with and without it though. I'll take a look at the other patch tomorrow.

::: testing/mozbase/mozlog/mozlog/structured/structuredlog.py
@@ +9,5 @@
>  from threading import current_thread, Lock
>  import json
>  import time
>  
> +from types import Unicode, TestId, Status, SubStatus, Dict, List, log_action, convertor_registry

Rename the module to something else, 'types' is also a system library. Maybe 'tokens' or 'lexeme' (if you want to get technical)?

::: testing/mozbase/mozlog/mozlog/structured/types.py
@@ +1,1 @@
> +convertor_registry = {}

Missing MPL header

@@ +74,5 @@
> +
> +        return data
> +
> +    def convert_known(self, **kwargs):
> +        print kwargs.keys()

Leftover debug statement?

@@ +80,5 @@
> +                        if name in self.args}
> +        return self.convert(**known_kwargs)
> +
> +missing = object()
> +no_default = object()

Define these at the top of the file.

@@ +101,5 @@
> +
> +        try:
> +            return self.convert(value)
> +        except:
> +            import traceback

import at the top

@@ +102,5 @@
> +        try:
> +            return self.convert(value)
> +        except:
> +            import traceback
> +            print traceback.format_exc()

nit: traceback.print_exc()
Attachment #8464033 - Flags: review?(ahalberstadt) → review+
Comment on attachment 8464036 [details] [diff] [review]
Additional test and minor modification to mochitest's javascript logger

Review of attachment 8464036 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/mochitest/tests/SimpleTest/TestRunner.js
@@ -242,5 @@
>          }
>  
>          var allData = {action: action,
>                         time: new Date().getTime(),
> -                       thread: null,

I think it makes more sense to default this to the empty string and allow pid to be null per the comment I'm about to make.

::: testing/mozbase/mozlog/mozlog/structured/handlers/__init__.py
@@ +64,2 @@
>              try:
> +                self.stream.write(formatted)

If we have a unicode object we should encode as utf8, always. For ascii strings this is equivalent anyway. A more reasonable approach here might be:

if isinstance(formatted, unicode):
    self.stream.write(formatted.encode("utf8", "replace"))
else:
    self.stream.write(formatted)

::: testing/mozbase/mozlog/mozlog/structured/logging_types.py
@@ +12,5 @@
> +        self.default_args = [
> +            Unicode("action"),
> +            Int("time"),
> +            Unicode("thread"),
> +            Int("pid"),

If you make this Int("pid", default=None) then we can set it to null when we don't know.

The other option is to make a Nullable type something like

class Nullable(DataType):
    def __init__(self, name, inner_type, default=no_default, optional=False):
        self.inner = inner_type(name, default=default, optional=optional)
        DataType.__init__(self, name, default, optional)

    def convert(self, value):
        if value is None:
             return None
        return self.inner.convert(value)

@@ +101,5 @@
> +        try:
> +            return self.convert(value)
> +        except:
> +            import traceback
> +            print traceback.format_exc()

This seems unnecessary.

@@ +108,5 @@
> +
> +class Unicode(DataType):
> +    def convert(self, data):
> +        try:
> +            rv = unicode(data)

Shouldn't this bit be

if isinstance(data, unicode):
    return data
if isinstance(data, str):
    return data.decode("utf8", "replace")
return unicode(data)

::: testing/mozbase/mozlog/mozlog/structured/structuredlog.py
@@ +8,5 @@
>  from threading import current_thread, Lock
>  import json
>  import time
>  
> +from logging_types import Unicode, TestId, Status, SubStatus, Dict, List, log_action, convertor_registry

I think module names with _ in violate PEP8, but you don't *have* to change it.

@@ +141,5 @@
>                  converted_data[k] = v
>  
>          data = self._make_log_data(action, converted_data)
> +
> +        if "expected" in data and "status" in data:

This should probably whitelist certain actions.

::: testing/mozbase/mozlog/tests/test_structured.py
@@ +344,5 @@
> +        self.logger.info(1)
> +        self.assert_log_equals({"action": "log",
> +                                "message": "1",
> +                                "level": "INFO"})
> +        self.logger.info([1, (2, '3'), "s", "s" + chr(255)])

Why not write chr(255) as \xFF?

@@ +350,5 @@
> +                                "message": "[1, (2, '3'), 's', 's\\xff']",
> +                                "level": "INFO"})
> +        self.logger.suite_end()
> +
> +    def test_unicode_write(self):

This isn't really testing "unicode" though. It's testing str containing utf8 bytes.
Attachment #8464036 - Flags: feedback?(james) → feedback+
(Assignee)

Comment 13

3 years ago
(In reply to James Graham [:jgraham] from comment #12)
> Comment on attachment 8464036 [details] [diff] [review]
> Additional test and minor modification to mochitest's javascript logger
> 
> Review of attachment 8464036 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: testing/mochitest/tests/SimpleTest/TestRunner.js
> @@ -242,5 @@
> >          }
> >  
> >          var allData = {action: action,
> >                         time: new Date().getTime(),
> > -                       thread: null,
> 
> I think it makes more sense to default this to the empty string and allow
> pid to be null per the comment I'm about to make.
> 
> ::: testing/mozbase/mozlog/mozlog/structured/handlers/__init__.py
> @@ +64,2 @@
> >              try:
> > +                self.stream.write(formatted)
> 
> If we have a unicode object we should encode as utf8, always. For ascii
> strings this is equivalent anyway. A more reasonable approach here might be:
> 
> if isinstance(formatted, unicode):
>     self.stream.write(formatted.encode("utf8", "replace"))
> else:
>     self.stream.write(formatted)

As confounding as it is, this fails when redirecting stdout to a file as in the bug report (but not when printing to the terminal). stdlib logging does a similar "try: ... except UnicodeError: ...".

Unless someone additional insight on how to fix this properly, I'm inclined to go with something like that.
(Assignee)

Comment 14

3 years ago
Created attachment 8464852 [details] [diff] [review]
Explicitly type-convert all input to mozlog.structured.

This is updated with review comments. Running on try here: https://tbpl.mozilla.org/?tree=Try&rev=09681df193a0

To reproduce the issue mentioned in comment 13, apply this patch and run |mach mochitest-devtools browser/devtools/debugger/test/browser_dbg_search-global-06.js > ~/scratch/test.log|
(Assignee)

Updated

3 years ago
Attachment #8464033 - Attachment is obsolete: true
Sorry, I didn't realize this was a patch based on the other patch. Chris, do you have the interdiff with just the changes you made on top of jgraham's branch by any chance?
Flags: needinfo?(cmanchester)
(Assignee)

Comment 16

3 years ago
(In reply to Andrew Halberstadt [:ahal] from comment #15)
> Sorry, I didn't realize this was a patch based on the other patch. Chris, do
> you have the interdiff with just the changes you made on top of jgraham's
> branch by any chance?

The attachment in comment 10 is just my changes to jgraham's branch. Comment 14 has modifications in response to both reviews.
Flags: needinfo?(cmanchester)
(In reply to Chris Manchester [:chmanchester] from comment #16)
> (In reply to Andrew Halberstadt [:ahal] from comment #15)
> > Sorry, I didn't realize this was a patch based on the other patch. Chris, do
> > you have the interdiff with just the changes you made on top of jgraham's
> > branch by any chance?
> 
> The attachment in comment 10 is just my changes to jgraham's branch. Comment
> 14 has modifications in response to both reviews.

Oh I see, the changes in logging_types.py just got squashed together because the file moved. Makes sense!
Comment on attachment 8464036 [details] [diff] [review]
Additional test and minor modification to mochitest's javascript logger

Review of attachment 8464036 [details] [diff] [review]:
-----------------------------------------------------------------

Lgtm.

::: testing/mozbase/mozlog/mozlog/structured/logging_types.py
@@ +28,5 @@
> +
> +            self.args[arg.name] = arg
> +
> +        for extra in self.default_args:
> +            self.args[extra.name] = extra

note: Could use a dictionary comprehension here, though this probably more readable.

@@ +118,5 @@
> +    def convert(self, data):
> +        if isinstance(data, unicode):
> +            return data
> +        elif isinstance(data, str):
> +            return data.decode("utf-8", "replace")

This pattern seems to be repeated a few times. Shouldn't it be in DataType.convert() and then the child classes can wrap it if they want (rather than override it completely).

::: testing/mozbase/mozlog/mozlog/structured/structuredlog.py
@@ +8,5 @@
>  from threading import current_thread, Lock
>  import json
>  import time
>  
> +from logging_types import Unicode, TestId, Status, SubStatus, Dict, List, log_action, convertor_registry

Actually PEP8 says underscores *can* be used for modules: http://legacy.python.org/dev/peps/pep-0008/#package-and-module-names. It's root level packages where underscores are discouraged. Though this name is mildly awkward :p.
Attachment #8464036 - Flags: review?(ahalberstadt) → review+
I changed the code in StreamHandler to;

            if isinstance(formatted, unicode):
                self.stream.write(formatted.encode("utf-8", "replace"))
            elif isinstance(formatted, str):
                self.stream.write(formatted)
            else:
                assert False, "Got output from the formatter of an unexpected type"

I couldn't reproduce an error with the testsuite above.
Will this patch fix the error in bug 1046852 as well? It looks like yes, but just want to double check since the other one is happening in a different location.
Would you mind getting this patch landed sooner rather than later?

This sort of thing sitting around for over a week isn't cool; it breaks people's ability to get work done.  (For example, I can't run the layout/style mochitests under ./mach mochitest-remote on FxOS emulator.)

I think it's also not cool that you're doing substantive development on the test harness in a way that can easily break individual tests for running locally, but without breaking what's running in automation.  I think you should prioritize fixing that ahead of any other further development of structured logging, given that it has caused multiple regressions for people running tests locally (at least this bug and bug 1041546 and bug 1040966, and that's not counting bug 1043015 since it wouldn't have shown up running the test in automation).
Flags: needinfo?(cmanchester)
I forgot that Chris was going PTO and I should have taken up the slack to get this in. He will be back on Monday, but it should be possible to get some movement here in the meantime.

It looks like a little more work is needed because the current patch has turned Windows builds orange.
Actually it's a different patch that's causing that orange, so I think it's possible to land this.
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=39572509925f

I figured out that mach does something unexpected with stdout; it wraps it in a StreamWriter, which has the effect that it's unsafe to send it a str and safe to send it unicode. This is exactly the opposite compared to a normal file, so it causes issues.
Created attachment 8466822 [details] [diff] [review]
Explicitly type-convert all input to mozlog.structured.;r=ahal

Looking for a quick sanity check of this final patch (Canada are on holiday Monday
so ahal won't be around). The only change is that we try to detect the case where 
mach has "helpfully" wrapped stdout and unwrap it. I'm not sure if this is a sane
long-term solution, but it does seem to work OK in my local testing.
Attachment #8464036 - Attachment is obsolete: true
Attachment #8464852 - Attachment is obsolete: true
Attachment #8466822 - Flags: review?(cmanchester)
(Assignee)

Comment 26

3 years ago
Comment on attachment 8466822 [details] [diff] [review]
Explicitly type-convert all input to mozlog.structured.;r=ahal

Review of attachment 8466822 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you for tracking down the issue with mach and stdout! This works locally. This is ready to land pending the two notes below.

::: testing/mozbase/mozlog/mozlog/structured/handlers/__init__.py
@@ +58,5 @@
> +        # to a normal stream i.e. you always have to pass in a Unicode
> +        # object rather than a string object. Cope with that by extracting
> +        # the underlying raw stream.
> +        if isinstance(stream, codecs.StreamWriter):
> +            stream = stream.stream

Another way to deal with this would be to use sys.__stdout__, I think it might be preferable because it addresses the exact issue we're having. I don't really think it will be a problem, but it's conceivable that this will not be correct for some uses of StreamWriter.

@@ +76,5 @@
> +                self.stream.write(formatted.encode("utf-8", "replace"))
> +            elif isinstance(formatted, str):
> +                self.stream.write(formatted)
> +            else:
> +                assert False, "Got output from the formatter of an unexpected type"

Would an exception or some other fall back be better here?
Attachment #8466822 - Flags: review?(cmanchester) → review+
(Assignee)

Updated

3 years ago
Blocks: 1046852
https://hg.mozilla.org/integration/mozilla-inbound/rev/7e0f0db0bb2c

Pushed this as-is for now to unblock people; let's figure out any remaining issues in a followup bug.

> ::: testing/mozbase/mozlog/mozlog/structured/handlers/__init__.py
> @@ +58,5 @@
> > +        # to a normal stream i.e. you always have to pass in a Unicode
> > +        # object rather than a string object. Cope with that by extracting
> > +        # the underlying raw stream.
> > +        if isinstance(stream, codecs.StreamWriter):
> > +            stream = stream.stream
> 
> Another way to deal with this would be to use sys.__stdout__, I think it
> might be preferable because it addresses the exact issue we're having. I
> don't really think it will be a problem, but it's conceivable that this will
> not be correct for some uses of StreamWriter.

I think pretty much any use of StreamWriter will be incorrect here. We assume that it's always safe to write bytes to the file, but StreamWriter makes that unsafe because it always tries to convert to Unicode which can fail.

It does mean that we could end up writing invalid utf8, which isn't possible with StreamWriter. But I think this is the right tradeoff.

> @@ +76,5 @@
> > +                self.stream.write(formatted.encode("utf-8", "replace"))
> > +            elif isinstance(formatted, str):
> > +                self.stream.write(formatted)
> > +            else:
> > +                assert False, "Got output from the formatter of an unexpected type"
> 
> Would an exception or some other fall back be better here?

Possibly making a last ditch attempt to convert to a string would be better e.g.

self.stream.write(unicode(formatted).encode("utf-8"))

It depends if we want to do whatever we can to encourage people to fix their formatter to return unicode-or-None, or if we want to make it as unlikely as possible to have a runtime error.
(In reply to James Graham [:jgraham] from comment #24)
> https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=39572509925f
> 
> I figured out that mach does something unexpected with stdout; it wraps it
> in a StreamWriter, which has the effect that it's unsafe to send it a str
> and safe to send it unicode. This is exactly the opposite compared to a
> normal file, so it causes issues.

mach attempts to be Python 3 compliant, which means unicode (not str) everywhere. This choice has been a royal PITA. However, it has succeeded in identifying code that takes the ostrich approach to Unicode.
(Assignee)

Comment 29

3 years ago
(In reply to David Baron [:dbaron] (UTC-7) (needinfo? for questions) from comment #21)
> Would you mind getting this patch landed sooner rather than later?
>
> This sort of thing sitting around for over a week isn't cool; it breaks
> people's ability to get work done.  (For example, I can't run the
> layout/style mochitests under ./mach mochitest-remote on FxOS emulator.)

That lack of progress resolving these regressions has impacted people’s productivity is something I regret and take very seriously.

>
> I think it's also not cool that you're doing substantive development on the
> test harness in a way that can easily break individual tests for running
> locally, but without breaking what's running in automation.  I think you
> should prioritize fixing that ahead of any other further development of
> structured logging, given that it has caused multiple regressions for people
> running tests locally (at least this bug and bug 1041546 and bug 1040966,
> and that's not counting bug 1043015 since it wouldn't have shown up running
> the test in automation).

Until mach is running in automation and comprehensive tests are implemented, we are forced to rely to a degree on cursory checks and feedback from developers when making these kinds of changes. While it wouldn’t have found this particular issue, the harness tests discussed in bug 1045316 would be extremely helpful when modifying test harnesses.
Flags: needinfo?(cmanchester)
Depends on: 1048422
Depends on: 1048423
https://hg.mozilla.org/mozilla-central/rev/7e0f0db0bb2c
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
https://hg.mozilla.org/releases/mozilla-aurora/rev/06067769c488
status-firefox33: --- → fixed
status-firefox34: --- → fixed
You need to log in before you can comment on or make changes to this bug.