Closed Bug 991866 Opened 10 years ago Closed 5 years ago

Unicode decode error trying to run mochitest-remote for b2g-emulator M10

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: jesup, Assigned: ahal)

References

Details

Attachments

(3 files, 4 obsolete files)

mozprocess_decode 10 years ago Andrew Halberstadt [:ahal] 853 bytes, patch		Details \| Diff \| Splinter Review
mozprocess unicode 10 years ago Andrew Halberstadt [:ahal] 4.11 KB, patch	gps : feedback+	Details \| Diff \| Splinter Review
encoding-issue.diff 10 years ago Sylvestre Ledru [:Sylvestre] 1.06 KB, patch	ahal : review-	Details \| Diff \| Splinter Review
mozprocess unicode 10 years ago Andrew Halberstadt [:ahal] 3.35 KB, patch		Details \| Diff \| Splinter Review
Attempt to decode to utf-8, then iso8859-1 10 years ago Andrew Halberstadt [:ahal] 1.33 KB, patch		Details \| Diff \| Splinter Review
Attempt to decode to utf-8, then iso8859-1 10 years ago Andrew Halberstadt [:ahal] 1.38 KB, patch	gps : review+	Details \| Diff \| Splinter Review
workaround on the utf-8 display failing 10 years ago Sylvestre Ledru [:Sylvestre] 1.19 KB, patch	ahal : review+	Details \| Diff \| Splinter Review

Randell Jesup [:jesup] (needinfo me)

Reporter

Description

•

10 years ago

./mach mochitest-remote --total-chunks 15 --this-chunk 10 
After a while, I get the following error (repeatable) and the framework stops logging.  The tests seem to continue on.

1326 INFO TEST-PASS | /tests/dom/smil/test/test_smilAccessKey.xhtml | Unexpected start time for animation with begin: accessKey(a)
1327 INFO TEST-PASS | /tests/dom/smil/test/test_smilAccessKey.xhtml | No interval created for 'accessKey(a)' with input [charCode: 97; keyCode: 66]
1328 INFO TEST-PASS | /tests/dom/smil/test/test_smilAccessKey.xhtml | Expected animations to be paused
1329 INFO TEST-PASS | /tests/dom/smil/test/test_smilAccessKey.xhtml | Unexpected exception: InvalidStateError
1330 INFO TEST-PASS | /tests/dom/smil/test/test_smilAccessKey.xhtml | Unexpected exception code: 11
1331 INFO TEST-PASS | /tests/dom/smil/test/test_smilAccessKey.xhtml | Interval unexpectedly created for 'accessKey(a)' with input [charCode: 98; keyCode: 97]
1332 INFO TEST-PASS | /tests/dom/smil/test/test_smilAccessKey.xhtml | Expected animations to be paused
[Child 299] WARNING: jschar out of char range; high bits of data lost: 0x20ac: file ../../../../js/xpconnect/src/XPCConvert.cpp, line 354
Exception in thread Thread-4:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.7/threading.py", line 764, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/jesup/src/mozilla/B2G-emulator/../inbound/testing/mozbase/mozprocess/mozprocess/processhandler.py", line 774, in _processOutput
    self.processOutputLine(line.rstrip())
  File "/home/jesup/src/mozilla/B2G-emulator/../inbound/testing/mozbase/mozprocess/mozprocess/processhandler.py", line 719, in processOutputLine
    handler(line)
  File "/home/jesup/src/mozilla/B2G-emulator/../inbound/testing/mozbase/mozprocess/mozprocess/processhandler.py", line 916, in __call__
    self.stream.write(line + '\n')
  File "/usr/lib64/python2.7/codecs.py", line 351, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 128: ordinal not in range(128)

Andrew Halberstadt [:ahal]

Assignee

Comment 1

•

10 years ago

Attached patch mozprocess_decode (obsolete) — Details — Splinter Review

This should fix your immediate problem. Though I think this bug raises some questions around how we handle unicode in general throughout mozprocess. I want to spend some time testing it out more and make sure there aren't other hidden bugs like this waiting to be exposed.

Randell Jesup [:jesup] (needinfo me)

Reporter

Comment 2

•

10 years ago

This appears to work for me (got a crash on shutdown, but not due to this).
Thanks!

Gregory Szorc [:gps]

Comment 3

•

10 years ago

Comment on attachment 8401521 [details] [diff] [review]
mozprocess_decode

You want this to be "line.decode('utf-8', 'replace')"

Otherwise, if invalid UTF-8 is seen, the work will blow up again.

We dealt with this problem extensively in mach land. See https://hg.mozilla.org/mozilla-central/file/6c924a018540/python/mach/mach/mixin/process.py#l80

What I'm trying to hold the line on is that any textual data inside Python be converted to unicode (not str) at the process border. If we fail to do that, we're setting ourselves up for all kinds of str/bytes/unicode implicit conversion fail come Python 3.

You can keep things as str now. But I highly recommend you to rewrite mozprocess to be encoding aware. In this patch, you should look at self.stream.encoding and perform the encoding explicitly in __call__. If Python 2 is left to do it, it will likely do the wrong thing. Python 3 will refuse outright.

Andrew Halberstadt [:ahal]

Assignee

Comment 4

•

10 years ago

Attached patch mozprocess unicode (obsolete) — Details — Splinter Review

I think this is what we want. Not all file-like objects have an 'encoding' property (like StringIO which some of our tests use), so I had to add a check for that. In this case I think it is reasonable to leave it up to the user to make sure the object's StreamWriter handles unrecognized input properly.

Gps, mind doing the review since you've dealt with similar issues before? If not feel free to redirect to :wlach.

Assignee: nobody → ahalberstadt

Attachment #8401521 - Attachment is obsolete: true

Status: NEW → ASSIGNED

Attachment #8401905 - Flags: review?(gps)

Gregory Szorc [:gps]

Comment 5

•

10 years ago

Comment on attachment 8401905 [details] [diff] [review]
mozprocess unicode

Review of attachment 8401905 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/mozbase/mozprocess/mozprocess/processhandler.py
@@ +771,5 @@
>              (lines, self.didTimeout) = self.readWithTimeout(logsource, lineReadTimeout)
>              while lines != "":
>                  for line in lines.splitlines():
> +                    if isinstance(line, bytes):
> +                        line = line.decode(sys.stdout.encoding or 'utf-8', 'replace')

This actually scares me a bit. This is a backwards-incompatible API change and may cause other things (like mach) to start barfing. I feel this is already a house of cards and things are ready to topple.

I would perform the encoding/decoding in StreamOutput and leave this be, if possible. Later on, we can figure out a way to make mozprocess consistently emit str or unicode.

Sound like a plan?

@@ +915,5 @@
>          self.stream = stream
>  
>      def __call__(self, line):
> +        if hasattr(self.stream, 'encoding'):
> +            line = line.encode(self.stream.encoding or 'utf-8', 'replace')

Presumably you are expecting a unicode here. I may assert that.

Keep in mind the common operations for str/bytes <-> unicode are:

str.decode() -> unicode
unicode.encode() -> str

str.encode() is only really useful for the non-unicode encodings. e.g. b'foo'.encode('rot13'). See https://docs.python.org/2/library/codecs.html#standard-encodings.

I guess technically the output stream could have one of those non-unicode encodings. It's up to you. It's definitely safest to avoid the type check, especially if you won't be coercing the lines to unicode as they come off the process.

::: testing/mozbase/mozprocess/tests/procunicode.py
@@ +1,2 @@
> +# -*- coding: utf-8 -*-
> +print '字åäö'

This isn't Python 3 compatible. You'll want to do:

print(u'字åäö'.encode('utf-8'))

::: testing/mozbase/mozprocess/tests/test_mozprocess_output.py
@@ +70,5 @@
>  
> +    def test_handle_unicode(self):
> +        """
> +        Handlers don't trip up on non-ascii output
> +        """

w00t - I'm very happily to see unicode tests here!

Attachment #8401905 - Flags: review?(gps) → feedback+

Andrew Halberstadt [:ahal]

Assignee

Comment 6

•

10 years ago

So is the patch I obsoleted basically what you are suggesting we go with for now?

https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=991866&attachment=8401521

Flags: needinfo?(gps)

Gregory Szorc [:gps]

Comment 7

•

10 years ago

Yes, I think the initial patch was pretty good. It just needed an "ignore" or "replace" so invalid utf-8 byte sequences wouldn't raise an exception.

Flags: needinfo?(gps)

Sylvestre Ledru [:Sylvestre]

Comment 9

•

10 years ago

Attached patch encoding-issue.diff (obsolete) — Details — Splinter Review

Andrew, the same patch as in bug 1009490 (just updated the bug number).
I am doing an encoding with iso because .decode("utf-8") is failing with "UnicodeDecodeError: 'utf8' codec can't decode byte 0xa0 in position 29: invalid start byte"

Attachment #8428713 - Flags: review?(ahalberstadt)

Andrew Halberstadt [:ahal]

Assignee

Comment 10

•

10 years ago

Comment on attachment 8428713 [details] [diff] [review]
encoding-issue.diff

Review of attachment 8428713 [details] [diff] [review]:
-----------------------------------------------------------------

I don't think we want to decode to ISO-8859-1 in mozprocess here. It may fix your specific case, but it will trip up on Asian or even eastern European characters. I'll update my old obsoleted patch here and add "replace" to the call to decode which should replace the character in question with a '?' if it can't be decoded to utf-8

Attachment #8428713 - Flags: review?(ahalberstadt) → review-

Andrew Halberstadt [:ahal]

Assignee

Comment 11

•

10 years ago

Attached patch mozprocess unicode — Details — Splinter Review

Sorry, this slipped off my radar for a while there. Continuing the old discussion with :gps..

If I just do "line.decode('utf-8', 'replace')" like we talked about, the test case I created fails. Also the fact that :sylvestre is running into problems with ISO-8859-1 encoded strings makes me think maybe we should coerce everything to unicode. As per your suggestion though, I'll move it into the StreamOutput handler for now.

:sylvestre, can you verify this fixes your problem?

Attachment #8401905 - Attachment is obsolete: true

Attachment #8428874 - Flags: review?(gps)

Flags: needinfo?(sledru)

Sylvestre Ledru [:Sylvestre]

Comment 12

•

10 years ago

Same issue with your patch:
----
  Putting entry 175...    TestUniChar::TestUniChar() 175
OK...
Testing Get:
Exception in thread Thread-12:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/data/mozilla-central-coverage/testing/mozbase/mozprocess/mozprocess/processhandler.py", line 774, in _processOutput
    self.processOutputLine(line.rstrip())
  File "/data/mozilla-central-coverage/testing/mozbase/mozprocess/mozprocess/processhandler.py", line 719, in processOutputLine
    handler(line)
  File "/data/mozilla-central-coverage/testing/mozbase/mozprocess/mozprocess/processhandler.py", line 921, in __call__
    self.stream.write(line + '\n')
  File "/usr/lib/python2.7/codecs.py", line 351, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 29: ordinal not in range(128)
----

Flags: needinfo?(sledru)

Andrew Halberstadt [:ahal]

Assignee

Comment 13

•

10 years ago

Comment on attachment 8428874 [details] [diff] [review]
mozprocess unicode

:sylvestre, do you happen to know what character it is tripping up on? It would be good to add a test for it once we figure it out what the problem is. I'm not quite sure how to fix this in the general case.. maybe we should just take your initial try: except patch after all..

Attachment #8428874 - Flags: review?(gps)

Flags: needinfo?(sledru)

Sylvestre Ledru [:Sylvestre]

Comment 14

•

10 years ago

It fails on:
  Getting entry nbsp...Found  
  Getting entry iexcl...Found ¡
  Getting entry cent...Found ¢
  Getting entry pound...Found £
  Getting entry curren...Found ¤
  Getting entry yen...Found ¥

So, basic stuff.

(which I can see with my patch).

Flags: needinfo?(sledru)

Sylvestre Ledru [:Sylvestre]

Comment 15

•

10 years ago

Andrew, are you ok with my patch? thanks

Flags: needinfo?(ahalberstadt)

Andrew Halberstadt [:ahal]

Assignee

Comment 16

•

10 years ago

Attached patch Attempt to decode to utf-8, then iso8859-1 (obsolete) — Details — Splinter Review

No, but here's a variation on your patch that should do the same thing. This patch actually fails the test I created for the last patch, but I think we should go ahead and land this as it should fix the problem for most people who run into this. We can re-open if there are still problems with other random encodings in the future.

Attachment #8428713 - Attachment is obsolete: true

Attachment #8433453 - Flags: review?(gps)

Flags: needinfo?(ahalberstadt)

Andrew Halberstadt [:ahal]

Assignee

Comment 17

•

10 years ago

Attached patch Attempt to decode to utf-8, then iso8859-1 — Details — Splinter Review

Sorry, I don't think the last patch would ever re-raise the exception in that else clause. Here's an updated patch that does the right thing.

Attachment #8433453 - Attachment is obsolete: true

Attachment #8433453 - Flags: review?(gps)

Attachment #8433472 - Flags: review?(gps)

Gregory Szorc [:gps]

Updated

•

10 years ago

Attachment #8433472 - Flags: review?(gps) → review+

Andrew Halberstadt [:ahal]

Assignee

Comment 18

•

10 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/12536416d070

Ryan VanderMeulen [:RyanVM]

Comment 19

•

10 years ago

Backed out for breaking B2G mochitests.
https://hg.mozilla.org/integration/mozilla-inbound/rev/ad91b00b5af9

https://tbpl.mozilla.org/php/getParsedLog.php?id=41053086&tree=Mozilla-Inbound

Ryan VanderMeulen [:RyanVM]

Comment 20

•

10 years ago

And Cpp tests.
https://tbpl.mozilla.org/php/getParsedLog.php?id=41053472&tree=Mozilla-Inbound

Ryan VanderMeulen [:RyanVM]

Comment 21

•

10 years ago

And Gaia UI tests.
https://tbpl.mozilla.org/php/getParsedLog.php?id=41055208&tree=Mozilla-Inbound

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 22

•

10 years ago

And Marionette.
https://tbpl.mozilla.org/php/getParsedLog.php?id=41061576&tree=Mozilla-Inbound

Sylvestre Ledru [:Sylvestre]

Comment 23

•

10 years ago

Andrew, I tried on try with my attachment from bug 1009490: 
https://tbpl.mozilla.org/?tree=Try&rev=d4d7c66b4188

It seems green.

Andrew Halberstadt [:ahal]

Assignee

Comment 24

•

10 years ago

Ok I guess that won't make anything worse off than it already is, so if you want to get it landed in the meantime I'm ok with that (though it isn't a fix to this bug). I'd want to at least include a 'TODO' comment with a reference to this bug.

Would you mind re-uploading that patch with the comment included and flag me for review?

Sylvestre Ledru [:Sylvestre]

Comment 25

•

10 years ago

Attached patch workaround on the utf-8 display failing — Details — Splinter Review

Thanks!

Attachment #8450143 - Flags: review?(ahalberstadt)

Andrew Halberstadt [:ahal]

Assignee

Updated

•

10 years ago

Attachment #8450143 - Attachment is patch: true

Andrew Halberstadt [:ahal]

Assignee

Comment 26

•

10 years ago

Comment on attachment 8450143 [details] [diff] [review]
workaround on the utf-8 display failing

Review of attachment 8450143 [details] [diff] [review]:
-----------------------------------------------------------------

Lgtm, thanks!

Attachment #8450143 - Flags: review?(ahalberstadt) → review+

Andrew Halberstadt [:ahal]

Assignee

Comment 27

•

10 years ago

The above patch is a workaround that may improve things for people in western Europe, but will still fail in other cases. Leaving open to fix this properly (eventually).

Whiteboard: [leave-open]

Sylvestre Ledru [:Sylvestre]

Comment 28

•

10 years ago

Here it is:
https://hg.mozilla.org/integration/mozilla-inbound/rev/123397342cd1

Tim Taubert [:ttaubert] (inactive)

Comment 29

•

10 years ago

https://hg.mozilla.org/mozilla-central/rev/123397342cd1

Armen [:armenzg]

Updated

•

10 years ago

Blocks: 1050756

Andrew Halberstadt [:ahal]

Assignee

Updated

•

9 years ago

Assignee: ahalberstadt → nobody

Status: ASSIGNED → NEW

Jonathan Griffin (:jgriffin)

Updated

•

8 years ago

Status: NEW → RESOLVED

Closed: 8 years ago

Resolution: --- → INCOMPLETE

Henrik Skupin [:whimboo][⌚️UTC+1]

Updated

•

6 years ago

Status: RESOLVED → REOPENED

Resolution: INCOMPLETE → ---

Sylvestre Ledru [:Sylvestre]

Updated

•

6 years ago

Keywords: leave-open

Whiteboard: [leave-open]

BugBot [:suhaib / :marco/ :calixte]

Comment 30

•

5 years ago

The leave-open keyword is there and there is no activity for 6 months.
:gbrown, maybe it's time to close this bug?

Flags: needinfo?(gbrown)

Geoff Brown [:gbrown]

Comment 31

•

5 years ago

:whimboo -- Did you want to follow-up here?

Flags: needinfo?(gbrown) → needinfo?(hskupin)

Henrik Skupin [:whimboo][⌚️UTC+1]

Comment 32

•

5 years ago

The workaround fix landed in comment 28 and comment 29. Since then no more work was done / necessary. So I would say lets close this bug as fixed, and if necessary file a new bug for a real fix. For now the Unicode work stalled anyway.

Assignee: nobody → ahal

Status: REOPENED → RESOLVED

Closed: 8 years ago → 5 years ago

Flags: needinfo?(hskupin)

Keywords: leave-open

Resolution: --- → FIXED

Version: unspecified → 32 Branch

You need to log in before you can comment on or make changes to this bug.