Closed Bug 1650915 Opened 7 months ago Closed 6 months ago

[mozdevice] get_file may fail with UnicodeDecodeError

Categories

(Testing :: Mozbase, defect, P1)

defect

Tracking

(firefox81 fixed)

RESOLVED FIXED
81 Branch
Tracking Status
firefox81 --- fixed

People

(Reporter: gbrown, Assigned: gbrown)

References

Details

Attachments

(1 file)

Trying to run mochitests in py3 with a wip patch, I hit:

Traceback (most recent call last):
  File "/home/gbrown/objdirs/x86_64/_tests/testing/mochitest/runtests.py", line 2954, in doTests
    e10s=options.e10s
  File "/home/gbrown/objdirs/x86_64/_tests/testing/mochitest/runtestsremote.py", line 321, in runApp
    ret, _ = self.automation.runApp(*args, **kwargs)
  File "/home/gbrown/objdirs/x86_64/_tests/testing/mochitest/remoteautomation.py", line 82, in runApp
    status = self.waitForFinish(timeout, maxTime)
  File "/home/gbrown/objdirs/x86_64/_tests/testing/mochitest/remoteautomation.py", line 132, in waitForFinish
    status = self.wait(timeout=maxTime, noOutputTimeout=timeout)
  File "/home/gbrown/objdirs/x86_64/_tests/testing/mochitest/remoteautomation.py", line 372, in wait
    hasOutput = self.read_stdout()
  File "/home/gbrown/objdirs/x86_64/_tests/testing/mochitest/remoteautomation.py", line 267, in read_stdout
    newLogContent = self.device.get_file(self.remoteLog, offset=self.stdoutlen)
  File "/home/gbrown/src/testing/mozbase/mozdevice/mozdevice/adb.py", line 2539, in get_file
    return six.ensure_str(tf2.read())
  File "/home/gbrown/src/third_party/python/six/six.py", line 899, in ensure_str
    s = s.decode(encoding, errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 0: invalid start byte

If we changed with io.open(tf.name, mode='rb') as tf2: to with io.open(tf.name, mode='r') as tf2: would that be sufficient?

(In reply to Bob Clary [:bc] from comment #1)

If we changed with io.open(tf.name, mode='rb') as tf2: to with io.open(tf.name, mode='r') as tf2: would that be sufficient?

I tried that and it didn't help -- I hit the same exception in the same place.

Bug 1642672 changed get_file to return ensure_str() of the new file content; that's consistent with other changes in that bug, but maybe we went too far? The mochitest log is being retrieved here, which is structured and contains an unusual sequence of bytes marking the separation of log records. Maybe it would be better (more flexible, though less convenient) for get_file to return bytes and make the caller responsible for decoding.

See Also: → 1642672

oh, yeah. :-( Man, I hate our logging. Sorry.

(In reply to Geoff Brown [:gbrown] from comment #3)

contains an unusual sequence of bytes marking the separation of log records

I was thinking of https://searchfox.org/mozilla-central/rev/91d82d7cbf05a71954dfa49d0e43824c7c973e62/testing/mochitest/tests/SimpleTest/TestRunner.js#258:

var LOG_DELIMITER = "\ue175\uee31\u2c32\uacbf";

When I log the bytes being passed to ensure_str in get_file(), I see cases like:

b's","js_source":"TestRunner.js"}\xee\x85\xb5\xee\xb8\xb1\xe2\xb0\xb2\xea\xb2\xbf\n\xee\x85\xb5\xee\xb8\xb1\xe2\xb0\xb2\xea\xb2\xbf{"action":"log","time":1594071679007,"thread":null,"pid":null,"source":"mochitest","level":"INFO","message":"Slowest: 4187ms - /tests/testing/mochitest/tests/Harness_sanity/test_sanity_waitForCondition.html","js_source":"TestRunner.js"}\xee\x85\xb5\xee\xb8\xb1\xe2\xb0\xb2\xea\xb2\xbf\n\xee\x85\xb5\xee\xb8\xb1\xe2\xb0\xb2\xea\xb2\xbf{"action":"log","time":1594071679008,"thread":null,"pid":null,"source":"mochitest","level":"INFO","message":"SimpleTest FINISHED","js_source":"TestRunner.js"}\xee\x85\xb5\xee\xb8\xb1\xe2\xb0\xb2\xea\xb2\xbf\n'

(which survives ensure_str), and:

b'\xb1\xe2\xb0\xb2\xea\xb2\xbf\n'

(which raises).

I am perturbed that I do not see the relationship between the logged bytes and LOG_DELIMITER, but maybe that's tangential...will look closer tomorrow.

(In reply to Geoff Brown [:gbrown] from comment #5)

I am perturbed that I do not see the relationship between the logged bytes and LOG_DELIMITER

LOG_DELIMITER.encode('utf-8') = \xee\x85\xb5\xee\xb8\xb1\xe2\xb0\xb2\xea\xb2\xbf

Make the caller responsible for any desired type conversion, so that binary data can be
retrieved. This just works in py2. 'mach gtest' is the only py3 caller, and it already calls
ensure_str on the returned file contents. mochitest/reftest (via remoteautomation.py) may
require additional consideration when they are converted to py3 (coming soon!?).

Pushed by gbrown@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7c01df7faa16
[mozdevice] Do no type conversion on file contents returned by get_file; r=bc
Status: NEW → RESOLVED
Closed: 6 months ago
Resolution: --- → FIXED
Target Milestone: --- → 81 Branch
Regressions: 1658679
You need to log in before you can comment on or make changes to this bug.