Closed Bug 1249004 Opened 7 years ago Closed 7 years ago

UnicodeDecodeError when attempting to push to MozReview repository


(MozReview Graveyard :: General, defect)

Not set


(Not tracked)



(Reporter: mconley, Assigned: gps)



(1 file)

:rpl hit this:

Are you sure you want to push to remote? (y/n): y
pushing to ssh://
searching for changes
remote: adding changesets                                                                                                                                                                      
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 1 changes to 1 files (+1 heads)
remote: recorded push in pushlog
submitting 1 changesets for review
remote: ** unknown exception encountered, please report by visiting
remote: **
remote: ** Python 2.7.10 (default, May 28 2015, 09:58:55) [GCC 4.4.7 20120313 (Red Hat 4.4.7-11)]
remote: ** Mercurial Distributed SCM (version 3.6.2)
remote: ** Extensions loaded: pushlog, pushlog-feed, buglink, serverlog, rbserver
remote: Traceback (most recent call last):
remote:   File "/repo/hg/venv_pash/bin/hg", line 43, in <module>
remote:   File "/repo/hg/venv_pash/lib64/python2.7/site-packages/mercurial/", line 54, in run
remote:     sys.exit((dispatch(request(sys.argv[1:])) or 0) & 255)
remote:   File "/repo/hg/venv_pash/lib64/python2.7/site-packages/mercurial/", line 116, in dispatch                                                                      [104/46443]
remote:     ret = _runcatch(req)
remote:   File "/repo/hg/venv_pash/lib64/python2.7/site-packages/mercurial/", line 187, in _runcatch
remote:     return _dispatch(req)
remote:   File "/repo/hg/venv_pash/lib64/python2.7/site-packages/mercurial/", line 920, in _dispatch
remote:     cmdpats, cmdoptions)
remote:   File "/repo/hg/venv_pash/lib64/python2.7/site-packages/mercurial/", line 679, in runcommand
remote:     ret = _runcommand(ui, options, cmd, d)
remote:   File "/repo/hg/venv_pash/lib64/python2.7/site-packages/mercurial/", line 1051, in _runcommand
remote:     return checkargs()
remote:   File "/repo/hg/venv_pash/lib64/python2.7/site-packages/mercurial/", line 1011, in checkargs
remote:     return cmdfunc()
remote:   File "/repo/hg/venv_pash/lib64/python2.7/site-packages/mercurial/", line 917, in <lambda>
remote:     d = lambda: util.checksignature(func)(ui, *args, **cmdoptions)
remote:   File "/repo/hg/venv_pash/lib64/python2.7/site-packages/mercurial/", line 801, in check
remote:     return func(*args, **kwargs)
remote:   File "/repo/hg/venv_pash/lib64/python2.7/site-packages/mercurial/", line 5919, in serve
remote:     s.serve_forever()
remote:   File "/repo/hg/version-control-tools/hgext/serverlog/", line 321, in serve_forever
remote:     return super(sshserverwrapped, self).serve_forever()
remote:   File "/repo/hg/venv_pash/lib64/python2.7/site-packages/mercurial/", line 103, in serve_forever
remote:     while self.serve_one():
remote:   File "/repo/hg/version-control-tools/hgext/serverlog/", line 350, in serve_one
remote:     return super(sshserverwrapped, self).serve_one()
remote:   File "/repo/hg/venv_pash/lib64/python2.7/site-packages/mercurial/", line 121, in serve_one
remote:     rsp = wireproto.dispatch(self.repo, self, cmd)
remote:   File "/repo/hg/version-control-tools/hgext/serverlog/", line 342, in dispatch
remote:     return origdispatch(repo, proto, cmd)
remote:   File "/repo/hg/venv_pash/lib64/python2.7/site-packages/mercurial/", line 471, in dispatch
remote:     return func(repo, proto, *args)
remote:   File "/repo/hg/version-control-tools/hgext/reviewboard/hgrb/", line 186, in pushreviewwireprotocommand
remote:     res = _processpushreview(repo, req, ldap_username)
remote:   File "/repo/hg/version-control-tools/hgext/reviewboard/hgrb/", line 332, in _processpushreview
remote:     username=bzusername, apikey=bzapikey)
remote:   File "/repo/hg/version-control-tools/hgext/reviewboard/hgrb/", line 96, in post_reviews
remote:     return submit_reviews(*args, **kwargs)
remote:   File "/repo/hg/version-control-tools/hgext/reviewboard/hgrb/", line 125, in submit_reviews
remote:     commits=json.dumps(commits, encoding='utf-8'))
remote:   File "/usr/lib64/python2.7/json/", line 243, in dumps
remote:     return _default_encoder.encode(obj)
remote:   File "/usr/lib64/python2.7/json/", line 207, in encode
remote:     chunks = self.iterencode(o, _one_shot=True)
remote:   File "/usr/lib64/python2.7/json/", line 270, in iterencode
remote:     return _iterencode(o, 0)
remote: UnicodeDecodeError: 'utf8' codec can't decode byte 0xff in position 584: invalid start byte
abort: unexpected response: empty string
Summary: UnicodeDecodeError when attempting to push to MozReview repositoriy → UnicodeDecodeError when attempting to push to MozReview repository
I need to know which changesets / bugs you were attempted to push to.
Flags: needinfo?(mconley)
Another report in IRC. Pretty sure or 1 of its 2 descendants broke it. Hopefully I can repro...
Hey, new hire here. I was trying to push a simple one-liner to understand how this whole MozReview thing works: and hit a wall.

I made some other attempt after the first one:
I can repro with 4bf2ecabfd98.
Flags: needinfo?(mconley)
Assignee: nobody → gps
Mercurial represents diffs as byte sequences (str in Python).

Python's json encoder (like JSON) likes speaking in terms of Unicode.
Under the hood, it is very aggressive about converting str to unicode
and vice-versa, often applying a default encoding of UTF-8.

Since diffs can contain arbitrary byte sequences that aren't valid
UTF-8, the JSON encoding of a diff could fail. This was actually
happening in the wild.

It is possible to coerce Python+JSON into representing arbitrary byte
sequences. But it requires a lot of hoop jumping on both ends (e.g.
ensure_ascii=False, which the docs recommend against). It is much
simpler to base64 encode byte sequences *before* they are JSON
encoded. So that's what this commit does.

The "diff" commit JSON key has been changed to "diff_b64" to reflect
base64 encoded values. The Mercurial server is the only consumer of
the batch submission API, so we don't have to worry about backwards
compatibility. So we've dropped support for the "diff" key.

While I was here, I noticed that the "parent_diff" key is no longer
used, so I removed support for it.

Tests demonstrating preservation of binary data have been added.

Because YAML encodes binary data, a new mach debug command was added to
dump the raw bytes from Review Board's stored diffs. This itself
required hacks because mach will transparently UTF-8 encode sys.stdout.
Oy. In the end, we show that the raw diff stored in Review Board
contains the byte sequences captured by the filesystem/Mercurial.

Encoding is hard.

Review commit:
See other reviews:
Attachment #8721581 - Flags: review?(smacleod)
Product: Developer Services → MozReview
Comment on attachment 8721581 [details]
MozReview Request: mozreview: encode diffs as base64 over JSON API (bug 1249004); r?smacleod
Attachment #8721581 - Flags: review?(smacleod) → review+
Autolanded. Will get deployed soonish, hopefully.
Closed: 7 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.