Closed
Bug 887313
Opened 12 years ago
Closed 10 years ago
Starring bug 696306 causes an "missing low surrogate character in surrogate pair" error on tbpl
Categories
(Tree Management Graveyard :: TBPL, defect)
Tree Management Graveyard
TBPL
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: RyanVM, Unassigned)
Details
Attachments
(1 file)
37.03 KB,
patch
|
Details | Diff | Splinter Review |
When I star bug 696306 on tbpl (usually on b2g18), I get the following error at the top:
submitBugzillaComment.php error: Content-Type application/json had a problem with your request. ***ERROR*** missing low surrogate character in surrogate pair, at character offset 1971 ["dc00b, expected ab\n..."] at /usr/local/share/perl/5.10.0/Catalyst/Action/Deserialize/JSON.pm line 26, <$fh> line 1.
It's erroring out on the ± character in the bug summary.
the bzapi is throwing that error, not bugzilla. redirecting the needinfo to gerv.
Flags: needinfo?(gerv)
Comment 3•12 years ago
|
||
Glob found http://stackoverflow.com/q/4017976/953
Comment 4•12 years ago
|
||
This is dying in Deserialize, which means it doesn't like the JSON you are sending it. Is there any possibility of getting access to that? Then, we can make sure it's actually sending valid JSON. If it is, then I'll look at fixing BzAPI in some way.
Gerv
Flags: needinfo?(gerv)
Comment 5•12 years ago
|
||
I'm not sure if I can easily recreate it without finding somewhere where this test failure has occurred again.
Comment 6•12 years ago
|
||
Surely you can reproduce this by adding the ± character to the summary of a bug and starring it? So, file a test bug which is designed to match a common test failure, and add a ± character to its summary. Use a testing instance of tbpl if you don't want to mess with the production one.
Gerv
Reporter | ||
Comment 7•12 years ago
|
||
Just hit the error on this push (WinXP debug M4):
https://tbpl.mozilla.org/?tree=Mozilla-B2g18&rev=6107703a0cdc
I would think we could capture it on the dev instance of tbpl.
Comment 8•12 years ago
|
||
https://tbpl-dev.allizom.org/?tree=Mozilla-B2g18&rev=6107703a0cdc
Request URL: https://tbpl-dev.allizom.org/php/submitBugzillaComment.php
Request Method: POST
Request body:
{
id=696306&comment=edmorley%23tbpl-dev%0Ahttps%3A%2F%2Ftbpl.mozilla.org%2Fphp%2FgetParsedLog.php%3Fid%3D25240510%26tree%3DMozilla-B2g18%0ARev3+WINNT+5.1+mozilla-b2g18+debug+test+mochitest-4+on+2013-07-12+15%3A13%3A59%0Arevision%3A+6107703a0cdc%0Aslave%3A+talos-r3-xp-114%0A%0A119+ERROR+TEST-UNEXPECTED-FAIL+%7C+%2Ftests%2Fdom%2Ftests%2Fmochitest%2Fwhatwg%2Ftest_bug500328.html+%7C+uncaught+exception+-+NS_ERROR_FAILURE%3A+Component+returned+failure+code%3A+0x80004005+(NS_ERROR_FAILURE)+%5BnsIDOMHistory.pushState%5D+at+http%3A%2F%2Fmochi.test%3A8888%2Ftests%2Fdom%2Ftests%2Fmochitest%2Fwhatwg%2Ftest_bug500328.html%3A459%0A1433+ERROR+TEST-UNEXPECTED-FAIL+%7C+%2Ftests%2Feditor%2Flibeditor%2Fbase%2Ftests%2Ftest_bug586662.html+%7C+Test+timed+out.%0A1434+ERROR+TEST-UNEXPECTED-FAIL+%7C+%2Ftests%2Feditor%2Flibeditor%2Fbase%2Ftests%2Ftest_bug586662.html+%7C+%5BSimpleTest.finish()%5D+waitForFocus()+was+called+a+different+number+of+times+from+the+number+of+callbacks+run.++Maybe+the+test+terminated+prematurely+--+be+sure+to+use+SimpleTest.waitForExplicitFinish().+-+got+1%2C+expected+0%0A1476+ERROR+TEST-UNEXPECTED-FAIL+%7C+%2Ftests%2Feditor%2Flibeditor%2Fbase%2Ftests%2Ftest_bug795785.html+%7C+Test+timed+out.%0A1477+ERROR+TEST-UNEXPECTED-FAIL+%7C+%2Ftests%2Feditor%2Flibeditor%2Fbase%2Ftests%2Ftest_bug795785.html+%7C+%5BSimpleTest.finish()%5D+waitForFocus()+was+called+a+different+number+of+times+from+the+number+of+callbacks+run.++Maybe+the+test+terminated+prematurely+--+be+sure+to+use+SimpleTest.waitForExplicitFinish().+-+got+1%2C+expected+0%0A6355+ERROR+TEST-UNEXPECTED-FAIL+%7C+%2Ftests%2Feditor%2Flibeditor%2Fhtml%2Ftests%2Ftest_bug332636.html+%7C+The+backspace+key+should+delete+the+UTF-16+surrogate+pair+correctly+-+got+axb%2C+expected+ab%0A6356+ERROR+TEST-UNEXPECTED-FAIL+%7C+%2Ftests%2Feditor%2Flibeditor%2Fhtml%2Ftests%2Ftest_bug332636.html+%7C+The+backspace+key+should+delete+the+UTF-16+surrogate+pair+correctly+-+got+a%CC%88b%2C+expected+ab%0A6357+ERROR+TEST-UNEXPECTED-FAIL+%7C+%2Ftests%2Feditor%2Flibeditor%2Fhtml%2Ftests%2Ftest_bug332636.html+%7C+The+backspace+key+should+delete+the+UTF-16+surrogate+pair+correctly+-+got+a%F0%90%90%80b%2C+expected+ab%0A6358+ERROR+TEST-UNEXPECTED-FAIL+%7C+%2Ftests%2Feditor%2Flibeditor%2Fhtml%2Ftests%2Ftest_bug332636.html+%7C+The+backspace+key+should+delete+the+UTF-16+surrogate+pair+correctly+-+got+a%F0%90%A8%8Fb%2C+expected+ab%0A6415+ERROR+TEST-UNEXPECTED-FAIL+%7C+%2Ftests%2Feditor%2Flibeditor%2Fhtml%2Ftests%2Ftest_bug410986.html+%7C+Test+timed+out.%0A6416+ERROR+TEST-UNEXPECTED-FAIL+%7C+%2Ftests%2Feditor%2Flibeditor%2Fhtml%2Ftests%2Ftest_bug410986.html+%7C+%5BSimpleTest.finish()%5D+waitForFocus()+was+called+a+different+number+of+times+from+the+number+of+callbacks+run.++Maybe+the+test+terminated+prematurely+--+be+sure+to+use+SimpleTest.waitForExplicitFinish().+-+got+1%2C+expected+0%0A6422+ERROR+TEST-UNEXPECTED-FAIL+%7C+%2Ftests%2Feditor%2Flibeditor%2Fhtml%2Ftests%2Ftest_bug414526.html+%7C+Test+timed+out.%0A6423+ERROR+TEST-UNEXPECTED-FAIL+%7C+(SimpleTest%2FTestRunner.js)+%7C+4+test+timeouts%2C+giving+up.%0A6424+ERROR+TEST-UNEXPECTED-FAIL+%7C+(SimpleTest%2FTestRunner.js)+%7C+Skipping+466+remaining+tests.%0A6425+ERROR+TEST-UNEXPECTED-FAIL+%7C+(SimpleTest%2FTestRunner.js)+%7C+%5BSimpleTest.finish()%5D+waitForFocus()+was+called+a+different+number+of+times+from+the+number+of+callbacks+run.++Maybe+the+test+terminated+prematurely+--+be+sure+to+use+SimpleTest.waitForExplicitFinish().+-+got+1%2C+expected+0%0A6426+ERROR+TEST-UNEXPECTED-FAIL+%7C+(SimpleTest%2FTestRunner.js)+%7C+%2Ftests%2Feditor%2Flibeditor%2Fhtml%2Ftests%2Ftest_bug414526.html+finished+in+a+non-clean+fashion%2C+probably+because+it+didn't+call+SimpleTest.finish()
}
Response:
{
Content-Type application/json had a problem with your request. ***ERROR*** missing low surrogate character in surrogate pair, at character offset 1982 ["dc00b, expected ab\n..."] at /usr/local/share/perl/5.10.0/Catalyst/Action/Deserialize/JSON.pm line 26, <$fh> line 1.
}
I'll derive the bzapi call now.
Comment 9•12 years ago
|
||
Reduced:
POST request to https://tbpl-dev.allizom.org/php/submitBugzillaComment.php
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Body:
{
id=696306&comment=%F0%90%A8%8F
}
Will now try locally and annotate https://hg.mozilla.org/webtools/tbpl/file/991419b41511/php/submitBugzillaComment.php
Comment 10•12 years ago
|
||
At first glance those chars seem very odd to me as a UTF-8 sequence:
http://software.hixie.ch/utilities/cgi/unicode-decoder/utf8-decoder?encoding=hex&bytes=F090A88F
Gerv
Comment 11•12 years ago
|
||
POST request to:
https://api-dev.bugzilla.mozilla.org/latest/bug/696306/comment?username=tbplbot%40gmail.com&password=<snip>
Accept: application/json
Content-Type: application/json
Post body:
{"text":"\ud802de0f"}
Response:
Status 400
Response body:
Content-Type application/json had a problem with your request.
***ERROR***
missing low surrogate character in surrogate pair, at character offset 15 ["de0f"}"] at /usr/local/share/perl/5.10.0/Catalyst/Action/Deserialize/JSON.pm line 26, <$fh> line 1.
Comment 12•12 years ago
|
||
Should we be sanitizing that more, or should bzapi handle it?
Comment 13•12 years ago
|
||
You are sending invalid JSON - json.org says the only valid Unicode escapes in JSON are exactly 4 hex chars, and this one is 8. So that's wrong to start with. Secondly, decoding \ud802de0f makes no sense:
http://software.hixie.ch/utilities/cgi/unicode-decoder/utf8-decoder?encoding=hex&bytes=d802de0f
So something has got corrupted somewhere along the way. Note that the sequence there is _different_ to the one in the post to submitBugzillaComment.php, which is:
http://software.hixie.ch/utilities/cgi/unicode-decoder/utf8-decoder?encoding=hex&bytes=F090A88F
That one is at least a valid Unicode character, although a very strange one: U+10A0F KHAROSHTHI SIGN VISARGA. If you have a font for this, you are a better man than I am: "
Comment 14•12 years ago
|
||
I manually reduced the string to get that in comment 11, so guess I just chopped off too much.
Comment 15•12 years ago
|
||
Oops, Bugzilla chopped my comment at the unicode char :-( The rest of it read:
Looking closer, this seems to be a test related to odd Unicode so perhaps this is in fact the character expected. Still, it's getting mangled by submitBugzillaComment.php into something which is definitely not right.
Only a few characters _have_ to be escaped in JSON; perhaps it would be easier to send the chars directly?
Gerv
(But maybe that's not such a great idea with the Bugzilla Unicode Chopping Bug still outstanding...)
Comment 16•12 years ago
|
||
To reiterate the story so far: it is the KHAROSHTHI SIGN VISARGA that we're mangling. This character is showing up in the test failure - its appearance is while the test failed, after all.
jQuery.ajax is sending UTF-8 - and KHAROSHTHI... is F090A88F in UTF-8 as Gerv noted in comment 13.
It looks like TBPL's PHP back end is trying to re-encode as UTF-16, and KARAOSHTHI in UTF-16 is low/high 0xd802 0xde0f (http://www.fileformat.info/info/unicode/char/10a0f/index.htm)
>Post body:
>{"text":"\ud802de0f"}
It almost got there, but for a missing \u in the middle there.
It looks like http://hg.mozilla.org/webtools/tbpl/file/991419b41511/php/inc/JSON.php#l337 is producing the incorrect output, not inserting a \u between the surrogate pairs.
I guess we can fix this, and the other cases, but what version of PHP are we running on tbpl.m.o? If >= 5.2.0 we could switch to PHP's native JSON support, which one would presume is less buggy. (Certainly on my machine:
php -r 'echo json_encode(chr(0xf0) . chr (0x90) . chr(0xa8). chr(0x8f));'
produces the correct "\ud802\ude0f").
Comment 17•12 years ago
|
||
I was always in favor of going with the native php function. I think ehsan had some objections, can’t remember exactly.
Comment 18•12 years ago
|
||
I pushed http://hg.mozilla.org/webtools/tbpl/rev/999b3ed2b3a3 to ask tbpl-dev what version it was running, and backed it out after the cron job ran and I got my answer.
Turns out I didn't need to. A grep for JSON reveals we are already using PHPs native JSON support all over the place, with the exception of 2 files, one of which we obviously know is submitBugzillaComment.php. It's nonsensical to have these files using this buggy library file. Let's switch.
Attachment #777277 -
Flags: review?(emorley)
Comment 19•12 years ago
|
||
(In reply to Arpad Borsos (Swatinem) from comment #17)
> I was always in favor of going with the native php function. I think ehsan
> had some objections, can’t remember exactly.
Ah. Didn't see your comment until I posted the patch.
Comment 20•12 years ago
|
||
Attachment #777277 -
Flags: review?(emorley)
Comment 21•12 years ago
|
||
The native PHP JSON function at least used to be pretty broken, not sure if things have improved since three years ago (I have pretty much stopped following PHP development.) Specifically, please retest bug 561985 comment 2 at the very least.
Comment 22•12 years ago
|
||
Note that we can also fix the Services_JSON bug and upstream that! <http://pear.php.net/pepr/pepr-proposal-show.php?id=198>
Comment 23•12 years ago
|
||
:graememcc: do you need to ask someone to review your patch?
Gerv
Summary: Starring bug 696306 causes an error on tbpl → Starring bug 696306 causes an "missing low surrogate character in surrogate pair" error on tbpl
Comment 24•11 years ago
|
||
:graememcc: ping?
Gerv
Comment 25•11 years ago
|
||
(In reply to Gervase Markham [:gerv] from comment #23)
> :graememcc: do you need to ask someone to review your patch?
>
> Gerv
I cancelled the initial review request as I was unaware of the previous history (this approach has been tried before and caused problems). I haven't yet had a chance to perform the suggested in comment 21. I'll try to get to this as soon as I can post-Summit, and re-request review or obsolete as appropriate.
Comment 26•11 years ago
|
||
(Or bug 561985 comment 18 :-))
Assignee | ||
Updated•10 years ago
|
Product: Webtools → Tree Management
Comment 27•10 years ago
|
||
Wontfix since TBPL is being replaced by Treeherder.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
Assignee | ||
Updated•10 years ago
|
Product: Tree Management → Tree Management Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•