Closed Bug 652952 Opened 13 years ago Closed 13 years ago

jsbridge has issues with non-unicode strings with addons

Categories

(Testing Graveyard :: Mozmill, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: k0scist, Assigned: k0scist)

Details

(Whiteboard: [mozmill-2.0+])

Attachments

(3 files, 1 obsolete file)

Sending e.g. ASCII data over the jsbridge fails, bigtime:

Component returned failure code: 0x8050000e [nsIScriptableUnicodeConverter.ConvertToUnicode]

Yikes!  With such text as this: "DivX® Web Player"

I'm not exactly sure exactly where or how to fix it but....
Summary: jsbridge has issues with unicode → jsbridge has issues with non-unicode strings
Whiteboard: [mozmill-2.0?] → [mozmill-2.0+]
The error is actually when we send it over the bridge, specifically here: https://github.com/mozautomation/mozmill/blob/master/jsbridge/jsbridge/extension/resource/modules/nspr-server.js#L138

We assert that all we have is unicode.  Obviously, this is not the case :(

:ctalbert pointed me to https://developer.mozilla.org/en/Reading_textual_data . I'll investigate this.  The alternative seems to be character-set detection since, abict, we can't tell what charset e.g. addon strings are in.  This may be a different bug (e.g. a core bug) if they are supposed to be UTF-8. But in any case, we should be smart and not wait for a upstream fix.
Even catching the exception causes a problem. I wrap the offending line in a try-catch block

https://github.com/mozautomation/mozmill/blob/master/jsbridge/jsbridge/extension/resource/modules/nspr-server.js#L141

and return something innocuous.  JSBridge times out. There is no error in the error console. I don't know why it fails :(
An attempt to get this working.  I have tried to follow the instructions in https://developer.mozilla.org/en/Reading_textual_data but as is, I get the same failure code.  I have also just tried returning `_text` following the "conversion", but to no avail. Dumping the offending data, the offending glyph `®` appears, which makes me think the conversion failed. But....python seems to think otherwise?::

>>> foo = '®'
>>> foo
'\xc2\xae'
>>> print foo
®
>>> foo.decode('utf-8')
u'\xae'
>>> print foo.decode('utf-8')
®

I'm not really a unicode expert.  Will futz around more from here, but I feel pretty lost.  Another approach is to do charset detection (and obliterate the several places we assert UTF-8)
Assignee: nobody → jhammel
(In reply to comment #3)
> Created attachment 531397 [details] [diff] [review] [review]
> failing attempt to convert unicode lossily
> 
> An attempt to get this working.  I have tried to follow the instructions in
> https://developer.mozilla.org/en/Reading_textual_data but as is, I get the
> same failure code.  I have also just tried returning `_text` following the
> "conversion", but to no avail. 

No avail means here that mozmill just hangs. There are no errors in the browser. You eventually get a JS bridge timeout
Notably also, when giving a 0x0000 as from https://developer.mozilla.org/en/XPCOM_Interface_Reference/nsIConverterInputStream , none of the behaviour changes. The initial conversion (is.init(fis, charset, 1024, 0x0000); var str = {}; var _text = ''; while (is.readString(4096, str) != 0) { _text += str.value; } is.close()) goes fine, but the error remains on the line text text = converter.ConvertToUnicode(_text); . So something is behaving badly
Going to blow away my changes and try again.  Not having much progress.  Things I have tried:

* using Components.interfaces.nsIConverterInputStream to convert to utf-8 with a replacement character

* using Components.interfaces.nsIUTF8ConverterService

* casting to part of a URI and decoding that

* same as above but with variations on conversion

Worse, it doesn't seem you can try: catch this error, making the problem unpuntable.

I tried to ask a few developers how to do this.  When they were showed the nspr-server code, their first question was "Why do you want to do this?"  The only answer I could give was "I have no idea".  It is undocumented why we use toUnicode.  We should figure that out first, so I'll blow my changes away and try that direction but will keep the debug code I have going for future archaeologists.
Indeed without the toUnicode calls, the code just hangs.  Not sure why this is
(In reply to comment #7)
> Indeed without the toUnicode calls, the code just hangs.  Not sure why this
> is

Though it does run with addons = "null";
With some detective work, I figured out (at least one place) where the harness hangs if you try/catch inside of nspr-server.js:

Error: Failed to decode base64 string!
Source File: resource://gre/components/nsUrlClassifierLib.js
Line: 1193

Note that what I said before about not being able to try/catch is true for "client" code (i.e. mozmill.js)....but you can inside of nspr-server.js (and likely that entire side of the bridge). It is not leverageable exactly, though: since we pass strings instead of objects, we can't exactly serialize them piece-wise.  That is another approach to consider.
Attached patch fix the symptom (obsolete) — Splinter Review
So it turns out I was wrong about at least one thing: you *can* try/catch from mozmill.js. I had evidently messed up the harness so badly at that point that I was witnessing artefacts. 

This patch stringifies the addons list converting to (and from!) utf-8. Replacements are dealt with by eliminating non-overlap areas of US-ASCII and utf-8 (char codes below 32 or above 127).

There is still alot fishy that we're doing in jsbridge.  While I hope to understand it someday, I'd like to try to move the logic coded here for addons to nspr-server.js, including stringifying objects via JSON with this encoder in the short term and maybe revisit later, which should at least give us robustness. Assuming that's successful I'll post a revised patch tomorrow, or if not I'm prepared to land what I have to cover the one known leaking hole.
Attachment #532119 - Flags: feedback?(ctalbert)
Oops! Didn't mean to leave the dump statement in :) Obviously wouldn't land it with that.
Identical to the previous patch.  While JSBridge could definitely use some cleanup, after poking at it for a few hours, I'm guessing this is another 40+ hours to do properly, vs the less than a day as I expected.  So let's fix the one known issue -- the addons string -- and I'll reticket the other issues separately
Attachment #532119 - Attachment is obsolete: true
Attachment #532119 - Flags: feedback?(ctalbert)
Attachment #532325 - Flags: review?(ctalbert)
Comment on attachment 532325 [details] [diff] [review]
fix the symptom, cleaned

Glorious hack.  Please file the follow on bug for the JS interface for python to call to serialize a js object to a string and send it back to python (the long term fix for this).

i.e.
[js]
var addons = AddonList; // A full JS object
[python]
results.addons # JSObject
results.addons_string = results.addons.serialize()
# serialize calls a JS api that translates the given JSObject into a JSON string, and sends that back to python. This string is what serialize returns.
Attachment #532325 - Flags: review?(ctalbert) → review+
(In reply to comment #13)
> Comment on attachment 532325 [details] [diff] [review] [review]
> fix the symptom, cleaned
> 
> Glorious hack.  Please file the follow on bug for the JS interface for
> python to call to serialize a js object to a string and send it back to
> python (the long term fix for this).

See https://bugzilla.mozilla.org/show_bug.cgi?id=657022 
and https://bugzilla.mozilla.org/show_bug.cgi?id=657020

> i.e.
> [js]
> var addons = AddonList; // A full JS object
> [python]
> results.addons # JSObject
> results.addons_string = results.addons.serialize()
> # serialize calls a JS api that translates the given JSObject into a JSON
> string, and sends that back to python. This string is what serialize returns.

We don't want a string here.  Since results.addons gets JSON serialized, it should be a python object.  We want to serialize to python, so

results.addons = mozmill.addons.serialize() # or maybe .to_python(), somethin'
Summary: jsbridge has issues with non-unicode strings → jsbridge has issues with non-unicode strings with addons
pushed to master: https://github.com/mozautomation/mozmill/commit/37686683d95747a34fea95c0851aedd1f8134303
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Product: Testing → Testing Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: