Closed
Bug 256785
Opened 21 years ago
Closed 7 years ago
conversion from unicode to |string| should be stricter
Categories
(Core :: XPConnect, defect)
Core
XPConnect
Tracking
()
RESOLVED
INACTIVE
People
(Reporter: darin.moz, Unassigned)
Details
conversion from unicode to |string| should be stricter.
currently, xpconnect happily truncates unicode characters to 8-bits when passing
through a |string| IDL parameter. this obviously results in data loss if the
caller passes non-ASCII values to xpconnect. however, more than just data loss,
this creates major trouble for a lot of code. take this example:
interface nsIFoo {
void doStuff(in string aURL);
};
now, imagine that nsIFoo is implemented in C++, like so:
NS_IMETHODIMP
nsFoo::DoStuff(const char *aURL)
{
nsCOMPtr<nsIURI> uri;
NS_NewURI(getter_AddRefs(uri), nsDependentCString(aURL));
...
}
in this code, lurks trouble. if DoStuff is ever passed a non-ASCII URL string
from JavaScript code, that URL string will be destructively truncated to 8-bits
and the result will be passed to NS_NewURI. NS_NewURI expects a UTF-8 input
string, but in this example, aURL is not valid UTF-8. currently, necko deals
with this problem by validating the input. that solution works for necko, but
in effect it means that every XPCOM API, which uses AUTF8String, would need
similar IsUTF8 checks.
i think we might be well-served by having xpconnect perform a more strict
conversion from unicode to |string|. xpconnect could truncate to 7-bits or
perhaps it should throw an exception if the input is invalid! ;-)
see bug 250900 for reference.
Comment 1•21 years ago
|
||
There's a more general problem. When one type is being converted to another
type and the conversion result is out of range of the destination type. For
instance an int containing -1 converted to an unsigned int. Or an int
containing 40000 converted to a short. Should those throw an exception? This
has been brought up before and something in the back of my mind
One issue to consider is the performance hit we'll take by either trimming
every string to 7 bits or checking every character to see if its out of range.
If we're going to take such a hit when strings cross JS/XPCOM boundaries it
makes me wonder if we wouldn't be better off with a common string
representation and provide the appropriate rendering of the string when needed.
I wonder if that existed if the number of conversions would actually go down.
Reporter | ||
Comment 2•21 years ago
|
||
Yeah, performance concerns are the real kicker.
> If we're going to take such a hit when strings cross JS/XPCOM boundaries it
> makes me wonder if we wouldn't be better off with a common string
> representation and provide the appropriate rendering of the string when
> needed.
Of course, this isn't an option for |string| :-(
Comment 3•21 years ago
|
||
Right, it's unfortunate that the implementation of "string" was exposed by XPCOM.
Ignoring performance, the question still remains should this be an exception or
should it "truncate" similar to what currently happens with numbers?
Reporter | ||
Comment 4•21 years ago
|
||
I think we should try throwing an exception. This may cause problems for people
using nsIOutputStream::write, which probably should not be a scriptable method
in the first place.
for the record, i'm already paying an incredible performance penalty with the
stuff i do, at times we are reading/writing a character at a time, and
triggering threadsafety warnings for each poke. -- I will eventually post some
classinfo patches to reuces the number of threadsafety warnings.
further, we use nsIOutputStream nsITransport.openOutputStream. it's marked as
scriptable, is not deprecated and for all i know we have no alternatives.
afaik our code is both intl safe and binary safe, as we send arbitrary images
through our code and they render correctly (and we have qa who are picky enough
to complain about images).
Comment 6•21 years ago
|
||
You may be paying the cost, but is the rest of Mozilla? The majority of the
browser code may be happily ignorant and injecting this overhead in such a well
travelled place could noticeably slow things down. Not to say that that code is
safe and/or correct.
I'll try and get some measurements, but if anyone else is able to put a patch
together or willing to do time trials that would be a great help.
Reporter | ||
Comment 8•21 years ago
|
||
timeless: it sounds to me as if you should be wrapping your nsIOutputStream
instance with a nsIBinaryOutputStream instance. then you would be able to
safely write binary data to the stream. using nsIOutputStream::write to send
anything other than ASCII from JS is a bad idea.
Updated•19 years ago
|
Assignee: dbradley → nobody
Updated•19 years ago
|
QA Contact: pschwartau → xpconnect
Comment 9•7 years ago
|
||
Per policy at https://wiki.mozilla.org/Bug_Triage/Projects/Bug_Handling/Bug_Husbandry#Inactive_Bugs. If this bug is not an enhancement request or a bug not present in a supported release of Firefox, then it may be reopened.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INACTIVE
You need to log in
before you can comment on or make changes to this bug.
Description
•