Rename xpidl's ACString to BinaryData, audit uses

NEW
Unassigned

Status

()

4 years ago
a year ago

People

(Reporter: ted, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

We have a bunch of stuff that takes "in ACString" parameters in xpidl:
https://dxr.mozilla.org/mozilla-central/search?q="in+ACString"+ext%3Aidl&case=true

This turned out to be the root cause of bug 1166759--a caller was passing a serialized JSON string into this method which contained non-ASCII. I'm not 100% sure on XPConnect's handling of the data here but it results in broken string data.

We shouldn't allow this. We should either remove ACString as an xpidl type (and switch everything using it to use AUTF8String), or make XPConnect throw if a scriptable caller passes non-ASCII data in an ACString parameter.
We've had bugs in telemetry triggered by this too.

Comment 2

4 years ago
The purpose of ACString was actually to pass around binary blobs, before we had anything like typed arrays. What you get is a JS string of octet values, basically.

Let's audit the places where ACString is still used to see whether they actually expect string or binary data? We could spend time on passing binary data as typed arrays instead of strings, but I expect the work would be significant and we should just be expedient about this. Renaming it from ACString to BinaryData might be enough to put off people from using it by accident.
The xpidl "string" type gives you const char*, which seems more useful for that purpose. Getting an nsACString that's binary data seems...bad.

Renaming it to BinaryData would be okay, I suppose. I agree that fixing up all the usage to do the right thing is kind of a losing battle.

(I have a pipe dream of replacing all our xpidl with WebIDL, but that's not going to happen any time soon.)
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #3)
> The xpidl "string" type gives you const char*, which seems more useful for
> that purpose. Getting an nsACString that's binary data seems...bad.

Manually managing const char*'s is super error-prone, we shouldn't do it.
 
> Renaming it to BinaryData would be okay, I suppose. I agree that fixing up
> all the usage to do the right thing is kind of a losing battle.

I think there's also a valid use case for a string type that throws for non-ascii characters (as opposed to doing UTF8 conversion). The best solution IMO would be to add a new one (BinaryData would have the current semantics of ACString, and ACString would have the semantics of throwing for non-ascii data). Not sure if it's worth the effort though.
 
> (I have a pipe dream of replacing all our xpidl with WebIDL, but that's not
> going to happen any time soon.)

Aren't we also concerned with the codesize implications for doing that on infrequently-used interfaces?
(In reply to Bobby Holley (:bholley) from comment #4)
> I think there's also a valid use case for a string type that throws for
> non-ascii characters (as opposed to doing UTF8 conversion). The best
> solution IMO would be to add a new one (BinaryData would have the current
> semantics of ACString, and ACString would have the semantics of throwing for
> non-ascii data). Not sure if it's worth the effort though.

If someone needs it in the future they can figure it out.

> > (I have a pipe dream of replacing all our xpidl with WebIDL, but that's not
> > going to happen any time soon.)
> 
> Aren't we also concerned with the codesize implications for doing that on
> infrequently-used interfaces?

xpidl is horrible and every time I have to do anything nontrivial with it I die a little bit on the inside. I'd trade some codesize for not dealing with that. Maybe we could figure out a way to be smarter here, like extending or rewriting xptcall to match WebIDL semantics? That discussion merits a separate bug.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #5)
> (In reply to Bobby Holley (:bholley) from comment #4)
> > I think there's also a valid use case for a string type that throws for
> > non-ascii characters (as opposed to doing UTF8 conversion). The best
> > solution IMO would be to add a new one (BinaryData would have the current
> > semantics of ACString, and ACString would have the semantics of throwing for
> > non-ascii data). Not sure if it's worth the effort though.
> 
> If someone needs it in the future they can figure it out.

My point is that this is a lot of the current usage, I think. Most of consumers could probably suck it up and deal with / ignore UTF8, but we probably need to look at each consumer individually rather than mass-converting them all to BinaryData.

> > Aren't we also concerned with the codesize implications for doing that on
> > infrequently-used interfaces?
> 
> xpidl is horrible and every time I have to do anything nontrivial with it I
> die a little bit on the inside. I'd trade some codesize for not dealing with
> that. Maybe we could figure out a way to be smarter here, like extending or
> rewriting xptcall to match WebIDL semantics? That discussion merits a
> separate bug.

There's a pretty big long-tail of stuff that works that's hooked up with XPIDL, much of which is infrequently-used. Switching everything to WebIDL is both a large human cost and a non-trivial codesize cost, so I don't think we should bother - just write new stuff with WebIDL if it feels appropriate.

There are also various hybrid strategies like defining a dictionary in WebIDL and passing it as a jsval to an XPIDL interface (I just did this for nsScriptSecurityManager, for example).
Just making ACString throw on non-ASCII data would break the intended use.

Using "string" would break on data with embedded null.

Renaming ACString to BinaryData and auditing all uses would make sense.

Note that we have a _ton_ of ACString in our xpidl.  :(  Some of these should be AUTF8String (e.g. things that are passing in URI strings, or most of the stuff hanging off nsIXULAppInfo).  But some (e.g. the readLine on nsILineInputStream) are actual binary data.
Okay, let's resummarize then. The plan of record would be:
1) Rename ACString -> BinaryData
2) Audit all uses to ensure that's what they meant
3) Change anything that was expecting string data to use AUTF8String (which should be an IDL-only change).

It'd be even nicer if we could rewrite the xpconnect bits to pass the renamed BinaryData as nsTArray<uint8_t>, but that's a lot more work.
Summary: We should remove ACString from xpidl, or fix it to throw on non-ASCII input → Rename xpidl's ACString to BinaryData, audit uses
You need to log in before you can comment on or make changes to this bug.