Closed Bug 235230 Opened 20 years ago Closed 10 years ago

nsIX509Cert::windowTitle has the wrong IDL type

Categories

(Core :: XPConnect, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla32

People

(Reporter: jgmyers, Assigned: Cykesiopka)

References

Details

Attachments

(1 file, 1 obsolete file)

 
This bug derives from and blocks bug 234856.

I need a way for xpconnect to correctly convert UTF-8 in a (char *).

In security/manager/ssl/public/nsIX509Cert.idl, an interface marked frozen,
there is declared:

readonly attribute string windowTitle;

The value of this attribute is encoded in UTF-8, and has always been encoded in
UTF-8.  When called from JS, xpconnect converts the value as if it were encoded
in ISO-8859-1.

I see three possible solutions:

1) Modify xpconnect so that all string values are treated as being in UTF-8
instead of ISO-8859-1.

2) Extend xpconnect to permit marking particular string values as being in UTF-8
and then so mark the above attribute

3) Break the freeze guarantee, changing the type of windowTitle to an AUTF8String.
Blocks: 234856
Could you not define a wrapper interface that would provide the string as
AUTF8String? The implementation of that interface would just forward and know
that the string is UTF8 encoded and just pass it through? I don't know enough
about the code and problem at hand but thought it might be another possible
alternative. 
More straightforward to create an nsIX509Cert2 interface and abandon the old
one.  But that does nothing for any emedders that call the old interface.
> 3) Break the freeze guarantee, changing the type of windowTitle to an 
> AUTF8String.

Breaking frozen interfaces is not an option :-(


> More straightforward to create an nsIX509Cert2 interface and abandon the old
> one.  But that does nothing for any emedders that call the old interface.

See nsIGlobalHistory and nsIGlobalHistory2 for a similar story :-(

I would recommend making nsIX509Cert::windowTitle return an ASCII compatible
string even if that means truncating/munging the real UTF-8 value.  It is
probably better to do that than risk handing some other code a non-ASCII string
when it expects ASCII.  (e.g., someone might try to treat your ASCII valued
windowTitle as UTF-8.)

nsIX509Cert2 is probably the way to go.
(In reply to comment #1)
> I see three possible solutions:
> 
> 1) Modify xpconnect so that all string values are treated as being in UTF-8
> instead of ISO-8859-1.

What about:

1a) Modify xpconnect so that string values are treated as being in UTF-8, and
falling back to ISO-8859-1 if they turn out not to be valid UTF-8?
of course that would break a few valid ISO-8859-1 strings. Also, as embeddors
were mentioned, what if they already assume the string is latin1? they'd break
even for certificates that they could previously handle fine.
(In reply to comment #6)
> of course that would break a few valid ISO-8859-1 strings.

It's not easy to come up with valid Latin1 sequences that might occur in real
text and could also be valid UTF-8.  A few months ago on the Unicode list people
were trying to find examples, and the results were all rather contrived. I think
the best was NESTLɮ. For the whole string to be valid in both encodings you
also have to presuppose that it string contains one of these contrived
sequences, and not a single other character > 0x80.

> Also, as embeddors
> were mentioned, what if they already assume the string is latin1? they'd break
> even for certificates that they could previously handle fine.

I'm not sure what you mean here. Can you expand?
I'd vote for nsIX509Cert2, and let embedders deal with it. We need some new
methods for certs anyway (e.g. IsCACert()).
It's not as if the current implementation of the interface passes valid
ISO-8859-1 strings through the windowTitle interface.  The implementation
currently always passes UTF-8.  What is being asked for is a way to change the
existing frozen interface to match the existing implementation.
> I'm not sure what you mean here. Can you expand?

given comment 9 my example isn't an issue.
sorry, the two functions have different signatures, so all of a sudden you'd have functions which would probably crash. and since we're talking about c++, not js, you can't protect anyone from them.

we could create a new tagging bit, but we're very low on those which would tell xpconnect and similar that the char* should be treated as utf8. but we're very low on those bits and then suddenly the c++ side is at a disadvantage, it used to know that char* meant latin1 and utf8 was done via some flavor of ACString, now it wouldn't know that.

--
I was about to send this when I realized that in a conversation between plasticmillion, smontagu, shaver, and myself it of course was mentioned that technically AString and friends *don't* declare encoding so c++ code really doesn't know what to do with it. And of course, xpconnect happens to have two rules, one for DOMString and a different one for AString, which is why plasticmillion cares, the encoding stuff will byte him and everyone else no matter what, unless we give in and stop accepting shaver's claim that you don't need to know the type of the data. (or until we switch to some impractical method where you can pass bytearrays.)

Perhaps the solution for plasticmillion's problem is a bytearray property which maps to AString (c++) and string (js) with a promise that neither js nor xpconnect will do any conversions on tokens [if string would involve some conversions, then it'd have to be array instead].

Anyway, I'm headed off to sweden for the weekend to visit luser. I'll be back monday, so don't start too many fires or fireworks without me.
sorry, the two functions have different signatures, so all of a sudden you'd have functions which would probably crash. and since we're talking about c++, not js, you can't protect anyone from them.

we could create a new tagging bit, but we're very low on those which would tell xpconnect and similar that the char* should be treated as utf8. but we're very low on those bits and then suddenly the c++ side is at a disadvantage, it used to know that char* meant latin1 and utf8 was done via some flavor of ACString, now it wouldn't know that.

--
I was about to send this when I realized that in a conversation between plasticmillion, smontagu, shaver, and myself it of course was mentioned that technically AString and friends *don't* declare encoding so c++ code really doesn't know what to do with it. And of course, xpconnect happens to have two rules, one for DOMString and a different one for AString, which is why plasticmillion cares, the encoding stuff will byte him and everyone else no matter what, unless we give in and stop accepting shaver's claim that you don't need to know the type of the data. (or until we switch to some impractical method where you can pass bytearrays.)

Perhaps the solution for plasticmillion's problem is a bytearray property which maps to AString (c++) and string (js) with a promise that neither js nor xpconnect will do any conversions on tokens [if string would involve some conversions, then it'd have to be array instead].

Anyway, I'm headed off to sweden for the weekend to visit luser. I'll be back monday, so don't start too many fires or fireworks without me.
Timeless, I'm not sure that you accurately describe the situation. First of all AString is a 16-bit string that is guaranteed to be Unicode. The real issue is with 8-bit strings. The good news is that XPConnect appears to handle them fine as long as they are declared as AUTF8String. ACString (in IDL) is apparently an artefact and there doesn't seem to be any good reasont to use.

So I'm not adding anything new: the proposed solution to create a new version of the interface using AUTF8String seems correct to me.
Assignee: dbradley → nobody
QA Contact: pschwartau → xpconnect
Depends on: 643041
This is fixed by the patch I'm about to post in bug 643041.
Assignee: nobody → zackw
Status: NEW → ASSIGNED
Summary: Need support for utf8 in "string" → nsIX509Cert::windowTitle has the wrong IDL type
Hi zwol, do you still plan to work on this bug or Bug 643041? nsIX509Cert has been changed for Firefox 32, so it seems like it might be a good time to do any compat breaking stuff...
Flags: needinfo?(zackw)
The patch in bug 643041 would need to be dusted off, but it was a complete fix for the bug at the time.  I shelved it because I was told that nsIX509Cert could not be changed (on addon-compat grounds) even though it was wrong.  If that is no longer the case, great; unfortunately, my Mozilla-hacking time is extremely limited for the foreseeable future and there are some other things that have higher priority.  I would not stand in the way of someone else picking it up.  Note that the patch in 643401 does more than is strictly required to address this bug.
Flags: needinfo?(zackw)
(In reply to Zack Weinberg (:zwol) from comment #16)
> The patch in bug 643041 would need to be dusted off, but it was a complete
> fix for the bug at the time.  I shelved it because I was told that
> nsIX509Cert could not be changed (on addon-compat grounds) even though it
> was wrong.  If that is no longer the case, great; unfortunately, my
> Mozilla-hacking time is extremely limited for the foreseeable future and
> there are some other things that have higher priority.  I would not stand in
> the way of someone else picking it up.  Note that the patch in 643401 does
> more than is strictly required to address this bug.

Thanks!
I'll pick the relevant changes from the Bug 643041 patch for this bug then.
This patch is based on the patch posted in Bug 643041.
Attachment #8431364 - Flags: review?(dkeeler)
Comment on attachment 8431364 [details] [diff] [review]
bug235230_v1.patch; Original patch by Zack Weinberg

Review of attachment 8431364 [details] [diff] [review]:
-----------------------------------------------------------------

Great! r=me with comments addressed.

::: security/manager/ssl/public/nsIX509Cert.idl
@@ +127,5 @@
>  
>    /**
>     *  A human readable identifier to label this certificate.
>     */
> +  readonly attribute AString windowTitle;

I believe the uuid needs to be rev'd with this change.

::: security/manager/ssl/src/nsNSSCertificate.cpp
@@ +540,5 @@
> +
> +  if (mCert->nickname) {
> +    CopyUTF8toUTF16(mCert->nickname, aWindowTitle);
> +  } else {
> +    char* commonName = CERT_GetCommonName(&mCert->subject);

Using 'ScopedPtr<char, PORT_Free_string>' would be best here (try mozilla::psm::ScopedPtr, mozilla::psm::PORT_Free_string if it doesn't work without specifying the namespace)

@@ +542,5 @@
> +    CopyUTF8toUTF16(mCert->nickname, aWindowTitle);
> +  } else {
> +    char* commonName = CERT_GetCommonName(&mCert->subject);
> +    if (commonName) {
> +      CopyUTF8toUTF16(commonName, aWindowTitle);

It turns out, CERT_GetCommonName can return data that claims it is UTF-8 but isn't really. Looks like you can use IsUTF8 to verify the string before calling CopyUTF8toUTF16. We should probably do the same any time a UTF8-UTF16 conversion is made.

::: security/manager/ssl/src/nsNSSCertificateFakeTransport.cpp
@@ +45,5 @@
>    NS_NOTREACHED("Unimplemented on content process");
>    return NS_ERROR_NOT_IMPLEMENTED;
>  }
>  
>  /* readonly attribute string windowTitle; */

nit: might as well remove this comment
Attachment #8431364 - Flags: review?(dkeeler) → review+
(In reply to David Keeler (:keeler) [use needinfo?] from comment #19)
> I believe the uuid needs to be rev'd with this change.

Yes, silly omission on my part.

> It turns out, CERT_GetCommonName can return data that claims it is UTF-8 but
> isn't really. Looks like you can use IsUTF8 to verify the string before
> calling CopyUTF8toUTF16. We should probably do the same any time a
> UTF8-UTF16 conversion is made.

IRC chat expanding on this a bit:

21:10	Cykesiopka	keeler: wrt the isUTF8 comment in Bug 235230 - do you want me to fallback and assume ASCII if the string isn't UTF-8?
21:11	keeler	Cykesiopka: hmmm - I think we'll have the same kind of problem if it isn't ASCII either
21:11	keeler	why not just pretend that field doesn't exist if it isn't encoded properly?
21:12	keeler	(i.e. go on to subjectName and then email, and return an empty string if all else fails)
21:15	Cykesiopka	keeler: ok, sgtm
Assignee: zackw → cykesiopka.bmo
(In reply to David Keeler (:keeler) [use needinfo?] from comment #19)
> 
> It turns out, CERT_GetCommonName can return data that claims it is UTF-8 but
> isn't really. Looks like you can use IsUTF8 to verify the string before
> calling CopyUTF8toUTF16.

I was going to say that CopyUTF8toUTF16 will emit U+FFFD REPLACEMENT CHARACTER for each invalid UTF-8 sequence, but on digging into the code a little, it actually truncates at the first error.  Feh.  Filed bug 1018793.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/1137d4bcc40a
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: