Implement nsAUTF8String and nsUTF8String

RESOLVED WONTFIX

Status

()

Core
String
RESOLVED WONTFIX
16 years ago
9 years ago

People

(Reporter: Nisheeth Ranjan, Assigned: jag (Peter Annema))

Tracking

({embed})

Trunk
Future
embed
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [adt3])

(Reporter)

Description

16 years ago
You know what I mean!  :-)
(Reporter)

Comment 1

16 years ago
When you implement this, please make sure that you go into xpcconvert.cpp and
fix up the places where we are currently doing extra string copying.

Thanks!

Updated

16 years ago
OS: Windows 2000 → All
Hardware: PC → All
(Assignee)

Updated

16 years ago
Keywords: nsbeta1, topembed
Target Milestone: --- → mozilla1.0

Comment 2

16 years ago
Nav triage team needs info: Why is this needed for MachV? (we don't know what
you mean ;-))
Whiteboard: [need info]
(Reporter)

Comment 3

16 years ago
At a high level, this is needed to complete out the implementation of new string
types in XPIDL that got checked in the 0.9.9 milestone.  The current situation
is that the AUTF8String type in XPIDL (used to store strings encoded in the UTF8
encoding) uses a simple C string class for storage which does not enforce UTF8
encoding on its contents.  The implementation of nsUTF8String is needed so that
the already added AUTF8String XPIDL type can start using the nsUTF8String
concrete class for storage of UTF8 strings.

Darin, I'm not sure about this, but don't your nsIURI related changes also
depend on a nsUTF8String implementation?  If you do depend on the nsUTF8String
implementation, then that is another reason why this is needed for Mach V
because your changes have also been checked in to the 0.9.9 milestone.

CCing jband, who will probably argue much more eloquently about why this should
be allowed into Mach V and Mozilla 1.0.
(Reporter)

Updated

16 years ago
Blocks: 129613

Comment 4

16 years ago
nisheeth: right, the URL changes I made would need to be fixed-up once the
string code provides UTF8 string class(es).

required abstract classes:

nsAUTF8String
nsASingleFragmentUTF8String
nsAFlatUTF8String

and we'd of course need an impl "nsUTF8String"... but unless we want to penalize
performance, we need a stack based version of this.

and the classes would need to be able to act like nsACString, etc. in some cases
to avoid extra string copies.  and then there's the conversion classes (e.g.,
NS_ConvertUTF8toUCS2)... we need to deal w/ those as well.

Comment 5

16 years ago
Does that mean that internationalized URLs (and strings passed around
internally) are broken?  What is the end user impact?  Thanks.

Comment 6

16 years ago
trudelle:

the impact of this bug isn't really measurable.  however, clearly without strong
type safety, there may be many nasty i18n bugs lurking around.  most would agree
that such bugs are very likely given the way we currently abuse |char*| and
|nsCString|.

post mozilla 1.0 it will be next to impossible to add these UTF-8 string
classes.  so IMO, we really shouldn't miss this opportunity to get help from the
compiler to prevent mismatched string encodings and related i18n bugs.
embed triage: topembed-, not critical for immediate embedding needs. Adding
embed keyword since this impacts the usage of the embedding API's.
Keywords: topembed → embed, topembed-
Cc'ing valeski to appeal the topembed-.  Marking mozilla1.0+.

/be
Keywords: mozilla1.0+
Marking topembed+.
Keywords: embed, topembed- → topembed+

Comment 10

16 years ago
nsbeta1+ per Nav triage team, adt3
Keywords: nsbeta1 → nsbeta1+
Whiteboard: [need info] → [need info][adt3]

Updated

16 years ago
Whiteboard: [need info][adt3] → [adt3]

Comment 11

16 years ago
Mass moving nsbeta1+/adt3 bugs assigned to Navigator team engineers out to
target milestone 1.0.1.  Please confine your attentions to driving down our list
of TM 1.0 bugs for beta.  Better to help, debug, or test one of them than fix
one of these.
Target Milestone: mozilla1.0 → mozilla1.0.1

Comment 12

16 years ago
Changing nsbeta1+ [adt3] bugs to nsbeta1- on behalf of the adt.  If you have any
questions about this, please email adt@netscape.com.  You can search for
"changing adt3 bugs" to quickly find and delete these bug mails.
Keywords: nsbeta1-

Comment 13

16 years ago
Changing nsbeta1+ [adt3] bugs to nsbeta1- on behalf of the adt.  If you have any
questions about this, please email adt@netscape.com.  You can search for
"changing adt3 bugs" to quickly find and delete these bug mails.
Keywords: nsbeta1+

Updated

16 years ago
Blocks: 125389
(Assignee)

Comment 14

16 years ago
-> future
Target Milestone: mozilla1.0.1 → Future

Comment 15

15 years ago
MOving KW to embed as no immediate customer need (per edt)
Keywords: topembed+ → embed
As a small step in this direction, could we typedef nsAUTF8String as nsACString
so that C++ code can use it for argument types?  That should make it clearer
what functions are expecting as input, at least...

Comment 17

14 years ago
> As a small step in this direction, could we typedef nsAUTF8String as nsACString

I agree 100%... patches welcome ;-)
So what would we need exactly?  Changes to nsStringAPI, nsStringFwd, and
nsrootidl, and a change to xpidl to generate nsAUTF8String headers for AUTF8String?
QA Contact: jag → string

Updated

9 years ago
Status: NEW → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → WONTFIX
I still think we should fix this....  We've had numerous bugs where people stuffed UTF-8 into non-UTF-8 strings and vice versa because on the C++ API level you can't tell them apart.

Comment 20

9 years ago
I don't think we should fix this unless we have a scheme to actually be completely correct and then use compilers and static analysis to enforce certain invariants. I don't think the reward of that is sufficient to actually devote the necessary effort to this bug.
(In reply to comment #20)
> I don't think we should fix this unless we have a scheme to actually be
> completely correct

Isn't this an instance of allowing the perfect to be the enemy of the good?

Comment 22

9 years ago
I'm saying that partial utf8string classes wouldn't be good, just confusing.
That seems like it would depend heavily on what was implemented and what wasn't.
(In reply to comment #23)
> That seems like it would depend heavily on what was implemented and what
> wasn't.

Examples? I find Benjamin's scenario -- developers thinking that they're getting some protection from using nsAUTF8String, but not actually getting it because of an infelicity underneath them -- plausible, but I would find examples of where we would use the classes pretty interesting too, since you seem to have some such in mind.
You need to log in before you can comment on or make changes to this bug.