Closed Bug 206379 Opened 21 years ago Closed 21 years ago

Stop using nsIAtom in nsICharsetConverterManager

Categories

(Core :: Internationalization, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: alecf, Assigned: alecf)

References

Details

(Keywords: memory-footprint)

Attachments

(6 files, 7 obsolete files)

360.30 KB, patch
smontagu
: review+
sfraser_bugs
: superreview+
Details | Diff | Splinter Review
3.24 KB, patch
Details | Diff | Splinter Review
2.34 KB, patch
Details | Diff | Splinter Review
2.86 KB, patch
Details | Diff | Splinter Review
7.55 KB, patch
Details | Diff | Splinter Review
958 bytes, patch
alecf
: review+
bryner
: superreview+
Details | Diff | Splinter Review
I have decided to finally take on the enormous task of getting rid of the use of
atoms from nsICharsetConverterManager.

as a part of this, I am also eliminating the requirement to call
GetCharsetAtom() - it is rediculous that every call to GetEncoder/GetDecoder has
to be prefixed by a call to GetCharsetAtom() in order to do alias resolution.
instead, GetEncoder/GetDecoder should just automatically do alias resolution.

As a part of this, I'm making sure that the argument to GetEncoder/GetDecoder
takes a IDL "string" - a const char* since charset names are by definition ASCII... 

the ripple effects of these changes are pretty extensive but 90% of the code
this affects is simplified due to reduced string conversion, reduced alias
resolution, and so forth. Overall this should result in some much simpler
codepaths and a generally smaller footprint.
a few other changes:
- removing the old nsICharsetConverterManager, and renaming
nsICharsetConverterManager2 over to nsICharsetConverterManager.
- within intl, this resulted in interface changes in nsIPlatformCharset,
nsIConverterInputStream, nsICharsetAlias, nsIScriptableUConv, and nsIPosixLocale
- all mostly simplification
*whew* - I finally finished this. its going to be a beast!

Well, I'm almost finished - I've done the work on linux and windows.. next comes
mac and then we gotta notify the ports due to impacts in gfx/widget

ok, here's the complete patch. Whew.
Comment on attachment 124590 [details] [diff] [review]
remove nsIAtom from most of the API

Requesting reviews from Simon and Blizzard. This is a huge patch I know but it
reallys simplifies some code and thus is a great footprint win.

for my own planning I'd REALLY like to get some ETAs on this.. if you're
super-busy, please suggest an alternate reviewer - blizzard, I don't know who
else does i18n super-reviews?

In my next comment, I'll explain all aspects of the bug.
Attachment #124590 - Flags: superreview?(blizzard)
Attachment #124590 - Flags: review?(smontagu)
ok, here's what's going on here... for the reviewers:

1) I got rid of nsICharsetConverterManager2 and rolled it into
nsICharsetConverterManager, so there is now a single interface for this stuff,
and it is in IDL.

2)
I got rid of all this silly atom stuff. the atoms were really just being used as
a cheesy way to force people to go through nsICharsetAlias. In the old world,
you used nsICharsetConverterManager2::GetCharsetAtom() to convert an charset
string like "x-sjis" to its equivalent, like "Shift_JIS" and then THAT string
would be converted to an atom, that you'd pass to
GetUnicodeEncoder/GetUnicodeDecoder. 

This was just silly. There is really no value in using atoms as the charset
strings were not used in any hashtable, and the atoms were often short-lived. it
meant creating this temporary object on the heap (and storing it in the atom
table) using it once, and then usually destroying it immediately afterwards.

Now, GetUnicodeEncoder/Decoder goes through nsICharsetAlias every time - there
was not a single consumer that wouldn't have wanted to go through the alias
resolver.

Since I didn't need Atoms anymore, I switched over to just using strings. For
absolute simplicity, I used the IDL type "string" because these strings are
usually VERY short (less than 16 characters) and ALWAYS ASCII. By definition,
charset names cannot be encoded in any other charset. The advantages of using
nsACString& were outweighed by the overhead of creating an nsDependentCString. I
wrestled with this decision quite a bit, but in the end I think "string" is
really the right thing here.

3) This also meant simplifying other methods in nsICharsetConverterManager to
take ASCII strings rather than atoms. The one exception was
getCharsetLangGroup() - this should eventually be changed to use raw strings,
but that method doesn't really touch the code I was changing. I'd rather do that
in a seperate patch (this one is big enough!)

4) Some of this conversion had a rippling effect through the i18n APIs - which
included switching lots of i18n APIs over to using nsACString& or "string" - I
tried to simplify where it made sense, and avoid unnecessarily changing excess
code. I hope the reviewers will bear with me when they see calls to
NS_LossyConvertUCS2toASCII().

5) And of course, there are all the changes to the rest of the tree. In some
cases it was as simple as changing the #include's from
nsICharsetConverterManager2.h to nsICharsetConverterManager.h, and in other
cases it involved reworking of existing APIs elsewhere in the tree. I almost
every case it resulted in simpler code, and almost always faster because of less
UCS2/ASCII/UTF8/etc conversion of charset names.

6) Finally, I revved the IID's in all the interfaces I changed, except for
mailnews - revving IIDs was a huge pain and I really ran out of steam by the
time I got to mailnews. My guess is that there aren't really any
plugins/add-on's that use the mail APIs, or at least the ones I changed, which
were more low level (nsIMessenger, nsIMailNewsUrl, etc)

*whew*

Status: NEW → ASSIGNED
Comment on attachment 124590 [details] [diff] [review]
remove nsIAtom from most of the API

I'm just remembering that simon (fraser) also does reviews, so I'm asking him
as well.

Oh two other things:
- for anyone reading this bug or wants to help out, I can certainly use the
review - even the nitpicking reviews will lighten the load for simon and simon.
- we'll need ports work done here - you just need to apply the patch and then
start building. I'll try to spread the word.
Attachment #124590 - Flags: superreview?(blizzard) → superreview?(sfraser)
Alec, the code in nsCSSLoader no longer needs to resolve charset aliases with
this change, does it?
Blocks: 161814
I tried building this on linux w/gtk2 and i'm seeing some bustage:

/home/dwitte/builds/scratch/mozilla/widget/src/gtk2/nsClipboard.cpp: In
   function `void ConvertHTMLtoUCS2(guchar*, int, PRUnichar**, PRInt32&)':
/home/dwitte/builds/scratch/mozilla/widget/src/gtk2/nsClipboard.cpp:616: error: no
   matching function for call to `nsDerivedSafe<nsICharsetConverterManager>::
   GetUnicodeDecoder(nsAutoString*, nsGetterAddRefs<nsIUnicodeDecoder>)'
../../../dist/include/uconv/nsICharsetConverterManager.h:75: error: candidates
   are: virtual nsresult nsICharsetConverterManager::GetUnicodeDecoder(const
   char*, nsIUnicodeDecoder**)
gmake[5]: *** [nsClipboard.o] Error 1

throwing in an NS_LossyConvertUCS2toASCII().get() fixed that problem, and then:

/home/dwitte/builds/scratch/mozilla/xpfe/components/search/src/nsInternetSearchService.cpp:
In
   member function `const char* const
   InternetSearchDataSource::MapScriptCodeToCharsetName(unsigned int)':
/home/dwitte/builds/scratch/mozilla/xpfe/components/search/src/nsInternetSearchService.cpp:3684:
error: brace-enclosed
   initializer used to initialize `const char* const'
[repeated a bunch of times, same line#, for all variables in that initializer block]

I gave up at that point - ping me sometime if you'd like me to do another
build/smoketest run (or, if you can't reproduce, I can send you my .mozconfig).
Comment on attachment 124590 [details] [diff] [review]
remove nsIAtom from most of the API

Firstly, thanks a lot for fixing all these cases. It must have been really
tedious.

>  there was not a single consumer that wouldn't have wanted to go through 
> the alias resolver.

 It's strange that you reached that conclusion after fixing all this.
 At the beginning, I thought similarly (although I suspected that there might
 be a few exceptions).	After going through the patch, I concluded that  
 what I had thought  of as an exception (no need to resolve aliases) is 
 actually the norm (at least for |GetUnicodeEncoder|.). Given this, 
 I think we're better off  with, in nsICharsetConverterManager, 

  nsIUnicode(En|De)coder  GetUnicode(En|De)coder(in string, in bool)

  than with 

  nsIUnicode(En|De)coder  GetUnicode(En|De)coder(in string)


>Index: extensions/spellcheck/src/nsSpellCheckUtils.cpp
....
>@@ -112,7 +112,7 @@
> 
>   nsresult rv;
> 
>-  nsCOMPtr <nsICharsetConverterManager2> ccm2 = do_GetService(NS_CHARSETCONVERTERMANAGER_CONTRACTID, &rv);
>+  nsCOMPtr <nsICharsetConverterManager> ccm2 = do_GetService(NS_CHARSETCONVERTERMANAGER_CONTRACTID, &rv);

  Nit. Why don't you just drop '2' in ccm2? There are several other cases like 
this.

>   NS_ENSURE_SUCCESS(rv, rv);
> 
>   nsCOMPtr <nsIAtom> charsetAtom;

 Here GetUnicode(En|De)coders are still called the old way with charsetAtom.

>Index: extensions/xmlextras/base/src/nsXMLHttpRequest.cpp
>   nsCOMPtr<nsIUnicodeDecoder> decoder;
>-  rv = ccm->GetUnicodeDecoder(&dataCharset,getter_AddRefs(decoder));
>+  rv = ccm->GetUnicodeDecoder(NS_LossyConvertUCS2toASCII(dataCharset).get(),
>+                              getter_AddRefs(decoder));

How about keeping GetUnicode(En|De)coder(nsAString &,...) so that we don't have
to call NS_LossyConvertUCS2toASCII(...).get() on many occasions like the 
above?

>Index: intl/locale/src/unix/nsCollationUnix.cpp
>@@ -165,7 +161,7 @@
>       PRUnichar* mappedCharset = NULL;
>       res = platformCharset->GetDefaultCharsetForLocale(aLocale.get(), &mappedCharset);
>       if (NS_SUCCEEDED(res) && mappedCharset) {
>-        mCollation->SetCharset(mappedCharset);
>+        mCollation->SetCharset(NS_LossyConvertUCS2toASCII(mappedCharset).get());
>         nsMemory::Free(mappedCharset);

  Wouldn't it be better to use nsXPIDLString for mappedCharset 
(in nsCollationXXX.cpp)?  While we're at it, we may as well want to change 
|GetDefaultCharsetForLocale| to take 'char **' instead of 'PRUnichar **' as 
the 2nd argument?  It's only invoked in half dozen places (nsCollationXXX.cpp
and nsDateTimeFormatXXX.cpp) and defined  in nsXXXCharset.cpp 
where |MapToCharset| and |GetDefaultCharsetForLocale| are redefined anyway...

>Index: intl/uconv/src/nsPlatformCharset.h
>+  nsresult MapToCharset(short script, short region, nsACString& outCharset); 
>+  nsresult MapToCharset(nsAString& inANSICodePage, nsACString& outCharset);

>Index: intl/uconv/idl/nsICharsetConverterManager.idl
>+%{ C++
>+#include "nsIUnicodeDecoder.h"
>+#include "nsIUnicodeEncoder.h"
>+#include "nsString.h"
>+
>+
>+// XXX change to NS_CHARSETCONVERTERMANAGER_CID
>+#define NS_ICHARSETCONVERTERMANAGER_CID \
>+  {0x3c1c0163, 0x9bd0, 0x11d3, { 0x9d, 0x9, 0x0, 0x50, 0x4, 0x0, 0x7, 0xb2}}
>+
>+// XXX change to NS_CHARSETCONVERTERMANAGER_PID
>+#define NS_CHARSETCONVERTERMANAGER_CONTRACTID "@mozilla.org/charset-converter-manager;1"
>+%}

 Do you care to make a new file nsUConvCID.h, put two defines there and include
 it wherever nsICharsetConverterManager.h is included? Sorry it's more work, 
 but you're touching all files including 'nsICharsetConverterManage*.h' 
 anyway...  Then, my patch for bug 162765 (bug 162765 comment #32) can be a 
 bit smaller :-)


>Index: intl/uconv/public/MANIFEST

 Isn't this only for Mac build with Codewarrior(?) ? Now that we switched
 over to gmake-based build on Mac, I guess we don't have to worry about this
 any more.

>Index: content/html/style/src/nsCSSLoader.cpp
>+static nsresult ResolveCharset(const nsACString& aCharsetAlias,
>+                               nsACString& aCharset)

bz>the code in nsCSSLoader no longer needs to resolve charset aliases with
bz>this change, does it?

 I guess it depends on what we want/need to store in |mCharset|, the 
canonical resolved name or 'raw value'?  


>Index: gfx/src/freetype/nsFreeType.cpp
>+    nsresult res;
>+    res = charSetManager->GetUnicodeEncoder(fei->mConverterName, &fei->mConverter);

Under gfx/src, there are a few places where skipping the alias 
resolution is desirable because 'mConverterName' (charset) is guaranteed 
(charsets are not coming from ext. source but are hard-coded in 
nsFontMetrics(Xlib|Gtk)  or properties files
to match one of font encoding names (e.g. sun.unicode.india-0)	and going 
through the alias resolution (that is not necessary) might have some perf. 
implication. Would it be a bad idea to add an optional third argument to 
|GetUnicodeEncoder| indicate the alias resolution has to be skipped? 
|GetUnicodeDecoder| does not need this facility.  Ah.... I realized that
we can't have an optional argument in idl (can we?).. Do we want a second
method (GetUnicodeEncoderNoResolution) ? 

>Index: gfx/src/gtk/nsFontMetricsGTK.cpp
>+  // used NS_NewAtom before??
>+  nsIUnicodeEncoder* converter = nsnull;
>+  res = gCharSetManager->GetUnicodeEncoder(aSelf->mCharSet, &converter);
>+  if (NS_SUCCEEDED(res)) {

  This is an example of what I wrote about above. nsFontMetricsGTK.cpp
is skipping the alias resolution and invokes GetUnicodeEncoder(string,...)
directly because there's no need to resolve charset aliases.


>-  nsCOMPtr<nsIAtom> charset;
>-  nsresult res = gCharSetManager->GetCharsetAtom2(aCharSetInfo->mCharSet,
>-                                             getter_AddRefs(charset));
>-  if (NS_SUCCEEDED(res)) {
>-    res = gCharSetManager->GetCharsetLangGroup(charset,
>+  nsresult res;
>+  
>+  res = gCharSetManager->GetCharsetLangGroup(aCharSetInfo->mCharSet,
>                                              &aCharSetInfo->mLangGroup);

  What I wrote about GetUnicodeEncoder applies to GetCharsetLangGroup
  as well.


>Index: gfx/src/mac/nsMacUnicodeFontInfo.cpp
> // This function uses the charset converter manager (CCM) to get a pointer on 
>@@ -543,22 +547,18 @@
>-  nsCOMPtr<nsIAtom> charset;
>-  rv = gCharsetManager->GetCharsetAtom(value.get(), getter_AddRefs(charset));
>-  if (NS_FAILED(rv)) return rv;
>-
>-  rv = gCharsetManager->GetUnicodeEncoder(charset, aConverter);
>+  rv = gCharsetManager->GetUnicodeEncoder(value.get(), aConverter);

  Here again, we don't want/need the charset alias resolution. We're getting
charset names from maccharset.properties that are already in the canonical
form. 

>Index: gfx/src/ps/nsPostScriptObj.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/gfx/src/ps/nsPostScriptObj.cpp,v
>@@ -3076,21 +3076,11 @@

>     if (psnativecode) {
....
>+      nsCOMPtr<nsICharsetConverterManager> ccMain =
>+        do_GetService(kCharsetConverterManagerCID, &res);
>+      
>+      if (NS_SUCCEEDED(res)) {
>+        res = ccMain->GetUnicodeEncoder(psnativecode.get(), &linfo->mEncoder);

  I have to check, but it seems like we could have avoided the alias
  resolution because 'psnativecode' also comes from one of properties/pref.
  file as long as end-users can be 'disciplined' to use the canonical 
  charset names only when they edit them. If that's the case, this is another
  case where we don't want/need the alias resolution.


>Index: gfx/src/windows/nsFontMetricsWin.cpp
> // Helper to determine if a font has a private encoding that we know something about
> static nsresult
>-GetEncoding(const char* aFontName, nsString& aValue)
>+GetEncoding(const char* aFontName, nsCString& aValue)

>+  if (gFontEncodingProperties) {
>@@ -1299,13 +1302,11 @@

>+GetConverterCommon(const char* aEncoding, nsIUnicodeEncoder** aConverter)
> {
>-  if (NS_FAILED(rv)) return rv;
>-  rv = gCharsetManager->GetUnicodeEncoder(charset, aConverter);
>+  nsresult rv;
>+  rv = gCharsetManager->GetUnicodeEncoder(aEncoding, aConverter);

  Yet another case :-). fontEncoding.properties file is supposed to
  use the canonical names only. Actually, as is the case of X11 font
  encoding names, 'charset' names in fontEncoding.properties don't
  have (are NOt supposed to have) any alias in charsetalias.properties.

>-      rv = gCharsetManager->GetUnicodeEncoder(charset, &gUserDefinedConverter);
>+      rv = gCharsetManager->GetUnicodeEncoder("x-user-defined", &gUserDefinedConverter);

  Another case :-) The old code should not have  used 'charsetAtom' but instead
  should have directly invoked GetUnicodeEncoder("x-user-defined",....)
  as in the following code in nsFontMetricsXlib.cpp

>Index: gfx/src/xlib/nsFontMetricsXlib.cpp
>-        res = mFontMetricsContext->mCharSetManager->GetUnicodeEncoder(charset, &ud_conv);
>+        res = mFontMetricsContext->mCharSetManager->GetUnicodeEncoder("x-user-defined", &ud_conv);

>@@ -2539,17 +2530,15 @@
>+        nsresult res;
>+        // used to use NS_NewAtom??
>         nsCOMPtr<nsIUnicodeEncoder> converter;
>-        res = aFmctx->mCharSetManager->GetUnicodeEncoder(charset,
>+        res = aFmctx->mCharSetManager->GetUnicodeEncoder(aEntry->mInfo->mCharSet,

  the same case as in nsFontMetfics(GTK|Win).cpp, nsFreetype?.cpp,
  and my patch for bug 176290...

>@@ -2580,64 +2569,55 @@
>+  nsIUnicodeEncoder* converter = nsnull;
>+  res = aFmctx->mCharSetManager->GetUnicodeEncoder(aSelf->mCharSet, &converter);

  ditto...

>@@ -4040,18 +4020,15 @@
>+  res = aFmctx->mCharSetManager->GetCharsetLangGroup(aCharSetInfo->mCharSet,
>+                                                     &aCharSetInfo->mLangGroup);

  again, we don't need the alias resolution...

>Index: htmlparser/src/nsScannner.cpp
>@@ -204,22 +204,22 @@
> 
>   nsCOMPtr<nsICharsetAlias> calias(do_GetService(kCharsetAliasCID, &res));
>   NS_ASSERTION( nsnull != calias, "cannot find charset alias");
>-  nsAutoString charsetName; charsetName.Assign(aCharset);
>+  nsCAutoString charsetName = NS_LossyConvertUCS2toASCII(aCharset);
....
>+    res = calias->GetPreferred(charsetName, charsetName);

>@@ -229,7 +229,7 @@
...
>-      res = ccm->GetUnicodeDecoder(&mCharset, &decoder);
>+      res = ccm->GetUnicodeDecoder(mCharset.get(), &decoder);

  mCharset is already a canonical name... so even for GetUnicodeDecoder,
  we sometimes want to skip the alias resolution.

>Index: layout/html/forms/src/nsIsIndexFrame.cpp
>@@ -520,7 +522,7 @@
>@@ -530,7 +532,7 @@
>                                     NS_GET_IID(nsICharsetConverterManager),
>                                     (nsISupports**)&ccm);
>   if(NS_SUCCEEDED(rv) && (nsnull != ccm)) {
>-     rv = ccm->GetUnicodeEncoder(&charset, encoder);
>+     rv = ccm->GetUnicodeEncoder(charset.get(), encoder);

 This is a bit murky case. The current code doesn't go via charsetAtom 
 (i.e. no alias resolution) and  that's probably all right provided that 
 |GetSubmitCharset| returns the canonical name.  


>Index: widget/src/gtk/nsClipboard.cpp
>     nsCOMPtr<nsICharsetConverterManager> ccm = do_GetService(NS_CHARSETCONVERTERMANAGER_CONTRACTID, &rv);
>-    rv = ccm->GetUnicodeDecoder(&platformCharset, getter_AddRefs(decoder));
>+    rv = ccm->GetUnicodeDecoder(platformCharset.get(),
>+                                getter_AddRefs(decoder));

  I'm a bit suspicious of what's going on in nsClipboard.cpp (that's off the
  topic here, though). Anyway, here the
  current code doesn't go via charsetAtom, either. Some charset names used 
  here are hard-coded and others are likely to have been stored in canonical
  names so that going through the alias resolution again could be a waste.

>Index: widget/src/gtk/nsGtkIMEHelper.cpp
>+      manager->GetUnicodeDecoder(charset.get(), &mDecoder);

 again no atom here.

>Index: widget/src/gtk/nsWindow.cpp
>+  rv = ccm->GetUnicodeEncoder(platformCharset.get(), getter_AddRefs(encoder));

 ditto...

>Index: widget/src/mac/nsMacControl.cpp
>+		nsCAutoString fileSystemCharset;
> 		GetFileSystemCharset(fileSystemCharset);
>+			rv = ccm->GetUnicodeEncoder(fileSystemCharset.get(), &mUnicodeEncoder);

  ditto...

>Index: widget/src/windows/nsFilePicker.cpp
>+      rv = ccm->GetUnicodeEncoder(fileSystemCharset.get(), &mUnicodeEncoder);
 same here...

>Index: widget/src/xpwidgets/nsPrimitiveHelpers.cpp
>     nsCOMPtr<nsICharsetConverterManager> ccm = 
>              do_GetService(NS_CHARSETCONVERTERMANAGER_CONTRACTID, &rv);  
>-    rv = ccm->GetUnicodeDecoder(&platformCharset, getter_AddRefs(decoder));
>+    rv = ccm->GetUnicodeDecoder(platformCharset.get(),
>+                                getter_AddRefs(decoder));

  same here...

>Index: xpfe/appshell/src/nsAppShellService.cpp
>-  rv = ccm->GetUnicodeDecoder(&aCharset, getter_AddRefs(decoder));
>+  rv = ccm->GetUnicodeDecoder(aCharset.get(), getter_AddRefs(decoder));

  again..

>Index: xpfe/components/bookmarks/src/nsBookmarksService.cpp
>-                rv = charsetConv->GetUnicodeDecoder(&defaultCharset, getter_AddRefs(mUnicodeDecoder));
>+                rv = charsetConv->GetUnicodeDecoder(defaultCharset.get(),
>+                                                    getter_AddRefs(mUnicodeDecoder));
>     }
>@@ -1069,13 +1072,13 @@
....
>-        nsAutoString    charsetName;
>+        nsCAutoString    charsetName;
>         if (NS_SUCCEEDED(rv = gCharsetAlias->GetPreferred(charset, charsetName)))
>         {
>             if (!charsetName.IsEmpty())
>@@ -1092,7 +1095,7 @@
>             (nsISupports**)&charsetConv);
>     if (NS_SUCCEEDED(rv) && (charsetConv))
>     {
>-        rv = charsetConv->GetUnicodeDecoder(&charset, decoder);
>+        rv = charsetConv->GetUnicodeDecoder(charset.get(), decoder);

  It doesn't look like the resolved charset name is stored anywhere. 
  Why  don't we just let GetUnicodeDecoder do the resolution? 


>Index: xpfe/components/search/src/nsInternetSearchService.cpp
>+InternetSearchDataSource::MapScriptCodeToCharsetName(PRUint32 aScriptCode)
> {
>+        if (aScriptCode < NS_ARRAY_LENGTH(scriptList))
>+          aScriptCode = 0;

  Shouldn't it be 

	  if (aScriptCode >= NS_ARRAY_LENGTH(scriptList))
followup to comment 8: killing the spurious braces fixed the bustage - build
seems to work fine now
jshin: thank you SO much for the review. I know it was a beast. Appearently I
just didn't understand the alias resolution stuff as well as I thought! since
you seem to have found the cases where we don't need the resolution, I will
change those to GetUnicode[En|De]coderRaw() - a version which does not use the
alias resolution. I thought about the extra boolean but I think a seperate
function call is more concise and makes the consumers easier to understand.

As for some of the other stuff, like:
- why we don't switch to nsACString& from char** and so forth - I am trying to
fix the minimal case, which as you can see is pretty huge thus far. Do you mind
if I get those in another patch, a second round of fixes? 
- As for nsUConvCID.h, do you mind if I get that in another patch as well? I
agree that those should be in a seperate file but there are SO many places that
#include nsICharsetConverterManager.h, that it would be very painful to revisit
them all. 

bz: thanks also for the review, it sounds like jshin has covered that case, I'll
use the GetCharsetEncoderRaw()

dwitte: Is there a missing brace in my patch, or is there just missing stuff? I
forgot that gtk2 will be one of the ports that I need porting work on - thanks
for helping! is there any chance you can post a patch with the changes you had
to make, and I'll incorporate it into my work? (I'm assuming the changes will be
in new files that this patch doesn't cover)
No longer blocks: 161814
*** Bug 161814 has been marked as a duplicate of this bug. ***
Attached patch OS/2 patch (obsolete) — Splinter Review
Here's the OS/2 patch
Alec, I didn't read any part of this patch, actually... so no review here.

jshin, the CSSLoader just wants a string it can pass to GetUnicodeDecoder to
come out with a decoder.  It sometimes has strings that do need resolution (eg
coming from HTTP or from the sheet) and sometimes strings that do not need
resolution (coming from nsIDocument).  Right now we resolve aliases by hand in a
bunch of places in that code; I ended up writing a helper function to do it,
because it was such a pain.  So from the CSSLoader point of view, it would be
much preferable to just have intl handle things reasonably and not have to do
all that work.
Comment on attachment 124590 [details] [diff] [review]
remove nsIAtom from most of the API

>Index: intl/locale/src/unix/nsDateTimeFormatUnix.cpp
>===================================================================
>@@ -118,20 +118,20 @@
...
>   if (NS_SUCCEEDED(res)) {
>-    res = charsetConverterManager->GetCharsetAtom(mCharset.get(), getter_AddRefs(charsetAtom));
>     if (NS_SUCCEEDED(res)) {
>-      res = charsetConverterManager->GetUnicodeDecoder(charsetAtom, getter_AddRefs(mDecoder));
>+      res = charsetConverterManager->GetUnicodeDecoder(mCharset.get(), getter_AddRefs(mDecoder));
>     }
>   }
> 

You can lose one of the two |if (NS_SUCCEEDED(res)) {| lines here

>Index: intl/uconv/idl/nsICharsetConverterManager.idl
...
>+ * Here Charsets are indentified by nsIAtom's. I know, we could have our own 
...

This comment block needs updating.
Also, the new files should be MPL-trilicensed not NPL-trilicensed.

This is as far as I've got today, I'll try to finish the rest tomorrow.
Attached patch gtk, gtk2 and bustage patch (obsolete) — Splinter Review
alecf: np... this stuff looks nice :)

as requested, this patch does a couple things...

1) fixes bustage in xpfe/components/nsInternetSearchService.cpp
2) applies your work to the gtk2 port
3) fixes some general string-class usage in gtk & gtk2 (not required for this
work, but I just noticed some silly stuff on a cursory glance over the code,
and felt like fixing it)

if you'd rather we stick to basics, I can post a patch without the misc string
changes.

(you need to apply this patch _on top_ of yours, btw... also, i've
compiletested gtk2, but not the gtk portion which I modified, so your request
for gtk port testing still applies)
cool, thanks for the patch - I'm really surprised to see that bustage in gtk and
nsInternetSearchService, I've gotta say. The original patch still builds on
gtk/osx/win32 for me.. odd (perhaps it isn't applying cleanly now?)

But is there any chance we could do without the string cleanup? I will GLADLY
sr= any string cleanup later, but I want to keep this patch to the absolute
minimum, given its current size...(364k without the os2 or gtk2 patches)
>cool, thanks for the patch - I'm really surprised to see that bustage in gtk and
>nsInternetSearchService, I've gotta say. The original patch still builds on
>gtk/osx/win32 for me.. odd (perhaps it isn't applying cleanly now?)

it applies cleanly with a couple fuzzes... I'm using gcc3.3, maybe it's related
to that?

>But is there any chance we could do without the string cleanup? I will GLADLY
>sr= any string cleanup later, but I want to keep this patch to the absolute
>minimum, given its current size...(364k without the os2 or gtk2 patches)

sure thing, coming right up.
this avoids the string cleanup (except for one spot in
widget/src/gtk/nsClipboard.cpp which WAS related to your changes, where I made
an obvious correction)

as before, this applies on top of your patch.
Attachment #124816 - Attachment is obsolete: true
Alec, all you wrote is reasonable. 

BTW, what are you gonna do with GetCharsetLangGroup()? It's not used as often as
GetUnicode(De|En)coder, but it has the same issue. In nsFontMetrics(GTK|Xlib),
charset names are coming from the hard-coded arrays (nsFontCharSetInfo types)
with canonical names (the current code should have used NS_NewAtom instead of
GetCharsetAtom2()) so that no resolution is necessary. In a few other places
(mailnews, intl/locale), it's coming from external sources and we need to
resolve aliases. 

> jshin, the CSSLoader just wants a string it can pass to 
> GetUnicodeDecoder to come out with a decoder. 

  OK. I see. So, CSSLoader doesn't care whether mCharset is canonical or not.
When mCharset is used to get a UnicodeDecoder for CSSLoader
(nsIUnicharStreamLoader), we just have to make sure that GetCharsetDecoder()
(that does resolve aliases) is invoked. I tracked it down and found that it's 
http://lxr.mozilla.org/seamonkey/source/intl/uconv/src/nsConverterInputStream.cpp#50

At the moment, it's not resolving aliases. Why doesn't it? If it did, 
ResolveCharset() in nsCSSLoader.cpp wouldn't be necessary even without Alec's
patch, would it? 

Anyway, in nsConverterInputStream we have to make sure that GetUnicodeDecoder()
with the alias resolution is called.

In addition, I guess we have to make it clear that charset in
nsIUniCharStreamLoader is not necessarily a canonical name. 
(http://lxr.mozilla.org/seamonkey/source/netwerk/base/public/nsIUnicharStreamLoader.idl#109)
dan, again thanks for helping me out here! This is very helpful
mike, thanks for the OS/2 stuff, I'm incorporating it now
jshin, I'm making a getCharsetLangGroupRaw() to cover the same case

it sounds like the nsCSSLoader stuff is going to be pretty straight forward, and
that I may have possibly fixed a bug by using the alias-resolving version of
GetUnicodeDecoder :) Thank you SO much for going through and figuring out which
cases needed the raw version...that makes my life much easier.

Anyhow, I'm updating the patch to address all of jshin and simon's reviews. 
Attached patch remove nsIAtom v2 (obsolete) — Splinter Review
ok, here's an updated patch that incorporates the os2 and gtk2 patches above..

there is appearently one minor issue with the mail compose window that I will
be investigating.. but I think that will be a .js patch that is a supplement to
this one... (so this patch can be further reviewed without fear that it will be
updated again)
Attachment #124590 - Attachment is obsolete: true
Attachment #124762 - Attachment is obsolete: true
Attachment #124828 - Attachment is obsolete: true
Attachment #124590 - Flags: superreview?(sfraser)
Attachment #124590 - Flags: review?(smontagu)
ok, I could really use some more reviews on attachment 124846 [details] [diff] [review] - especially from
jshin and smontagu. 

The one thing I forgot was to change all the "ccm2" variables over to 'ccm' -
I've done that now in my local tree, but its not a large enough change to
justify a new patch.

Comment on attachment 124846 [details] [diff] [review]
remove nsIAtom v2

looking for a r=smontagu AND a r=jshin... thanks guys!
Attachment #124846 - Flags: review?(smontagu)
Attached patch mail compose window fix (obsolete) — Splinter Review
ok, this fixes all the stuff in MsgComposeCommands.js, which was doing all
sorts of goofy things with charsets and such. This gets applied in addition to
the above patch. I used LXR to search for any other occurences of this stuff,
and this is the only one I found.. go figure!
To nhotta: Can you answer my question about nsTextToSubURI in the following (I'm
sorry it's in the middle.) 

re: attachment 124846 [details] [diff] [review]
Summary :

 1. I ended up checking (mostly) raw vs non-raw.
   (Simon may wish to spend more time on nsCharsetMenu.cpp,
     nsScriptLoader.cpp, nsCharsetConverterManager.cpp)
 2. Some observations:

   a. Whereever a string literal is directly passed to GetUnicodeXXX() or 
      NS_NewAtom("...") is used, it's completly safe to use
      GetUnicodeXXXRaw(). In intl/uconv, gfx/src, mailnews and other places,
      there are many cases of non-raw version of GetUnicodeXXX invoked
      with UTF-8, GB2312, EUC-KR, x-euc-tw, ISO-8859-7, ISO-8859-1, 
      x-user-defined, x-imap4-modified-utf. An example is shown below:

Index: extensions/xmlextras/base/src/nsDOMParser.cpp
+  rv = charsetConv->GetUnicodeEncoder("UTF-8",
                                       getter_AddRefs(encoder));

  A subset of the canonical charset names are listed in 
  http://lxr.mozilla.org/seamonkey/source/intl/uconv/src/charsetalias.properties

  Why a subset? Because XLFD names and other internal charset names 
  don't need the alias resolution and developers are supposed to use 
  correct names for them when hard-coding them in the source code.
  Others are listed in 
  http://lxr.mozilla.org/seamonkey/source/intl/uconv/src/charsetdata.properties

  b. Unless it's obviously a mistake in the old code(as we found
     earlier), we can replace GetUnicode(En|De)code(char *,...) or 
     GetUnicode(En|De)code(NS_NewAtom(char *),...)
     with GetUnicode(En|De)coderRaw(char *,...). In most of cases, charset
     names passed are already canonical because they come from
     platformCharset/filesystem charset/other sources known to provide
     the canonical name. In what follows, I list some cases (but not all).
     Doing this would not introduce a regression although we may be
     preserving bugs as well. 
 
  c. The opposite case of b. Unless it's obvious (I identified several
     in the previous comment and found a couple of more here), 
     we can just use non-raw methods when charsetAtom is used in the old 
     code. This item has null effect on the patch, though. :-)

  d. When the resolved charsetAtom is stored in the old code (in member
     variables), we have to store the resolved charset name. Your patch
     already did that, but there's a case you might  have missed.
     (nsIContentSerializer. see comment further down.)

 3. As far as 'raw vs non-raw' is concerned,
    I just realized that I could have edited the patch directly
    and sent it to you off-line instead of commenting on that. It'd have 
    taken less time ...... Stupid of me....
   
Index: extensions/xmlextras/base/src/nsXMLHttpRequest.cpp
-  rv = ccm->GetUnicodeDecoder(&dataCharset,getter_AddRefs(decoder));
+  rv = ccm->GetUnicodeDecoder(NS_LossyConvertUCS2toASCII(dataCharset).get(),
+                              getter_AddRefs(decoder));

I guess dataCharset is already canonicalized. In intl/chardet and other places,
we return  canonical names to implementations of nsIDocument. At least that's
what's assumed by the current code (item 2b)

Index: intl/locale/src/mac/nsDateTimeFormatMac.cpp
@@ -316,14 +316,12 @@
-    res = charsetConverterManager->GetCharsetAtom(mUseDefaultLocale ?
mSystemCharset.get() : mCharset.get(), 
-                                                  getter_AddRefs(charsetAtom));
-    if (NS_SUCCEEDED(res)) {
-      res = charsetConverterManager->GetUnicodeDecoder(charsetAtom,
getter_AddRefs(mDecoder));
-    }
+    res = charsetConverterManager->GetUnicodeDecoder(mUseDefaultLocale ?
mSystemCharset.get() : 
+                                                     mCharset.get(), 

I'm 99.9% sure that we don't need the alias resolution in 
nsDateTimeFormatXXX.cpp (where XXX is Mac, Win, OS2, UNIX, BeOS, etc) 
because all of them use charset name provided by implementations of 
nsIPlatformCharset (nsXXXCharset.cpp) that should
return canonical names, but the old code used the resolution. If you
want to be safe, we can just use non-raw versions.

In nsPluginHostImpl.cpp and nsPrefMigration.cpp, the alias 
resolution(charsetAtom) is not used for platformCharset in the old code.
There are several other cases in profile/src/nsProfileAccess.cpp and
in widget/src/*. (item 2b)

Index: modules/plugin/base/src/nsPluginHostImpl.cpp
-  rv = ccm->GetUnicodeDecoder(&platformCharset, aUnicodeDecoder);
+  rv = ccm->GetUnicodeDecoder(platformCharset.get(),

Index: profile/pref-migrator/src/nsPrefMigration.cpp
-    rv = ccm->GetUnicodeDecoder(&aCharset, getter_AddRefs(decoder));
+    rv = ccm->GetUnicodeDecoder(aCharset, getter_AddRefs(decoder));
 
@@ -2479,8 +2473,9 @@
...
-  nsAutoString charSet;
+  nsCAutoString charSet;
   rv = GetPlatformCharset(charSet);
   ....
@@ -2504,7 +2499,7 @@
...
 // A wrapper function to call the interface to get a platform file charset.
 nsresult 
-nsPrefConverter::GetPlatformCharset(nsAutoString& aCharset)
+nsPrefConverter::GetPlatformCharset(nsCString& aCharset)
....
    rv = platformCharset->GetCharset(kPlatformCharsetSel_4xPrefsJS, aCharset);

Below is an example case where the old code doesn't use charset alias 
resolution.  Just using raw version wouldn't introduce a new bug here.

Index: layout/html/forms/src/nsIsIndexFrame.cpp
-  nsAutoString charset;
+  nsCAutoString charset;
@@ -530,7 +532,7 @@
-     rv = ccm->GetUnicodeEncoder(&charset, encoder);
+     rv = ccm->GetUnicodeEncoder(charset.get(), encoder);


Index: intl/uconv/src/nsTextToSubURI.cpp

I'm not sure why charsets are not resolved in two cases while it's resolved
via charsetAtom in one case, here.  I'll CC to nhotta. By using non-raw version
where charsetAtom is not used in the current code, we may be fixing a bug
or we may be introducing an unnecessary overhead.

-     rv = ccm->GetUnicodeEncoder(&charsetStr, &encoder);
+     rv = ccm->GetUnicodeEncoder(charset, &encoder);
@@ -139,9 +137,8 @@
-    rv = ccm->GetUnicodeDecoder(&charsetStr, &decoder);
+    rv = ccm->GetUnicodeDecoder(charset, &decoder);
@@ -200,17 +197,13 @@
-  rv = charsetConverterManager->GetUnicodeDecoder(charsetAtom, 
+  rv = charsetConverterManager->GetUnicodeDecoder(aCharset.get(), 

Index: dom/src/base/nsGlobalWindow.cpp
@@ -3487,7 +3487,8 @@
-  result = ccm->GetUnicodeEncoder(&charset, getter_AddRefs(encoder));
+  result = ccm->GetUnicodeEncoder(NS_LossyConvertUCS2toASCII(charset).get(),
@@ -3584,7 +3585,8 @@
-  result = ccm->GetUnicodeDecoder(&charset, getter_AddRefs(decoder));
+  result = ccm->GetUnicodeDecoder(NS_LossyConvertUCS2toASCII(charset).get(),

Here (and in other places) I think we're safe to call Get...Raw (item 2b)
 
Index: intl/uconv/tests/TestUConv.cpp
...
-  // test alias resolving capability
-  nsAutoString csAlias(NS_LITERAL_STRING("iso-10646-ucs-basic"));
-  nsAutoString csName(NS_LITERAL_STRING("UTF-16BE"));
-  res = ccMan->GetCharsetAtom(csAlias.get(), getter_AddRefs(csAtom));
....
-
-  // test self returning if alias was not found
-  nsAutoString csAlias2(NS_LITERAL_STRING("Totally_dummy_charset_name"));

  I guess we need to keep alias resolving test. Now that charsetAtom
is gone, we can use GetPrefered() for testing alias resolution, instead.


Index: intl/uconv/tests/convperf.cpp
@@ -130,8 +130,7 @@
-             res = ccMain->GetUnicodeDecoder(&str, &decoder);
+             res = ccMain->GetUnicodeDecoder(argv[tocodeind], &decoder);
@@ -142,8 +141,7 @@
-             res = ccMain->GetUnicodeEncoder(&str, &encoder);
+             res = ccMain->GetUnicodeEncoder(argv[fromcodeind], &encoder);

I guess convperf.cpp is called multiple times(possibly hundreds of times) 
by a script (that provides the canonical names) for perf. estimate so that 
we have to skip the alias resolution (by using 'raw' version) as is done 
in the old code.


Index: content/base/src/nsHTMLContentSerializer.h
 
   NS_IMETHOD Init(PRUint32 flags, PRUint32 aWrapColumn,
-                  nsIAtom* aCharSet, PRBool aIsCopying);
+                  const char* aCharSet, PRBool aIsCopying);
....
-  nsCOMPtr<nsIAtom> mCharSet;
+  nsCString mCharSet;

One problem with removing charsetAtom is that we lose the ability to
tell whether 'charset' is canonical or not. 

I think it's important to update the documentation of interfaces and related 
methods where  we replace 'nsCOMPtr<nsIAtom> mCharSet' with 'nsCString 
mCharset'.  For instance, nsIContentSerializer
(http://lxr.mozilla.org/seamonkey/source/content/base/public/nsIContentSerializer.h#61)
should make it clear that mCharset(charset attribute) must be canonical
and Init()  must resolve aCharset with GetPrefered() before storing it in 
mCharset. Implementations of nsIContentSerializer have to be updated to 
reflect this. Consumers of 'mCharset' can just call non-raw version of 
GetUnicode(En|De)coder, but in that case we end up resolving aliases every 
time it's called instead of just once in Init().

Alternatively, we can mandate that nsIContentSerializer::Init() be always
called with a resolved charset. 
 
Index: content/html/style/src/nsCSSLoader.cpp
@@ -382,7 +382,7 @@
 static nsresult GetCharsetFromData(const unsigned char* aStyleSheetData,
                                    PRUint32 aDataLength,
-                                   nsAString& aCharset)
+                                   nsACString& aCharset)
......
       result = ResolveCharset(charsetCandidate, charset);
       ......
+  aCharset = charset;

 Why do you keep ResolveCharset()? According to bz,
 nsCSSLoader.cpp doesn't care whether it's resolved or not. Then, we can 
 remove it, can't we?

Index: gfx/src/mac/nsMacUnicodeFontInfo.cpp
-  rv = gCharsetManager->GetCharsetAtom(value.get(), getter_AddRefs(charset));
-  rv = gCharsetManager->GetUnicodeEncoder(charset, aConverter);
+  rv = gCharsetManager->GetUnicodeEncoder(value.get(), aConverter);

  charset names come from gfx/src/mac/fontEncoding.properties file where 
all charsets are canonical.

Index: gfx/src/ps/nsPostScriptObj.cpp
+      if (NS_SUCCEEDED(res)) {
+        res = ccMain->GetUnicodeEncoderRaw(psnativecode.get(), &linfo->mEncoder);

 Sorry I was wrong here. At the moment, the sample user pref. shipped 
 (gfx/src/ps/sample.unixpsfonts.properties) has 'euc-jp' (instead of
 canonical 'EUC-JP'). I'll file a new bug after this patch is landed.
 In the meantime, non-raw version has to be used here.

Index: widget/src/gtk/nsClipboard.cpp
@@ -1420,7 +1417,7 @@

-    rv = ccm->GetUnicodeDecoder(&charset, getter_AddRefs(decoder));
+    rv = ccm->GetUnicodeDecoder(charset.get(), getter_AddRefs(decoder));

Index: widget/src/gtk2/nsClipboard.cpp
-        rv = ccm->GetUnicodeDecoder(&charset, getter_AddRefs(decoder));
+        rv = ccm->GetUnicodeDecoder(charset.get(), getter_AddRefs(decoder));
 
Index: widget/src/os2/nsFilePicker.cpp
     if (NS_SUCCEEDED(rv)) {
-      rv = ccm->GetUnicodeEncoder(&fileSystemCharset, &mUnicodeEncoder);
+      rv = ccm->GetUnicodeEncoder(fileSystemCharset.get(), &mUnicodeEncoder);
 
@@ -553,13 +553,13 @@
..
-      rv = ccm->GetUnicodeDecoder(&fileSystemCharset, &mUnicodeDecoder);
+      rv = ccm->GetUnicodeDecoder(fileSystemCharset.get(), &mUnicodeDecoder);

Index: xpfe/components/bookmarks/src/nsBookmarksService.cpp
-                rv = charsetConv->GetUnicodeDecoder(&defaultCharset,
getter_AddRefs(mUnicodeDecoder));
+                rv = charsetConv->GetUnicodeDecoder(defaultCharset.get(),

Index: mailnews/import/src/nsImportService.cpp 

-			rv = ccm2->GetUnicodeDecoder(charsetAtom, &m_pDecoder);
+			rv = ccm2->GetUnicodeDecoder(m_sysCharset.get(), &m_pDecoder);

I think we can use ...Raw here, but can leave it alone to be safe. (item 2c in
the summary)
Attached patch remove nsIAtom v2.1 (obsolete) — Splinter Review
ok, here's a cumulative patch which includes all changes so far, as well as
jshin's patch (Sent over e-mail) which changed more consumers over to the
get*Raw() versions, and includes the MsgComposeCommands.js that brings the
compose window back from the dead.
Attachment #124846 - Attachment is obsolete: true
Attachment #124870 - Attachment is obsolete: true
Comment on attachment 124938 [details] [diff] [review]
remove nsIAtom v2.1

ok, again looking for an r=smontagu, r=jshin

It sounds like people want to know what's going on with nsCharsetMenu.cpp too..
I'll explain that in the next comment.
Attachment #124938 - Flags: review?(smontagu)
Index: mailnews/base/util/nsMsgUtf7Utils.cpp
.....
-      res = ccm->GetUnicodeEncoder(&aCharset, &encoder);
+      res = ccm->GetUnicodeEncoder("x-imap4-modified-utf7", &encoder);

  It seems like my patch for 'Raw-love' didn't get applied :-)

TestUConv.cpp still doesn't test the alias resolution with GetPrefered().  And
how about removing ResolveCharset() in nsCSSLoader.cpp?     
what the heck.. I thought it went on ok. I'll reapply it.. good thing you
double-checked!
what the heck.. I thought it went on ok. I'll reapply it.. good thing you
double-checked!
ok, one more try here. this includes
- update against the trunk to catch the nsCharsetMenu.cpp changes
- jshin's patch that he sent to me to use more Raw() methods
- MsgComposeCommands.js changes
- change from ccm2 to ccm in a bunch of places.
Attachment #124938 - Attachment is obsolete: true
alecf, thank you for taking care of this beast. r=jshin
as we talked off-line, I'll hunt for more spots for 'Raw' after this gets landed.
more bustage... sorry alecf :/

this fixes firebird's forked nsBookmarksService.cpp. I did this by blindly
applying the changes to mozilla's version, so please double-check I did the
right thing.

also removes a spurious #include I noticed (you had two of the exact same
#include in that file)

thx for fixing message compose, it works now. :)
So the converter input stream calls GetUnicodeDecoder, which now does alias
resolution?  Again, could we remove all the code in nsCSSLoader that is not
needed as a result?
bz,  let's lift this behemoth patch off alecf's shoulder :-) While 
bugzilla was down (6 - 7 hrs ago), I told alecf that I'd take care of
nsCSSLoader.cpp and TestUconv.cpp after the current patch gets landed. Sorry I
forgot to mention that in my prev. comment.
Comment on attachment 124846 [details] [diff] [review]
remove nsIAtom v2

(cancelling review requests on old patches)
Attachment #124846 - Flags: review?(smontagu)
Comment on attachment 124994 [details] [diff] [review]
fix firebird bustage from v2.11 patch

sr=alecf on this. Looks like you did the right thing.
Attachment #124994 - Flags: superreview+
Comment on attachment 124982 [details] [diff] [review]
remove nsIAtom v2.11

oops, for got to request a new review from smontagu.

Simon, can you review this? it already has r=jshin and he's found lots of good
stuff, but I'd like a second set of i18n eyes on this just to make 100% sure.

I'd really like to land this soon, please find me on IRC if you have any
questions...
Attachment #124982 - Flags: review?(smontagu)
Comment on attachment 124982 [details] [diff] [review]
remove nsIAtom v2.11

I've been right through this at last and it looks OK, except what I said before
about the comments in intl/uconv/idl/nsICharsetConverterManager.idl which you
have left unchanged so they are now not true:
+ * Here Charsets are indentified by nsIAtom's. I know, we could have our own 
+ * interface for charsets (something like nsICharacterSet). But for now, I 
+ * will attempt to use Atom's.

r=smontagu
Attachment #124982 - Flags: review?(smontagu) → review+
Comment on attachment 124982 [details] [diff] [review]
remove nsIAtom v2.11

oops! yeah, I'll fix the MPL-trilicense stuff and the comments before landing.

Simon (fraser) - I have r=jshin and r=smontagu, both strong i18n reviewers, and
we've gone a few rounds finding the right patch. do you mind giving this a
quick once-over?
Attachment #124982 - Flags: superreview?(sfraser)
Comment on attachment 124982 [details] [diff] [review]
remove nsIAtom v2.11

rs=sfraser
Attachment #124982 - Flags: superreview?(sfraser) → superreview+
There are a couple of spots missed in intl/ctl/src. The diff for
intl/ctl/src/nsUnicodeToTIS620.cpp in attachment 124982 [details] [diff] [review] has to be replaced with
this (with diffs for two files in intl/ctl/src).
ok! this has landed, thanks reviewers.. looks like we got a minor codesize
improvement (~5k on most platforms)

I think I see minor Ts/Tp performance improvements too, though it may be too
early to tell.
Comment on attachment 124938 [details] [diff] [review]
remove nsIAtom v2.1

removing obsolete review request
Attachment #124938 - Flags: review?(smontagu)
This checkin have added a "may be used uninitialized" warning to brad Tbox:

+gfx/src/gtk/nsFontMetricsGTK.cpp:1372
+ `nsresult res' might be used uninitialized in this function

Indeed, the code in the nsFontMetricsGTK::Init has form:

nsresult res;
...
if (!gInitialized) {
   res = ...
   ...
}
...
if (mLangGroup) {
   ...
   res = ...
   ...
}

if (mLangGroup.get() == gUserDefined) {
   res = ...
   ...
}
else {
   return res
}

If none of the "if" conditions are true, the function will return an
uninitialized nsresult.

http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/gfx/src/gtk/nsFontMetricsGTK.cpp&rev=&cvsroot=/cvsroot&rev=1.255&mark=1410,1435,1454-1456#1410

P.S. Should this function be even calling mLangGroup.get() when it is not even
sure mLangGroup is non-NULL???
Is anyone looking at the "speedracer" tinderbox bustage
(http://tinderbox.mozilla.org/showbuilds.cgi?tree=SeaMonkey-Ports) ?
Line 361 of nsCharsetConverterManager.cpp appears to declare an unused variable.
I don't see the warning/unused variable in nsCharsetConverterManager.cpp?
(though I fixed the one in gfx)
I'd like to submit this patch to the mozdev spell checker folks to fix the
spell checker if Alec says my fix is correct.
Comment on attachment 125527 [details] [diff] [review]
patch for the mozdev spell checker

you are correct, sir... though personally I'd change "mCharset" to be an
nsCString rather than an nsString, then you can avoid all temporaries and 
conversions...

(and if you do still have conversions, take a look at bug 209220 for the next
set of conversions - that one fixes APIs on basically all of
gecko/docshell/editor/etc...)
oops, I should mark this fixed now, huh.
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
> oops, I should mark this fixed now, huh.

Should I file a new bug for comment #46, then?
Comment on attachment 124982 [details] [diff] [review]
remove nsIAtom v2.11

>Index: xpfe/bootstrap/nsAppRunner.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/xpfe/bootstrap/nsAppRunner.cpp,v
>retrieving revision 1.398
>diff -u -r1.398 nsAppRunner.cpp
>--- xpfe/bootstrap/nsAppRunner.cpp	22 Apr 2003 00:20:24 -0000	1.398
>+++ xpfe/bootstrap/nsAppRunner.cpp	5 Jun 2003 07:40:57 -0000
>@@ -1549,8 +1549,11 @@
>   if (HandleDumpArguments(argc, argv))
>     return 0;
> 
>+  printf("Checking for trace malloc..\n");
> #ifdef NS_TRACE_MALLOC
>+  printf("Trace malloc: %d args\n", argc);
>   argc = NS_TraceMallocStartupArgs(argc, argv);
>+  printf("Trace malloc: now have %d args\n", argc);
> #endif
> 
> #if defined(MOZ_WIDGET_GTK) || defined(MOZ_WIDGET_GTK2)

I'm guessing you didn't mean to check this bit in, especially since the first
printf happens in a non-debug build.  I just undid this change.

Also, there were some atom leaks that showed up on tinderbox as a result of
this change.
dbaron: thanks for cleaning up after me - I had the printfs removed in my tree
but I was on vacation until today

I have a fix for the atom leaks, that I'll attach here soon.
This is an additional patch for nsCSSLoader.cpp mentioned in comment #35 (and a
couple of test programs in intl/uconv/test). Let me know if this had better be
dealt with in a new bug.
Comment on attachment 125746 [details] [diff] [review]
extra patch (mentioned in comment #35)

r+sr=me on the nsCSSLoader changes.  I don't know enough about the test app to
review those changes, but you should probably be checking the return values of
all the calls in addition to the strings they return...
Comment on attachment 125746 [details] [diff] [review]
extra patch (mentioned in comment #35)

sr=alecf - this bug is fine, other cleanup work is going here too
Attachment #125746 - Flags: superreview+
Blocks: 207152
Thanks for r/sr. 
nsCSSLoader.cpp got checked in. 
I added the return value checking to TestUConv.cpp, but haven't landed it yet
because I thought it might as well go with a few other extra changes I'm
planning to do (cases in which  I was '98%' sure we could use 'Raw' version, but
left out to be safe)
Sorry, I mistyped my previous comment - the declaration of the unused variable
nsAutoString pref; is on line 381 of nsCharsetConverterManager.cpp, not 361.
Comment on attachment 126724 [details] [diff] [review]
fix atom leaks

argh, sorry I had this in my tree r/sr=alecf
Attachment #126724 - Flags: review?(alecf) → review+
Attachment #126724 - Flags: superreview?(bryner) → superreview+
Comment on attachment 126724 [details] [diff] [review]
fix atom leaks

Leak fix checked in to trunk, 2003-06-30 14:50 -0700.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: