Closed Bug 200732 Opened 21 years ago Closed 21 years ago

crash on utf8 locales on HP-UX

Categories

(Core :: Internationalization, defect)

HP
HP-UX
defect
Not set
critical

Tracking

()

RESOLVED FIXED

People

(Reporter: kishan.thomas, Assigned: jshin1987)

References

Details

(Keywords: crash, fixed1.4.1)

Attachments

(2 files, 1 obsolete file)

User-Agent:       Mozilla/5.0 (X11; U; HP-UX 9000/785; en-US; rv:1.2.1) Gecko/20030318
Build Identifier: Mozilla/5.0 (X11; U; HP-UX 9000/785; en-US; rv:1.2.1) Gecko/20030318


Mozilla 1.2.1 and later crashes on HP-UX when locale is set to one 
of the utf8 locales.  Mozilla 1.0.x come sup fine.

on debugging found the crash happening at nsWindow::SetTitle

/mozilla/widget/src/gtk/nsWindow.cpp:2337

<snip>
  2332    rv = ccm->GetUnicodeEncoder(&platformCharset, getter_AddRefs(encoder));
  2333
  2334    // Estimate out length and allocate the buffer based on a worst-case
estimate, then do
  2335    // the conversion.
  2336    PRInt32 len = (PRInt32)aTitle.Length();
  2337    encoder->GetMaxLength(aTitle.get(), len, &platformLen);
</snip>

The encoder returned by ccm->GetUnicodeEncoder at 2332 is NULL and on trying 
to use that encoder leads to the SIGBUS.

Tracing back found out that at:
mozilla/widget/src/gtk/nsCharsetConverterManager.cpp:182 

<snip>
   180    nsCOMPtr<nsIUnicodeEncoder> encoder;
   181    // Always create an instance since encoders hold state.
   182    encoder = do_CreateInstance(contractid.get(), &res);
</snip>

The encoder creation fails when contractidget() is:
"@mozilla.org/intl/unicode/encoder;1?charset=utf8"


Adding the following code to check if it was an utf8 encoder creation that 
failed and creating an ISO-8859-1 encoder instead creates a valid encoder object 
and avoids the crash at nsWindow::SetTitle



Reproducible: Always

Steps to Reproduce:

With 1.2.1 or later mozilla on HP-UX 

1.export LANg=C.utf8
2./opt/mozilla/mozilla 


will cause a crash.

Actual Results:  
Mozilla crashes and core dumps with SIGBUS.

Expected Results:  
Mozilla should come up fine.

$ mozilla --version
Mozilla/5.0 (X11; U; HP-UX 9000/785; en-US; rv:1.2.1) Gecko/20030318

$ gtk-config --version
1.2.10 

$ uname -a
HP-UX foo-box B.11.00 U 9000/785 2007732051 unlimited-user license
Work around code for avoiding the null encoder.

mozilla/intl/uconv/src/nsCharsetConverterManager.cpp
183a184,192
>   // If failed to create an utf8 encoder default to ISO-8859-1 
>   if (NS_FAILED(res) && aDest->CompareWithConversion("utf8", PR_TRUE) == 0)
>   {
>     nsCAutoString utf8fix_contractid(
>                     NS_LITERAL_CSTRING(NS_UNICODEENCODER_CONTRACTID_BASE) +
>                     NS_LITERAL_CSTRING("ISO-8859-1"));  
>     encoder = do_CreateInstance(utf8fix_contractid.get(), &res);
>   }
> 

Please note this avoids the crash, but does not fix the failed utf8 encoder
object creation.

Summary: crash on utf8 locales on HP-UX → crash on utf8 locales on HP-UX
-> INT 
Assignee: asa → smontagu
Status: UNCONFIRMED → NEW
Component: Browser-General → Internationalization
Ever confirmed: true
QA Contact: asa → ylong
Why does getting the encoder for LANG=C.utf8 fail?

I see 2 issues here (not that I know a whole lot)
First is that there shoule be a check in nsWindow.cpp
if encoder is null before using it
Second why can't HP-UX find a utf8 encoder?  Isn't that standard?
(and if it isn't, then isn't there some sort of mapping we can do
 in some file so that when we ask for utf8, we just return the ISO-8859-1 thing)?

Copying Shanjian since he has had some HP-UX experience.
Notes from testing with utf8 locale on Linux 

with  "export LANG=en_US.utf8" on Linux Mozillla 14a comes up 
with the following warning:

>Gdk-WARNING **: locale not supported by Xlib, locale set to C

So on linux, mozilla is falling back to "C" locale at GDK level.
On HP-UX we never hits this GDK warning. 

Both HP-UX and Linux mozilla-bin is linked to libgdk-1.2 shared lib.

GDK has the following locale check:

gdk/gdkim.c gdk_set_locale()

   113    if (!XSupportsLocale ())
   114      {
   115        g_warning ("locale not supported by Xlib, locale set to C");
   116        setlocale (LC_ALL, "C");
   117      }

But if I try the following simple test on HP-UX with "C.utf8" and 
on linux with "en_US.utf8" both says supported locale.

#include <stdio.h>
#include <locale.h>
#include <Xm/Xm.h>
int main() 
{
    if (!XSupportsLocale ()) {
        printf( "locale not supported by Xlib, locale set to C\n");
        setlocale (LC_ALL, "C");
    }else{
        printf( "Supported locale\n");
    }
    return 0;
}





Maybe you need a |setlocale(LC_ALL, "");| call before the XSupportsLocale call.
My mistake, calling |setlocale(LC_ALL, "");| before |XSupportsLocale()|| 
does show the expected behaviour with the sample test code.

Linux + en_US.utf8 - "locale not supported by Xlib" 
HP-UX + C.utf8     - "locale supported"

Some updates to the bug.

Adding the above mentiond change, avoids the crash but affected the 
fonts in utf8 locales. The following mapping changes makes it usable.

Added these mappings to res/unixcharset.properties

# HP
locale.all.zh_CN.hp15CN=GB2312
locale.all.zh_HK.hkbig5=Big5-HKSCS
locale.all.ja_JP.utf8=UTF-8
locale.all.ko_KR.utf8=UTF-8
locale.all.zh_CN.utf8=UTF-8
locale.all.zh_TW.utf8=UTF-8
locale.all.zh_HK.utf8=UTF-8

GDK warning has little to do with this bug. It can be easily overcome by adding
aliases to /usr/X11R6/lib/X11/locale/locale.alias
Alternatively, en_US.UTF-8 can be used instead of en_US.utf8. 

As for this bug, res/unixcharset.properties is used only on Unix not fully
compliant to the POSIX standard. It's surprising that HP/UX 11.0 still doesn't
have nl_langinfo(CODESET). Or, is it because HP/UX binary  is built on an HP/UX
10.x or 9.x machine without nl_langinfo().  Enumerating all UTF-8 locales is a
bit silly and we may have to consider special-casing 'UTF-8*', 'utf8*', 'utf-8*'.

> First is that there shoule be a check in nsWindow.cpp
> if encoder is null before using it

 Yes, I think we have to. 

> Second why can't HP-UX find a utf8 encoder?  Isn't that standard?

 Actually, Mozilla doesn't rely on the OS for the UTF-8 conversion. It has its
own built-in converter (unless the native converter is enabled at the
configure-time). Therefore, it should come up fine on any system. Perhaps, it
fails on the reporter's system because somehow Mozilla 1.2 was referring to (an)
old library(ies) (misconfigured LD_LIBRARY_PATH or ....) Can you reproduce the
problem even with a fresh install of Mozilla? 
This is not a 1.2.1  specific bug, this bug exists with the latest 1.4 builds as
well.

> As for this bug, res/unixcharset.properties is used only on Unix not fully
> compliant to the POSIX standard. It's surprising that HP/UX 11.0 still doesn't
> have nl_langinfo(CODESET). Or, is it because HP/UX binary  is built on an HP/UX

HP-UX does support nl_langinfo(CODESET) which is detected correctly during the
configure stage and the builds are done with  "-DHAVE_NL_LANGINFO=1"

> Actually, Mozilla doesn't rely on the OS for the UTF-8 conversion. It has its
> own built-in converter (unless the native converter is enabled at the
> configure-time). Therefore, it should come up fine on any system. Perhaps, it
> fails on the reporter's system because somehow Mozilla 1.2 was referring to (an)
> old library(ies) (misconfigured LD_LIBRARY_PATH or ....) Can you reproduce the
> problem even with a fresh install of Mozilla? 

The  crash is reproducable consistantly on any HP-UX system, with a fresh install 
with a fresh $HOME. We did not specify to use the native converter, so it should
be using the default built-in converter.
> HP-UX does support nl_langinfo(CODESET) which is detected correctly during the
> configure stage and the builds are done with  "-DHAVE_NL_LANGINFO=1"

  If it  works properly under UTF-8 locales, the change you suggested in comment
#7  should not have made any difference at all. The fact that it did is an
indication (not a proof, though) that there's something wrong with
nl_langinfo(CODESET).

What does nl_langinfo(CODESET) return under UTF-8 locales? You can test it with
a simple test program (don't forget to call setlocale() :-))

As for the failure of UTF-8 converter creation, I have little clue. 
 
nl_langinfo(CODESET) return values:

with setlocale (LC_ALL, "")          - roman8
with setlocale (LC_ALL, "C")        - roman8
with setlocale (LC_ALL, "C.utf8") - utf8

> If it  works properly under UTF-8 locales, the change you 
> suggested in comment #7  should not have made any difference 
> at all. The fact that it did is an indication (not a proof, though) 
> that there's something wrong with nl_langinfo(CODESET).

Since the null utf8 encoder causes a crash, the workaround we did
was ( as explained in comment #1 ) when the  utf8 encoder creation
returns null, we create a new ISO-8859-1 encoder which is valid and
use it instead.

So we are forcing mozilla to use an ISO-8859-1 encoder when mozilla
should be using an utf8 encoder. This was the workaround for the crash,
but definetely not the fix for the issue, which was the null utf8 encoder.

My theory is eventhough nl_langinfo works fine on HP-UX,  the failed
utf8 encoder and the workaround ISO-8859-1 encoder were making
utf8 locale unusable, forcing us to resort to  the changes in 
comment #7, which were suggested by our l10n/i18n team




Attached patch a patchSplinter Review
The actual cause is that 'utf8' returned by nl_langinfo(CODESET) of HP/UX is
not recognized as an alias to 'UTF-8'. SUS/POSIX at least should have
standardized the name for UTF-8 to use in nl_langinfo(CODESET) and iconv(3).

Anyway, here's the patch. I didn't include 'ISO-8859-1' fallback because it's
better to make it fail in a conspicuous way (the window title bar would have
some garbages) in cases like this so that we can fix it properly.
Attachment #127981 - Flags: superreview?(blizzard)
Attachment #127981 - Flags: review?(shafalus)
Comment on attachment 127981 [details] [diff] [review]
a patch

r=smontagu.
I have never understood why we don't alias utf8 to utf-8 :-)

I only wonder if the change to widget/src/gtk/nsWindow.cpp needs to happen to
the corresponding files in other subdirectories of widget/ as well.
Attachment #127981 - Flags: review?(shafalus) → review+
Comment on attachment 127981 [details] [diff] [review]
a patch

sr=blizzard
Attachment #127981 - Flags: superreview?(blizzard) → superreview+
Fix checked in. 

Simon was right that there are a couple of more places in widget/src/gtk where
the return status of GetUnicode(En|De)coder* has to be checked. 

In a separate bug, we may have to consider switching to nsAutoPtr from
nsMemorey::Alloc/Free in widget/src/gtk
Comment on attachment 127981 [details] [diff] [review]
a patch

This is to fix crash on HP/UX (and possibly on other Unix) under UTF-8 locale.
The risk is very low on other Unix (let alone non-Unix).
Attachment #127981 - Flags: approval1.4.x?
Attachment #128049 - Flags: superreview?(blizzard)
Attachment #128049 - Flags: review?(shafalus)
Keywords: crash
Verified the fix on HP-UX, no crash on utf8 locales now .. Thank You !
I went through other directories in widget/src and found two more places where
the return value is not checked.
Attachment #128049 - Attachment is obsolete: true
Comment on attachment 128491 [details] [diff] [review]
128049 + xlib and xpwidget fixes

asking for r/sr
Attachment #128491 - Flags: superreview?(blizzard)
Attachment #128491 - Flags: review?(smontagu)
Comment on attachment 128491 [details] [diff] [review]
128049 + xlib and xpwidget fixes

r=smontagu. Maybe you can get a blanket rs for these return-value checks?
Attachment #128491 - Flags: review?(smontagu) → review+
Comment on attachment 128049 [details] [diff] [review]
a patch to fix a couple of more places in gtk

blizzard, can I get blanket rs for similar changes as smontagu suggested?
Attachment #128049 - Flags: superreview?(blizzard)
Attachment #128049 - Flags: review?(smontagu)
Attachment #128491 - Flags: superreview?(blizzard) → superreview+
fix checked into the trunk. I'm keeping this open until I get a1.4 because it's
a critical bug on HP/UX. 

Assignee: smontagu → jshin
Which is the patch you want in 1.4? Just the first one?
Although the second patch is not as much critical as the first one, it'd be nice to fill up 
all the holes we found. So, can you approve both patches for 1.4.x? Thanks 
  
Status: NEW → ASSIGNED
Flags: blocking1.5b?
Attachment #127981 - Flags: approval1.4.x? → approval1.4.x+
Attachment #128491 - Flags: approval1.4.x+
thanks for a1.4. fix checked into 1.4.x branch. marking as resolved.
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Flags: blocking1.5b?
Keywords: fixed1.4.1
Blocks: 224532
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: