Mozilla will crash after localizing some strings.

VERIFIED FIXED in mozilla0.9.8


17 years ago
7 years ago


(Reporter: mwawoczny, Assigned: alecf)


(4 keywords)

crash, hang, intl, regression

Firefox Tracking Flags

(Not tracked)



(8 attachments, 2 obsolete attachments)



17 years ago
Mozilla will crash after localizing some strings.
Reproducibility: Always :(((

Attaching 2 localized files:
pref-validation.dtd - with this file Mozilla will crash (or will use 100% CPU) after choosing Preferences->Security and Privacy->Validation
pref-validation.dtd.OK - with this file Mozilla won't crash, changed one string (I've truncated it) - validation.crl.autoupdate.title

Also same thing is happening when localizing some strings in other Mozilla components (I've also had a crash with Mail, after little change to string it is not crashing).

It's started to happen around 20011203 builds. With older builds never had a problem with localizing.

Mozilla/5.0 (Windows; U; Windows NT 5.0; pl-PL; rv:0.9.6+) Gecko/20011207

Comment 1

17 years ago
Created attachment 60891 [details]
Bad file

With this file Mozilla will crash

Comment 2

17 years ago
Created attachment 60892 [details]
Good file

With this file Mozilla won't crash

Comment 3

17 years ago
First talkback (from Mail crash):

Changing component to Browser General coz I don't know if this bug belongs to internationalization/localization.
Component: Internationalization → Browser-General

Comment 4

17 years ago
Talkback shows 
[d:\builds\seamonkey\mozilla\xpcom\io\nsUnicharInputStream.cpp, line 222] 

May be to do with bug 110531 which is checked in on Dec 3rd.
cc'ing tao,alecf

Comment 5

17 years ago
hmm. that's not good.
what is the difference between these files?  Its possible that a multi-byte utf8
character exists past a the size of the 8k buffer that is used to read the file.
The code is supposed to handle that cleanly, but maybe its not.

also, there was a 2nd modification to nsUnicharInputStream.cpp by dbaron - could
you try backing out that change, and trying again? We didn't have a sample file
to work off of.

Comment 6

17 years ago
The first (bad) appears to be in utf-8 while the second one is in iso-8859-1. 
Mozilla should be able to parse both correctly. Is there any new change to the 
class (or method): UTF8InputStream::Read? Is this a regression?


Comment 7

17 years ago
Hmmm... Both files are in UTF8, the only difference is one property:

Comment 8

17 years ago
Talkback for crash with attached files:

Comment 9

17 years ago
>The first (bad) appears to be in utf-8 while the second one is in iso-8859-1. 
I take it back; you're right.

Comment 10

17 years ago
Here is the diff:

< <!ENTITY  validation.crl.autoupdate.title            "Preferencje automatyczne
j aktualizacji listy CRL">
> <!ENTITY  validation.crl.autoupdate.title            "Preferencje automatyczne
j aktualizacji">

Can't see anything wrong.

Comment 11

17 years ago
Thats why I filled this bug - strings are properly encoded etc. but after changing "Preferencje automatycznej aktualizacji" to "Preferencje automatycznej aktualizacji listy CRL" Mozilla is crashing :( As Roy Yokoyama suggested this can be something to do with 

Comment 12

17 years ago
I've never tried using a new locale/etc. can someone catch this in a debugger
and figure out exactly what's causing the crash? We could also run this in
purify and see what it says.

Comment 13

17 years ago
alecf: i'll try once I get the debug ready.

Comment 14

17 years ago
I make Japanese Localization Pack for Mozilla 0.9.7
When Mozilla 0.9.7 on Win2000 launch as Japanese, then crash.
Is the reason of crash this bug?

I send TALKBACK. ID is TB1045238G

Comment 15

17 years ago
Seems that the combinations of the guilty strings are endless. With the Greek
0.9.7   langpack mozilla crashes with every cobination of localized and
untranslated strings I've tried so far. 


17 years ago
Ever confirmed: true
QA Contact: teruko → ruixu

Comment 16

17 years ago
Also pls, take a look at the "Help! Obstinate error probably in
messengercompose.dtd" thread on the n.p.m.l10n ng.

Comment 17

17 years ago
additional info for #14
backtrace of my own debug build (0.9.7).

#0  0x2b04acb7 in memcpy (dstpp=0x2c8f4e04, srcpp=0x8431a6c, len=2048)
    at ../sysdeps/generic/memcpy.c:55
#1  0x2ac1bce5 in UTF8InputStream::Read (this=0x82d52c8, aBuf=0x2c891008,
    aOffset=204542, aCount=1024, aReadCount=0x7fffe9ec)
    at ../../dist/include/xpcom/nsCRT.h:118
#2  0x2b3e63ca in nsExpatTokenizer::LoadStream (in=0x8417120,
    uniBuf=@0x7fffea40, retLen=@0x7fffea44) at nsExpatTokenizer.cpp:891
#3  0x2b3e668e in Tokenizer_HandleExternalEntityRef (parser=0x82256d0,
    openEntityNames=0x0, base=0x83f2af0 "c", systemId=0x83f2b4e "c",
    publicId=0x0) at nsExpatTokenizer.cpp:960
#4  0x2b4115f1 in doProlog (parser=0x82256d0, enc=0x2b4459a0, s=0x84215fa ">",
    end=0x8422ec8 "\b ", tok=17, next=0x84215fc "\n", nextPtr=0x7fffec94)
    at xmlparse.c:2324
#5  0x2b4111e1 in prologProcessor (parser=0x82256d0, s=0x8420ec8 "<",
    end=0x8422ec8 "\b ", nextPtr=0x7fffec94) at xmlparse.c:2201
#6  0x2b411192 in prologInitProcessor (parser=0x82256d0, s=0x8420ec8 "<",
    end=0x8422ec8 "\b ", nextPtr=0x7fffec94) at xmlparse.c:2190
#7  0x2b40ece9 in XML_Parse (parser=0x82256d0, s=0x8420ec8 "<", len=8192,
    isFinal=0) at xmlparse.c:893
#8  0x2b3e3b4e in nsExpatTokenizer::ParseXMLBuffer (this=0x83ff878,
    aBuffer=0x8420ec8 "<", aLength=8192, aIsFinal=0)
    at nsExpatTokenizer.cpp:510
#9  0x2b3e3cdc in nsExpatTokenizer::ConsumeToken (this=0x83ff878,
    aScanner=@0x842d728, aFlushTokens=@0x7fffeda0) at nsExpatTokenizer.cpp:561
#10 0x2b401d65 in nsParser::Tokenize (this=0x842c300, aIsFinalChunk=0)
    at nsParser.cpp:2587
#11 0x2b3ff1e3 in nsParser::ResumeParse (this=0x842c300, allowIteration=1,
    aIsFinalChunk=0) at nsParser.cpp:1845
#12 0x2b401b0a in nsParser::OnDataAvailable (this=0x842c300,
    request=0x842d9e0, aContext=0x0, pIStream=0x842de20, sourceOffset=0,
    aLength=6100) at nsParser.cpp:2469



17 years ago
Priority: -- → P2
Target Milestone: --- → mozilla0.9.8

Comment 18

17 years ago
 the aOffset=204542 in

#1  0x2ac1bce5 in UTF8InputStream::Read (this=0x82d52c8, aBuf=0x2c891008,
    aOffset=204542, aCount=1024, aReadCount=0x7fffe9ec)

looks quite large.  Is that reasonable ? 200 K??? which stream ?

It looks like alecf check in 3.27 of mozilla/xpcom/io/nsUnicharInputStream.cpp
change the unicode conversion. How many test have been done on that change ?
Do we maintain state information across buffer boundary ?
Can someone back off that code in local tree to see what will happen ?

alecf- no need to switch locale, just temporary replace the content of the
attachment to your pref-validation.dtd
I assume /security/manager/pki/resources/locale/en-US/pref-validation.dtd 

i think this is a buffer boundary problem. The file is 3993 bytes > 2048 bytes.
and I guess the boundary bytes (byte 2048) happen to be a utf8 character. (just
guessing here. need to exam)

Assignee: yokoyama → alecf

Comment 19

17 years ago
yokoyama: I think you are right, though I think the buffer size is 8192. Maybe
the utf8 conversion is simply broken. I'll reexamine this asap.

Comment 20

17 years ago
ok, 2048 in dec is 4000 in hex. 
I download the bad file (save it as aaa.txt) and do the following
Z:\>^Cod --format "x1" aaa.txt|egrep "3760|4000"
0003760 6c 65 2e 6c 61 62 65 6c 20 20 20 20 20 22 57 c5
0004000 82 c4 85 63 7a 20 61 75 74 6f 6d 61 74 79 63 7a

notice the (2k block) boundar is 0xc5 and the next character in the 2nd block is

Comment 21

17 years ago
how to reproduce this bug?
1. back up your english version pref-validation.dtd in
/security/manager/pki/resources/locale/en-US/ to pref-validation.dtd.en

3. save the "Bad file" as pref-validation.dtd.bad
4. save the "good file" as pref-validation.dtd.good
5. cp pref-validation.dtd.good pref-validation.dtd 
6. cd security/manager/pki/resources and do nmake -f
7. launch mozilla and look at preference:security:validation
it is translated, no hang
8. now cp pref-validation.dtd.bad pref-validation.dtd 
do setp 6 and 7.
it now hang.

Comment 22

17 years ago
I think it is related to block boundary issue but not sure it is 2048 bytes. 
I back off mozilla/xpcom/io/nsUnicharInputStream.cpp to r3.26 and it now won't
hang with the dtd file. 
so I am sure the problem is in cvs diff -r 3.26 -r 3.29

Comment 23

17 years ago
has any one tried with a debug build? There are a bunch of NS_ASSERTIONS and I'm
curious if they're called or not.. 

Comment 24

17 years ago
add intl,regression,nsbeta1 keyword
Keywords: intl, nsbeta1, regression

Comment 25

17 years ago
add crash,hang keyword
Keywords: crash, hang

Comment 26

17 years ago
>I think it is related to block boundary issue but not sure it is 2048 bytes. 
>I back off mozilla/xpcom/io/nsUnicharInputStream.cpp to r3.26 and it now won't
>hang with the dtd file. 
>so I am sure the problem is in cvs diff -r 3.26 -r 3.29
I take this back. I didn't complete the rebuild. Need more test.

Comment 27

17 years ago
ok CountValidUTF8Bytes have problem, I only look at the first byte of UTF8 to
decide the length, ignoring we have the whole UTF8 seq in buffer or not. for
example, the 2nd time it call when loading  
"locale/en-US/pippki/pref-validation.dtd" the return value of 
in UTF8InputStream::Fill is 0x401 even the remainder + nb is 0x400 

Comment 28

17 years ago
Created attachment 64178 [details] [diff] [review]
debug patch bark when something wrong- didn't fix the problem 

this is a FIXING patch, but a patch to put Assertion that surface problem. 
The problem is tha count valid utf8 byte code.

Comment 29

17 years ago
alecf, you should got enough information to fix it now. :) let me know if you
need more help. Thanks.

Comment 30

17 years ago
thanks for the investigation, frank. sounds like an off-by-one - maybe we're
trying to include the last incomplete utf8 character even though it extends past
the current buffer.


17 years ago
Priority: P2 → P1

Comment 31

17 years ago
I think this was dbaron's change:

previously, we would back up to the last byte, but with his change we don't.
I'm trying to work on a fix now.

Comment 32

17 years ago
can someone post a bad DTD that isn't in the security code? I'm having trouble
building security on my machine

Comment 33

17 years ago
Created attachment 64337 [details] [diff] [review]
attempted fix

I think this will do the trick. can anyone try this patch? It might be faster
than me waiting for a new DTD... frank?
Comment on attachment 64337 [details] [diff] [review]
attempted fix

If there are 0 bytes, then that would be a UMR, no?  I don't see a solution
other than double-initializing lastchar the first time around, though.

Comment 35

17 years ago
Created attachment 64353 [details] [diff] [review]
fix, avoid umr

oh, right! :)

Let's try this one.
Attachment #64337 - Attachment is obsolete: true

Comment 36

17 years ago
Hmmm.... Another bad file, now double \n\n (instead of \n) is crashing Mozilla, probably this is the same problem (buffer), but who knows... I've also noticed that in *.properties files string after comments "--> \n" was also crashing Mozilla, I've had to change all "--> \n" to "-->\n" (without space) - but can't send example - I forgot in which *.properties file was it :(

Comment 37

17 years ago
Created attachment 64367 [details]
Bad file with \n\n instead of \n

Comment 38

17 years ago
which file is this?

Comment 39

17 years ago
actually, I highly doubt \n's are related to this bug.

Comment 40

17 years ago
Another bug?
File: "Bad file with \n\n instead of \n"

Comment 41

17 years ago
Created attachment 64370 [details]
This is another crasher: editor/editorOverlay.dtd

Comment 42

17 years ago
Created attachment 64371 [details]
Here it is navigator/navigator.dtd happily crashing moz. If you need more, you just need to ask.


17 years ago
Attachment #64370 - Attachment mime type: application/octet-stream → text/plain

Comment 43

17 years ago
ok, so its turning out to be more of a pain than I thought it would be. can you
attach a bad it's the first file loaded via UTF8InputStream

Comment 44

17 years ago
Created attachment 64394 [details] [diff] [review]
one more try, update mByteDataOffset correctly

I really don't understand how the old code worked at all in this case, unless
our buffers were always big enough.. in any case, I needed to correctly update
mByteDataOffset by setting it to srcLen, because we're always starting with a
freshly loaded bytebuffer. That also means setting it above is useless.

With this, I am able to load the navigator.dtd that was posted here.
Attachment #64353 - Attachment is obsolete: true

Comment 46

17 years ago
Comment on attachment 64394 [details] [diff] [review]
one more try, update mByteDataOffset correctly

Attachment #64394 - Flags: superreview+

Comment 47

17 years ago
ok, fix just landed. tomorrow's build should have the fix. 

Comment 48

17 years ago
I think this was broken in 0.9.7 though, so if people are doing localized 0.9.7
builds, then I think this needs to be fixed on that branch... localizers, is
this the case? this patch should apply cleanly on 0.9.7

in the meantime I'll mark this fixed since it's been fixed on the trunk
Last Resolved: 17 years ago
Resolution: --- → FIXED
Reopening, since alecf backed it out since it caused Linux to crash on startup
(CSS-parsing related).

The following part of the checkin was fine, and could be checked back in (you
still have r=dbaron):

Index: nsUnicharInputStream.cpp
RCS file: /cvsroot/mozilla/xpcom/io/nsUnicharInputStream.cpp,v
retrieving revision 3.29
retrieving revision 3.30
diff -u -d -r3.29 -r3.30
--- nsUnicharInputStream.cpp	5 Dec 2001 03:49:13 -0000	3.29
+++ nsUnicharInputStream.cpp	11 Jan 2002 20:34:11 -0000	3.30
@@ -274,8 +275,10 @@
   const char *c = aBuffer;
   const char *end = aBuffer + aMaxBytes;
+  const char *lastchar = c;     // pre-initialize in case of 0-length buffer
   while (c < end && *c) {
+    lastchar = c;
     if (UTF8traits::isASCII(*c))
     else if (UTF8traits::is2byte(*c))
@@ -293,6 +296,8 @@
       break; // Otherwise we go into an infinite loop.  But what happens now?
+  if (c > end)
+    c = lastchar;
   return c - aBuffer;

The part of the checkin that caused Linux to crash on startup was:

Index: nsUnicharInputStream.cpp
RCS file: /cvsroot/mozilla/xpcom/io/nsUnicharInputStream.cpp,v
retrieving revision 3.29
retrieving revision 3.30
diff -u -d -r3.29 -r3.30
--- nsUnicharInputStream.cpp	5 Dec 2001 03:49:13 -0000	3.29
+++ nsUnicharInputStream.cpp	11 Jan 2002 20:34:11 -0000	3.30
@@ -233,7 +233,7 @@
   NS_ASSERTION(mByteData->GetLength() >= mByteDataOffset, "unsigned madness");
   PRUint32 remainder = mByteData->GetLength() - mByteDataOffset;
-  mByteDataOffset = remainder;
   PRInt32 nb = mByteData->Fill(aErrorCode, mInput, remainder);
   if (nb <= 0) {
     // Because we assume a many to one conversion, the lingering data
@@ -247,6 +247,7 @@
   // Now convert as much of the byte buffer to unicode as possible
   PRInt32 srcLen = CountValidUTF8Bytes(mByteData->GetBuffer(),remainder + nb);
+  NS_ASSERTION( (remainder+nb >= srcLen), "cannot be longer than out buffer");
@@ -264,7 +265,7 @@
   mUnicharDataOffset = 0;
   mUnicharDataLength = dstLen;
-  mByteDataOffset += srcLen;
+  mByteDataOffset = srcLen;
   return dstLen;
Resolution: FIXED → ---

Comment 50

17 years ago
-  mByteDataOffset = remainder;

The trouble is actually with the above line.

In the case that |nb| is 0, we'll return early. In the old code, this means
mByteDataOffset was changed (always to 0, I suspect, but can't assert just yet),
in the new code it won't be changed, hence the assertions and problems.

If you just leave that line in, everything works just fine, though I think you
could change this to |mByteDataOffset = 0;| inside the |if (nb <= 0)| block.

Comment 51

17 years ago
ack! I'll just leave the line in. I think I may have removed it just before
checkin because it seemed useless. dumb me.

Comment 52

17 years ago
ok, checked in the patch without that line.
marking fixed again :)
Last Resolved: 17 years ago17 years ago
Resolution: --- → FIXED

Comment 53

17 years ago
It's response for #48:
If possible I want to apply this patch to 0.9.7 branch. Because, in Japanese
localize, it is most .dtd and the properties file since it is necessary to
encode as UTF-8.

Comment 54

17 years ago
Would also be a great pain to create a release ?
It would be nice to get a simpler way for the user to understand if his
version contains this bug.
In the language packs download page we could just state "this language pack
works only with".

Comment 55

17 years ago
Created attachment 64678 [details]
Russian navigator/

Attached the russian (maybe a little late).
It *doesn't* crash my 0.9.7 mozilla. On my quick tests I've not found a which suffice to crash the app. It might just be a chance

Even if it's not crashing the app, I've found it makes disappear the default
message on the status bar (the loading completion message) and making d&d on
the home personal toolbar button, the confirmation dialog text for the home
change is missing.

Sidenote: -probably off topic- The nsconv.exe util shipped with Moz097Win32
work. It returns: "Cannot get Character Converter Manager 8000ffff"

Comment 56

17 years ago
cc asa for 0.9.7 patch options, now that 0.9.7 is actually out the door..
Asa: Basically we need to take this patch in order to release 0.9.7 in any
non-ASCII-based language.

Comment 57

17 years ago
if we're collecting fixes for a 0.9.7 interim release don't forget the upload 
form bug.  but i think eol'ing 0.9.7 in favor of 0.9.8 might be more expedient. 
[this comment composed w/o reference to a dictionary; feel free to school me]

Comment 58

17 years ago
Verified with trunk 2002-01-17, cannot repro it.

Comment 59

17 years ago
*** Bug 119758 has been marked as a duplicate of this bug. ***
Product: Browser → Seamonkey
You need to log in before you can comment on or make changes to this bug.