28500 - illegal nsString usage Style sheet loader does not work with non ISO-8859-1 style sheet.

Reporter

Description

•

25 years ago

I catch this by using the technique I stated in 28424. Basically, this mean people cannot use Japanse font face name in the .css . Japanese font face is very common in Japan/China/Korea/Taiwan/HongKong line 608 is bad because it pass string, which could be in charset other than ISO-8859-1 to nsString. 601 warren 3.27 NS_IMETHODIMP 602 valeski 3.35 SheetLoadData::OnStreamComplete(nsIStreamLoader* aLoader, 603 nsISupports* context, 604 nsresult aStatus, 605 PRUint32 stringLen, 606 const char* string) 607 peterl 3.1 { 608 warren 3.31 nsString aStyleData(string, stringLen); 609 warren 3.27 mLoader->DidLoadStyle(aLoader, aStyleData, this, aStatus); 610 611 peterl 3.1 // We added a reference when the loader was created. This 612 // release should destroy it. 613 NS_RELEASE(aLoader); 614 warren 3.27 return NS_OK; 615 peterl 3.1 } Reassign to pierre. IQA, can you prepare test cases for non ASCII face name in .css file to test against this issue ? tao, I wonder how can we make localizable css if we cannot put Japanese face name ? Is this beta1 ?

Frank Tang

Reporter

Comment 1

•

25 years ago

tao/bobj/ftang said this is probabaly not a beta1 because 1. if the css failed to find the font, gfx will still find a japanese font to display the localizable text. So it is still readable 2. There are not that many page use exteranl css to contains Japanese face name (yet). so we think this is not Beta1. However, css support is important for beta2 localization . So this is a beta2 bugs.

Keywords: beta2

bobj

Comment 2

•

25 years ago

fixed typo in summary

Summary: Style shee loader does not work with non ISO-8859-1 style sheet. → Style sheet loader does not work with non ISO-8859-1 style sheet.

Pierre Saslawsky

Comment 3

•

25 years ago

nsCSSLoader used to inherit from nsIUnicharStreamLoader which seemed to satisfy i18n but on Feb-2, valeski changed that. The CVS log says "r=warren. nsIUnicharStreamLoader is dead. Now we have a generic byte stream loader that can be used for any sort of data." Reassigned to valeski.

Assignee: pierre → valeski

Component: Style System → Networking

Judson Valeski

Comment 4

•

25 years ago

ftang, What's the bug here? we're dropping a char* (which could be anything, unicode for example) in to a nsString. nsString(const PRUnichar *) and nsString(const char *) use the *exact* same construction which sets the internal byte size to 2 bytes.

Judson Valeski

Comment 5

•

25 years ago

nsIStreamLoader loads arbitrary data, from binary to unicode to char*. It's up to the implementor to determine context, in this case we're casting everything to PRUnichar anyway. invalidating.

Status: NEW → RESOLVED

Closed: 25 years ago

Resolution: --- → INVALID

Christine Hoffman

Updated

•

25 years ago

QA Contact: chrisd → tever

Frank Tang

Reporter

Comment 6

•

25 years ago

I have to reopen this bug >we're dropping a char* (which could be anything, >unicode for example) in to a nsString. nsString(const PRUnichar *) and >nsString(const char *) use the *exact* same construction which sets the >internal byte size to 2 bytes. It is illegal to pass any data which might be not ASCII to nsString by using the char* flavor method. char* could be any byte combination, PRUnichar* is only for Unicode (not for any TWO byte process code. 0x3456 only have ONE MEANING in PRUnichar while 0xA5 may mean different thing in char* depend on it's encoding) It is only legal to use the char* flavor of nsString method if the data is 100% sure to be only in ASCII. In this case, it is illegal. This will cause stylesheet in non ISO-8859-1 convert incorrectly. The code convert from a char* to nsString should perform the correct conversion. add rickg to cc list I don't think it is hard to fix this bug- Read http://www.w3.org/TR/REC-CSS2/grammar.html stylesheet : [ CHARSET_SYM S* STRING S* ';' ]? [S|CDO|CDC]* [ import [S|CDO|CDC]* ]* [ [ ruleset | media | page | font_face ] [S|CDO|CDC]* ]* ; and also "@charset" {return CHARSET_SYM;} Therefore, we know "@charset" can only possible appeared in the first byte of the stylesheet. It mean it is easy to find the charset- 1. compare "@charset" with string - if it match, strip out one or mor S and scan through ";" - this is the charset. 2. Once you find the charset, resolve the charset alias and find a nsIUnicodeDecoder as the following 131 nsresult nsScanner::SetDocumentCharset(const nsString& aCharset , nsCharsetSource aSource) { 132 133 nsresult res = NS_OK; 134 135 if( aSource < mCharsetSource) // priority is lower the the current one , just 136 return res; 137 138 NS_WITH_SERVICE(nsICharsetAlias, calias, kCharsetAliasCID, &res); 139 NS_ASSERTION( nsnull != calias, "cannot find charset alias"); 140 nsAutoString charsetName = aCharset; 141 if( NS_SUCCEEDED(res) && (nsnull != calias)) 142 { 143 PRBool same = PR_FALSE; 144 res = calias->Equals(aCharset, mCharset, &same); 145 if(NS_SUCCEEDED(res) && same) 146 { 147 return NS_OK; // no difference, don't change it 148 } 149 // different, need to change it 150 res = calias->GetPreferred(aCharset, charsetName); 151 152 if(NS_FAILED(res) && (kCharsetUninitialized == mCharsetSource) ) 153 { 154 // failed - unknown alias , fallback to ISO-8859-1 155 charsetName = "ISO-8859-1"; 156 } 157 mCharset = charsetName; 158 mCharsetSource = aSource; 159 160 NS_WITH_SERVICE(nsICharsetConverterManager, ccm, kCharsetConverterManagerCID, &res); 161 if(NS_SUCCEEDED(res) && (nsnull != ccm)) 162 { 163 nsIUnicodeDecoder * decoder = nsnull; 164 res = ccm->GetUnicodeDecoder(&mCharset, &decoder); 165 if(NS_SUCCEEDED(res) && (nsnull != decoder)) 166 { 167 NS_IF_RELEASE(mUnicodeDecoder); 168 169 mUnicodeDecoder = decoder; 170 } 171 } 172 } 173 return res; 174 } 3. Then you get a charset converter manager service and ask for a nsIUnicodeDecoder, you then can convert the char* to PRUnichar* by using it- see 293 PRInt32 srcLength = aLen; 294 PRInt32 unicharLength = unicharBufLen; 295 res = mUnicodeDecoder->Convert(aBuffer, &srcLength, unichars, &unicharLength); 296 unichars[unicharLength]=0; //add this since the unicode converters can't be trusted to do so. 297 298 mBuffer.Append(unichars, unicharLength); PRUnichar* have very strick semantics, every value in PRUnichar is strickly defined with one human readable character. (e.g. 0x0041 mean 'A' and 0x4E00 mean Chinese character one) You CANNOT use PRUnichar as arbitrary byte combination. If you want to pass arbitrary data , use nsCString, which do not have strict meaning of the byte combination except the value 0x00 is reserved for null termination. Also your argument about "nsString(const PRUnichar *) and nsString(const char *) use the *exact* same construction which sets the internal byte size to 2 bytes." is WRONG: http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsString2.cpp 76 nsString::nsString(const char* aCString,PRInt32 aCount){ 77 nsStr::Initialize(*this,eTwoByte); 78 Assign(aCString,aCount); 79 } 87 nsString::nsString(const PRUnichar* aString,PRInt32 aCount) { 88 nsStr::Initialize(*this,eTwoByte); 89 Assign(aString,aCount); 90 } It looks like the same, but the Assign in line 78 and the Assign in line 89 is NOT exactly the same Assign. The Assign in line 78 is the const char* version which is implemented as 975 nsString& nsString::Assign(const char* aCString,PRInt32 aCount) { 976 nsStr::Truncate(*this,0); 977 if(aCString){ 978 Append(aCString,aCount); 979 } 980 return *this; 981 } 982 The Assign in line 89 is the const PRUnichar* version which is implemented as 992 nsString& nsString::Assign(const PRUnichar* aString,PRInt32 aCount) { 993 nsStr::Truncate(*this,0); 994 if(aString){ 995 Append(aString,aCount); 996 } 997 return *this; 998 } Again, they looks very similar except one is the const char* version and one is the const PRUnichar* version The const char* version of append is implemented as 1085 nsString& nsString::Append(const char* aCString,PRInt32 aCount) { 1086 if(aCString && aCount){ //if astring is null or count==0 there's nothing to do 1087 nsStr temp; 1088 Initialize(temp,eOneByte); 1089 temp.mStr=(char*)aCString; 1090 1091 if(0<aCount) { 1092 temp.mLength=aCount; 1093 1094 // If this assertion fires, the caller is probably lying about the length of 1095 // the passed-in string. File a bug on the caller. 1096 1097 #ifdef NS_DEBUG 1098 PRInt32 len=nsStr::FindChar(temp,0,PR_FALSE,0,temp.mLength); 1099 if(kNotFound<len) { 1100 NS_WARNING(kPossibleNull); 1101 } 1102 #endif 1103 1104 } 1105 else aCount=temp.mLength=nsCRT::strlen(aCString); 1106 1107 if(0<aCount) 1108 nsStr::Append(*this,temp,0,aCount); 1109 } 1110 return *this; 1111 } while the const PRUnichar* version of append is implemented in 1122 nsString& nsString::Append(const PRUnichar* aString,PRInt32 aCount) { 1123 if(aString && aCount){ //if astring is null or count==0 there's nothing to do 1124 nsStr temp; 1125 Initialize(temp,eTwoByte); 1126 temp.mUStr=(PRUnichar*)aString; 1127 1128 if(0<aCount) { 1129 temp.mLength=aCount; 1130 1131 // If this assertion fires, the caller is probably lying about the length of 1132 // the passed-in string. File a bug on the caller. 1133 #ifdef NS_DEBUG 1134 PRInt32 len=nsStr::FindChar(temp,0,PR_FALSE,0,temp.mLength); 1135 if(kNotFound<len) { 1136 NS_WARNING(kPossibleNull); 1137 } 1138 #endif 1139 1140 } 1141 else aCount=temp.mLength=nsCRT::strlen(aString); 1142 1143 if(0<aCount) 1144 nsStr::Append(*this,temp,0,aCount); 1145 } 1146 return *this; 1147 } Finally the nsStr::Append 158 void nsStr::Append(nsStr& aDest,const nsStr& aSource,PRUint32 anOffset,PRInt32 aCount){ 159 if(anOffset<aSource.mLength){ 160 PRUint32 theRealLen=(aCount<0) ? aSource.mLength : MinInt(aCount,aSource.mLength); 161 PRUint32 theLength=(anOffset+theRealLen<aSource.mLength) ? theRealLen : (aSource.mLength-anOffset); 162 if(0<theLength){ 163 164 PRBool isBigEnough=PR_TRUE; 165 if(aDest.mLength+theLength > aDest.mCapacity) { 166 isBigEnough=GrowCapacity(aDest,aDest.mLength+theLength); 167 } 168 169 if(isBigEnough) { 170 //now append new chars, starting at offset 171 (*gCopyChars[aSource.mCharSize][aDest.mCharSize])(aDest.mStr,aDest.mLength,aSour ce.mStr,anOffset,theLength); 172 173 aDest.mLength+=theLength; 174 AddNullTerminator(aDest); 175 NSSTR_SEEN(aDest); 176 } 177 } 178 } 179 } notice line 171. for the const char* version of nsString constructor, aSource.mCharSize is eOneByte and aDest.mCharSize is eTwoByte . For the const PRUnichar* version of nsString constructor, both are eTwoByte . Therefore, for the const* char version of nsString constructor, gCopyChars[0][1], which is CopyChars1To2, will be called. But for the const PRUnichar* version of nsString constructor, gCopyChars[1][1], which is CopyChars2To2, will be called.

Blocks: 28424

Status: RESOLVED → REOPENED

Resolution: INVALID → ---

Frank Tang

Reporter

Updated

•

25 years ago

Summary: Style sheet loader does not work with non ISO-8859-1 style sheet. → illegal nsString usage Style sheet loader does not work with non ISO-8859-1 style sheet.

Judson Valeski

Comment 7

•

25 years ago

over to frank. frank, if you have a solution, please apply it.

Assignee: valeski → ftang

Status: REOPENED → NEW

Frank Tang

Reporter

Comment 8

•

25 years ago

Over to pierre- module owner of style system. Module owner should take care functional issue. Pierre - I have already put down the details about how to add "@charset" support and how to remove the illegal usage of the char* flavor of nsString constrctor/method in this bug report. Please fix it. Thanks

Assignee: ftang → pierre

Frank Tang

Reporter

Comment 9

•

25 years ago

This is a style system issue, not networking issue. It is up to the style system decide how to convert char* into PRUnichar. Change to componment back to "Style System" and add erik to the CC.

Component: Networking → Style System

Pierre Saslawsky

Comment 10

•

25 years ago

ftang: You wrote "I have already put down the details about how to add "@charset" support". Is it related to bug 2870? Cowardly reassigning i18n/style issues to attinasi. On my list, they are likely to stay buried for a very long time.

Assignee: pierre → attinasi

Marc Attinasi

Assignee

Updated

•

25 years ago

Status: NEW → ASSIGNED

Frank Tang

Reporter

Comment 11

•

25 years ago

Yes, I think this is the "cause"of bug 2870.

Marc Attinasi

Assignee

Updated

•

25 years ago

Target Milestone: M20

Frank Tang

Reporter

Comment 12

•

25 years ago

attinasi- can you put down the ETA of check in the fix ? contact me if you need some help of information.

Marc Attinasi

Assignee

Updated

•

25 years ago

Target Milestone: M20 → M16

Marc Attinasi

Assignee

Updated

•

25 years ago

Blocks: 8924

Marc Attinasi

Assignee

Comment 13

•

25 years ago

The code mentioned in the original bug report has been changed to use the AssignWithConversion method on the nsString. Is this satisfactory? All of the explicit charset detection and translations happen later, when the string is turned into a stream and hooked up to a Converter in DidLoadStyle (not yet checked in), but I want to make sure the AssignWithConversion will prevent the data loss before the DidLoadStyle is called. Here is the code, as modified by scc on 4/3/2000: NS_IMETHODIMP SheetLoadData::OnStreamComplete(nsIStreamLoader* aLoader, nsISupports* context, nsresult aStatus, PRUint32 stringLen, const char* string) { nsString aStyleData; aStyleData.AssignWithConversion(string, stringLen); mLoader->DidLoadStyle(aLoader, aStyleData, this, aStatus); // We added a reference when the loader was created. This // release should destroy it. NS_RELEASE(aLoader); return NS_OK; } Many thanks, i18n experts!

leger

Updated

•

25 years ago

Keywords: nsbeta2

Marc Attinasi

Assignee

Comment 14

•

24 years ago

Fix checked in.

Status: ASSIGNED → RESOLVED

Closed: 25 years ago → 24 years ago

Keywords: beta2

Resolution: --- → FIXED

Tom Everingham

Comment 15

•

24 years ago

verified

Status: RESOLVED → VERIFIED

BMO Automation

Comment 16

•

2 years ago

Attached file [mozilla-mobile/fenix] Run CI for #28500 (#28637) — Details