Closed Bug 137239 Opened 22 years ago Closed 7 years ago

</title> is lost if no charset is specified

Categories

(Core :: Internationalization, defect)

x86
All
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME
mozilla1.2beta

People

(Reporter: kazhik, Assigned: jshin1987)

References

Details

(Keywords: intl)

</title> is lost if no charset is specified, and the page is displayed
as blank.

Testcases:
http://kazhik.net/mozilla/test/2052-1.html
http://kazhik.net/mozilla/test/2052-2.html

See these pages with character coding = Shift_JIS.

The first page is displayed as blank.

In the second file I put a blank between the title string and </title>.
But the left frame is displayed as blank. The right frame has charset
information and is displayed correctly.

Original report in Bugzilla-jp:
http://bugzilla.mozilla.gr.jp/show_bug.cgi?id=2052
Keywords: intl
QA Contact: ruixu → ylong
Hmm, I don't see any difference between the two test pages on either the 04-12
trunk or branch build.

However, when I set the charset to Shift_JIS, the page does display as blank, and
when I set it to the correct one, EUC-JP, the page (both frames) displays fine.
With other wrong charsets, e.g. ISO-8859-1, Big5, etc., the page is displayed
garbled but not blank.
In our charset converter, if the lead byte indicates a 2-byte character, the
second byte is eaten even if it is not in the valid range. In this case, the
'<' character of "</title>" was eaten inside the SJIS-to-Unicode converter.
This caused a parsing failure and led to the blank page.

Reassigning to Frank; he may have similar bugs. I remember somebody suggesting in
some bug that we should only eat one char in such a scenario.
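
The behavior described above can be illustrated with a minimal sketch. This is not the Mozilla converter; `greedy_decode` and its helpers are hypothetical names, and the lead/trail byte ranges are the commonly documented Shift_JIS ranges, used here as an assumption. The sketch always consumes two bytes after a lead byte, so a '<' that follows a stray lead byte disappears:

```python
def is_sjis_lead(b: int) -> bool:
    # Assumed Shift_JIS lead-byte ranges for two-byte characters.
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC

def is_sjis_trail(b: int) -> bool:
    # Assumed Shift_JIS trail-byte ranges; note that '<' (0x3C) is outside them.
    return 0x40 <= b <= 0x7E or 0x80 <= b <= 0xFC

def decode_or_replace(chunk: bytes) -> str:
    # Decode one character, or fall back to U+FFFD if the codec rejects it.
    try:
        return chunk.decode("shift_jis")
    except UnicodeDecodeError:
        return "\ufffd"

def greedy_decode(data: bytes) -> str:
    """Toy decoder that eats two bytes whenever it sees a lead byte."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif is_sjis_lead(b):
            pair = data[i:i + 2]
            if len(pair) == 2 and is_sjis_trail(pair[1]):
                out.append(decode_or_replace(pair))
            else:
                out.append("\ufffd")
            i += 2  # the second byte is consumed even when it is invalid
        else:
            out.append(decode_or_replace(bytes([b])))
            i += 1
    return "".join(out)

# EUC-JP bytes for a three-character Japanese title, wrapped in <title>.
# The last byte (0xEC) looks like a Shift_JIS lead byte.
page = b"<title>\xc6\xfc\xcb\xdc\xb8\xec</title>"
decoded = greedy_decode(page)
print("</title>" in decoded)  # the closing tag no longer survives intact
```

With the EUC-JP bytes read as Shift_JIS, the final 0xEC is taken as a lead byte and the '<' of the closing tag is swallowed with it, which matches the blank-page symptom in the report.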
Assignee: yokoyama → ftang
I think the reporter missed one step: you need to set your default encoding to
"Shift-JIS" first in the language prefs. If your default encoding is "ISO-8859-1",
then there is no problem.

Maybe we should do the following:
In a DBCS converter, if we hit an illegal sequence and the next byte is '<', eat
one byte; if the next byte is not '<', eat two bytes when the lead byte indicates
a two-byte sequence.
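
A rough sketch of that proposal, under the same assumptions as any toy decoder (hypothetical function names, commonly documented Shift_JIS byte ranges, not the actual Mozilla code): on an illegal sequence, only the lead byte is consumed when the next byte is '<', otherwise both bytes are consumed.

```python
def is_sjis_lead(b: int) -> bool:
    # Assumed Shift_JIS lead-byte ranges for two-byte characters.
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC

def is_sjis_trail(b: int) -> bool:
    # Assumed Shift_JIS trail-byte ranges; '<' (0x3C) is not a valid trail byte.
    return 0x40 <= b <= 0x7E or 0x80 <= b <= 0xFC

def decode_or_replace(chunk: bytes) -> str:
    # Decode one character, or fall back to U+FFFD if the codec rejects it.
    try:
        return chunk.decode("shift_jis")
    except UnicodeDecodeError:
        return "\ufffd"

def tag_preserving_decode(data: bytes) -> str:
    """Toy decoder: on an illegal pair, keep a following '<' for the parser."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif is_sjis_lead(b):
            pair = data[i:i + 2]
            if len(pair) == 2 and is_sjis_trail(pair[1]):
                out.append(decode_or_replace(pair))
                i += 2
            elif len(pair) == 2 and pair[1] == ord("<"):
                out.append("\ufffd")
                i += 1  # illegal sequence before '<': eat only the lead byte
            else:
                out.append("\ufffd")
                i += 2  # any other illegal sequence: eat both bytes
        else:
            out.append(decode_or_replace(bytes([b])))
            i += 1
    return "".join(out)

# Same EUC-JP bytes as in the report's scenario, misread as Shift_JIS.
page = b"<title>\xc6\xfc\xcb\xdc\xb8\xec</title>"
print("</title>" in tag_preserving_decode(page))
```

The title text is still garbled, since the bytes are not Shift_JIS, but the closing tag reaches the parser, so the page would render garbled rather than blank.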

Taking this bug for moz1.1beta.
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.1beta
Target Milestone: mozilla1.1beta → ---
We probably need to take care of this.
Target Milestone: --- → mozilla1.2beta
What a hack. I have not touched Mozilla code for 2 years. I haven't read these bugs
for 2 years. And they are still there. Just closing them as WONTFIX to clean up.
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → WONTFIX
Mass reassign. Please excuse the spam.
Assignee: ftang → nobody
Mass re-opening bugs Frank Tang closed on Wednesday, March 02 for no reason. All
the spam is his fault; feel free to tar and feather him.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Reassigning Frank's old bugs to Jungshik Shin for triage. Sorry for the spam.
Assignee: nobody → jshin1987
Status: REOPENED → NEW
The bug can also occur with an XML/RSS/RDF file, and it makes the file unavailable.
http://big5.xinhuanet.com/gate/big5/rss.xinhuanet.com/rss/mil.xml

XML Parsing Error: mismatched tag. Expected: </title>.
Location: http://big5.xinhuanet.com/gate/big5/rss.xinhuanet.com/rss/mil.xml
Line Number 81, Column 132:
<comments>http://comments.xinhuanet.com/comment?url=http://news.xinhuanet.com/mil/2005-08/12/content_3343121.htm</comments>
 </item>
-----------------------------------------------------------------------------------------------------------------------------------^
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
  <item>
    <title>&#32249;&#65533;&#65533;&#65533;&#65533;&#32064;&#21700;&#21847;&#25652;&#21977;&#59042;U-2&#26946;&#27196;&#65533;&#65533;&#28186;&#65088;&#30274;&#37832;&#27946;&#59131;&#26891;&#65533;&#65533;0&#37723;&#12581;&#21246;(&#32199;&#21227;&#27992;)</title>
    <link>http://news.xinhuanet.com/mil/2005-08/12/content_3344604.htm</link>

    <description><![CDATA[

&#32249;&#65533;&#65533;&#65533;&#65533;&#32064;&#21700;&#21847;U-2&#26946;&#27196;&#65533;&#65533;&#28186;&#65088;&#30274;&#37832;&#65533;&#65533;
&#32249;&#65533;&#65533;&#65533;&#65533;&#32064;&#21700;&#21847;U-2&#26946;&#27196;&#65533;&#65533;&#28186;&#65088;&#30274;&#37832;&#65533;&#65533;&#65533;&#65533;&#21342;&#65533;&#65533;&#37510;&#23111;&#27131;&#37716;&#8451;&#26570;&#65533;&#65533;&#21295;&#65533;&#65533;2005&#39467;&#65533;&#65533;&#37832;&#65533;&#65533;&#37827;&#12518;&#23012;&#38316;
&#25777;&#65533;&#65533;2005&#39467;&#65533;&#65533;&#37832;&#65533;&#65533;&#37827;&#12527;&#32029;&#28051;&#65533;&#65533;&#28774;U-2"&#27051;&#27407;&#12467;"(Dragon Lady)&#26946;&#27196;&#65533;&#65533;&#28186;&#65088;&#30274;&#37832;&#34425;
&#65533;&#65533;&#32217;&#65533;&#65533;&#65533;&#65533;&#37711;&#12516;&#65533;&#65533;&#23052;&#35569;&#31801;&#23480;&#28853;&#32143;&#37413;&#65081;&#65533;&#65533;]]></description>
    <category>&#37712;&#28055;&#31784;&#37826;&#20276;&#26888;</category>
    <author>xinhuanet@xinhua.org</author>
    <pubDate>Fri, 12 Aug 2005 08:02:41 GMT</pubDate>
   
<comments>http://comments.xinhuanet.com/comment?url=http://news.xinhuanet.com/mil/2005-08/12/content_3344604.htm</comments>
 </item>
  <item>
QA Contact: amyy → i18n
Depends on: encoding_rs
This has been fixed at some point. Probably as part of security fixes.
Status: NEW → RESOLVED
Closed: 19 years ago → 7 years ago
Resolution: --- → WORKSFORME