Closed Bug 47154 Opened 25 years ago Closed 25 years ago

Line breaks disappear when entities present in text (newlines are not converted to spaces) (Newline characters do not translate to whitespace) ([TEXT] Certain characters make Mozilla run words together)

Categories

(Core :: Internationalization, defect, P3)

x86
All
defect

Tracking

()

VERIFIED FIXED

People

(Reporter: ian, Assigned: waterson)

References

()

Details

(4 keywords, Whiteboard: FIX IN HAND)

Attachments

(5 files)

Line breaks disappear to nothingness if we have a "—" character raw in the paragraph's text. That is character 151, which of course is invalid in ISO-8859-1 and HTML.
<snip snide comments about suck.com's web design>
TO REPRODUCE: See attached test case or visit the remarkably inaccurate and now infamous suck.com article: http://suck.com/daily/2000/07/31/daily.html
ACTUAL RESULTS: Some words mash into each other. This happens at line breaks when character code 151 (0227, 0x97) is somewhere in that element.
EXPECTED RESULTS: Well duh. We should not lose whitespace just because the paragraph contains some mystical undefined character.
TESTED ON: Windows 2000 running commercial bits 6.0.17.2000072920.
Note: This bug is proof that some good _did_ come from that article. And yes, the total useful content of that article boils down to a 2048 byte testcase...
Also exists on Linux, changing OS to all.
OS: Windows 2000 → All
Summary: Line breaks disappear when "—" present in text → Line breaks disappear when "?" present in text
Another test case: http://www.eleves.ens.fr:8080/home/madore/.test/test-spaces.html ...which uses the &mdash; entity instead of a raw 151 character, and is thus valid. Eek. Another testcase is coming up, showing this problem with &alpha;. This is therefore a VERY serious issue. It will, as far as I can tell, show up on EVERY page that includes entities, including many top100. It turns out that there are many dups of this. They will be marked shortly.
Keywords: compat4xp, top100
Summary: Line breaks disappear when "?" present in text → Line breaks disappear when entities present in text
Whiteboard: Serious legibility issue on many pages!
*** Bug 44519 has been marked as a duplicate of this bug. ***
From the about-to-be-duped bug 45367: ------- Additional Comments From ekrock@netscape.com 2000-07-27 18:14 ------- Endorsing for nsbeta3 stopper as this is a correctness of rendering bug on top100 sites that renders text unreadable (unlessyoudon'tmindreadingEnglishwithoutspaces).
Severity: minor → major
Keywords: regression
*** Bug 45367 has been marked as a duplicate of this bug. ***
This seems to depend on the entity. It works on a 2000803 build if I change &alpha to &pound. Perhaps only entities outside the current charset cause problems.
er. slightly more embarrassing is the fact that this happens on netscape's own NS6 marketing page (http://home.netscape.com/browsers/6/datasheet/index.html). Ian, maybe you could tell them how to fix it up...?
chrisn -- Apparently a regression is showing up on some of our marketing pages. FYI. Hopefully this will become [nsbeta3+] but be aware we may have site readability problems during nsbeta2.
I know of some of the most popular Russian sites where I see this problem. I think there will be a lot of criticism if such a problem is not fixed for the second beta (especially in Russia, where NS PR1 was a disaster). :( Actually this should be nsbeta2+... If it can't be fixed for nsbeta2, it should be written in the release notes in a big-big font (and in Russian too ;))) ).
Keywords: relnote
As I noted in bug 44519, this isn't an issue only with Russian sites but with all international sites as well. It should be fixed as soon as possible!
Keywords: relnote2
But look at the attachment in 47895. Is it Cyrillic only?
Summary: Line breaks disappear when entities present in text → Line breaks disappear when entities present in text (newlines are not converted to spaces)
*** Bug 47495 has been marked as a duplicate of this bug. ***
Keywords: mostfreq
Summary: Line breaks disappear when entities present in text (newlines are not converted to spaces) → Line breaks disappear when entities present in text (newlines are not converted to spaces) (Newline characters do not translate to whitespace)
*** Bug 48452 has been marked as a duplicate of this bug. ***
Document Loaded:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<html lang="en">
<head>
<title>Mozilla Tests: line breaks and perfectly valid entities</title>
</head>
<body>
<p>first^M
second &alpha;</p>
</body>
</html>
------------------------
Content model #1:
docshell=00F0A800
html@02614068 refcount=3<
  head@026163C8 refcount=2<
    title@021F25B8 refcount=2<
      Text@021F6880 refcount=2<Mozilla Tests: line breaks and perfectly valid entities>
    >
  >
  body@021F12E8 refcount=3<
    p@021A9A58 refcount=3<
      Text@021A9750 refcount=3<first\n\nsecond \u03b1>
    >
  >
>
Content model #2 (after changing &alpha to &k):
docshell=00F0A800
html@021AD928 refcount=3<
  head@021AD838 refcount=2<
    title@021DA0D8 refcount=2<
      Text@021DA060 refcount=2<Mozilla Tests: line breaks and perfectly valid entities>
    >
  >
  body@021DE838 refcount=3<
    p@02625E78 refcount=3<
      Text@02625C30 refcount=3<first\n\nsecond &k;>
    >
  >
>
Note: In both cases (refer to the content models above) the line breaks are preserved by the parser. This is definitely a layout bug. Reassigning bug to waterson.
Assignee: rickg → waterson
The problem is that nsJISx4501LineBreaker doesn't think that ASCII 0x0a ('\n') is a space: http://lxr.mozilla.org/seamonkey/source/intl/lwbrk/src/nsJISx4501LineBreaker.cpp#211
It looks like this was changed by shanjian in r1.25: http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&root=/cvsroot&subdir=mozilla/intl/lwbrk/src&command=DIFF_FRAMESET&file=nsJISx4501LineBreaker.cpp&rev2=1.25&rev1=1.24
Status: NEW → ASSIGNED
Component: Parser → Internationalization
Attached patch: proposed fix
The fix is to treat ASCII 0x0a and 0x0d ('\n' and '\r', respectively) as whitespace characters. shanjian/ftang, does this seem right?
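For readers following along, here is a minimal sketch of the idea behind the fix. It is not the attached patch and not the real nsJISx4501LineBreaker code: the names IsBreakableSpace and CollapseWhitespace are hypothetical, and the set of characters treated as whitespace (space, tab, LF, CR) is an assumption for illustration. It shows that once 0x0a and 0x0d count as whitespace, the text node from the content model above collapses to words separated by single spaces instead of running together.

```cpp
#include <string>

// Hypothetical classifier (illustration only, not Mozilla's API).
// Assumption: space, tab, LF (0x0a) and CR (0x0d) are all whitespace.
static bool IsBreakableSpace(char16_t ch) {
  return ch == u' ' || ch == u'\t' || ch == u'\n' || ch == u'\r';
}

// Hypothetical helper: collapse each run of whitespace into one space,
// roughly what normal HTML text flow is expected to do with newlines.
static std::u16string CollapseWhitespace(const std::u16string& text) {
  std::u16string out;
  bool inSpaceRun = false;
  for (char16_t ch : text) {
    if (IsBreakableSpace(ch)) {
      if (!inSpaceRun) {
        out.push_back(u' ');  // emit a single space for the whole run
      }
      inSpaceRun = true;
    } else {
      out.push_back(ch);
      inSpaceRun = false;
    }
  }
  return out;
}
```

With the text node from the dump, CollapseWhitespace(u"first\n\nsecond \u03b1") keeps "first" and "second" apart. If the classifier left out '\n' and '\r', the two newlines would be copied through unchanged rather than becoming a word separator.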
Whiteboard: Serious legibility issue on many pages! → FIX IN HAND
Target Milestone: --- → M18
*** Bug 48234 has been marked as a duplicate of this bug. ***
Summary: Line breaks disappear when entities present in text (newlines are not converted to spaces) (Newline characters do not translate to whitespace) → Line breaks disappear when entities present in text (newlines are not converted to spaces) (Newline characters do not translate to whitespace) ([TEXT] Certain characters make Mozilla run words together)
Is this a dup of 46239 ?
fix checked in
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
CC'ed myself on this.
*** Bug 48167 has been marked as a duplicate of this bug. ***
*** Bug 49492 has been marked as a duplicate of this bug. ***
*** Bug 48164 has been marked as a duplicate of this bug. ***
*** Bug 49666 has been marked as a duplicate of this bug. ***
*** Bug 49812 has been marked as a duplicate of this bug. ***
*** Bug 48828 has been marked as a duplicate of this bug. ***
verified:
2000-09-05-08-M18 : win32
2000-09-05-08-M18 : linux
2000-09-05-08-M18 : mac
Status: RESOLVED → VERIFIED
*** Bug 51553 has been marked as a duplicate of this bug. ***
*** Bug 49389 has been marked as a duplicate of this bug. ***
*** Bug 54877 has been marked as a duplicate of this bug. ***