Open Bug 535485 Opened 15 years ago Updated 2 years ago

Line break character in text should not be rendered as a space in Japanese and other languages that don't use spaces

Categories

(Core :: Layout: Block and Inline, defect)

UNCONFIRMED

People

(Reporter: s-yukikura, Unassigned)

Details

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.3a1pre) Gecko/20091216 Minefield/3.7a1pre
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.3a1pre) Gecko/20091216 Minefield/3.7a1pre

Consider the following code:

=================

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="ja"><head>
  
  <meta content="text/html; charset=UTF-8" http-equiv="content-type">
  <title>日本語</title>

  
</head><body>
文書のレイアウトを決める際、テキストの入る部分はべた塗りや記号にするよりは実際の出来上がりに近いフォントによる文章を入れた方が完成時の姿を想像し
やすい。しかし一方で、文章が入ると文書全体のデザインよりも文章の内容の方に意識が集中してしまう。そこで欧米などの出版業界やデザイン業界ではタイポ
グラフィやレイアウトにプレゼンテーションの焦点を当てるため、意味の全くない文字の羅列をテキスト部分に流し込む。
</body></html>

=================

Because of the line breaks in the source, Firefox renders spaces where there should be none (Japanese is written without spaces between words).



Reproducible: Always

Steps to Reproduce:
1. View the above code in Firefox.

Actual Results:
想像しやすい is displayed as 想像し やすい
タイポグラフィ is displayed as タイポ グラフィ

Expected Results:  
Solution:
Firefox should check the language of the document, declared either in the html tag or in a meta tag. Many languages are naturally written without spaces (e.g. Chinese, Japanese, Korean), so a line break should not be rendered as a space in them. However, this could be a problem when the document also contains 'foreign' text, such as English text in between.
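
The concern about 'foreign' text suggests deciding per line break rather than per document. Below is a minimal sketch of that idea in Python. It is not Gecko's implementation and not taken from any spec text quoted in this bug; the helper names `collapse_line_breaks` and `_is_cjk_wide` are hypothetical. It assumes that a line break should be dropped only when the characters on both sides are East Asian wide characters (as reported by the standard `unicodedata.east_asian_width`), and that every other line break becomes a single space. Hangul is left out on purpose, as an assumption of this sketch, because modern Korean is written with spaces between words.

```python
import re
import unicodedata

# East Asian Width classes treated as "no space needed" in this sketch:
# F (fullwidth), W (wide), H (halfwidth). This choice is an assumption.
_WIDE_CLASSES = {"F", "W", "H"}

# A run of line breaks together with the spaces/tabs around them.
_BREAK_RUN = re.compile(r"[ \t\r]*(?:\n[ \t\r]*)+")


def _is_cjk_wide(ch: str) -> bool:
    """Return True for wide East Asian characters, excluding Hangul."""
    if unicodedata.east_asian_width(ch) not in _WIDE_CLASSES:
        return False
    # Hypothetical rule: Hangul keeps the space because Korean uses spaces.
    return "HANGUL" not in unicodedata.name(ch, "")


def collapse_line_breaks(text: str) -> str:
    """Replace each line-break run with "" or " " based on its neighbours."""

    def replace(match: re.Match) -> str:
        before = text[match.start() - 1] if match.start() > 0 else ""
        after = text[match.end()] if match.end() < len(text) else ""
        if before and after and _is_cjk_wide(before) and _is_cjk_wide(after):
            return ""  # wide characters on both sides: drop the break
        return " "     # anything else: ordinary white-space collapsing

    return _BREAK_RUN.sub(replace, text)


if __name__ == "__main__":
    print(collapse_line_breaks("想像し\nやすい"))
    print(collapse_line_breaks("some English\ntext"))
```

With this rule, the two broken words from the test case would be joined again, while a line break between Latin words, or between Japanese and Latin text, would still turn into a space.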

IE does not seem to have this problem, though.

The file was created by Kompozer, by the way. Remove the line breaks and the text displays fine. But I don't think that Kompozer is doing anything wrong there.
Attached file Test case
The code in the description.
Component: General → HTML: Parser
Product: Firefox → Core
QA Contact: general → parser
The parser works according to the spec here, so this doesn't belong in the parser.

Maybe there is a layout-level control for collapsing the spaces. However, the default behavior, in which white-space collapsing in layout does not depend on the character class of the adjacent characters, seems reasonable.
Component: HTML: Parser → Layout: Block and Inline
QA Contact: parser → layout.block-and-inline
http://www.w3.org/TR/2003/CR-css3-text-20030514/#linefeed-treatment described a mechanism for this, but it has been removed in newer drafts. I'm not sure what the current thinking is on fixing this in the specs.
Are you guys sure the problem is in the Core? I tried sending myself the test case as an attachment with Thunderbird, and it displays fine there.
Now affecting Thunderbird as well, for emails written from within the application.

On the other hand, if I attach the test case and send it to myself in Thunderbird, it shows both an unwanted space and a line feed.

Why is this bug still unconfirmed?
I filed a separate bug for Thunderbird: bug 704441.

By the way, there is a workaround in Kompozer to create HTML with no line breaks:
http://sourceforge.net/tracker/?func=detail&atid=853122&aid=1831943&group_id=170132
Severity: normal → S3