Closed Bug 249159 Opened 21 years ago Closed 13 years ago

implement 'word-break' properties of CSS3

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla15

Tracking Flags:

Tracking

Status

blocking-fennec1.0

---

People

(Reporter: jshin1987, Assigned: m_kato)

References

(Blocks 2 open bugs,
URL
)

Details

(Keywords: css3, dev-doc-complete, intl, Whiteboard: [parity-IE][parity-webkit])

Attachments

(7 files, 12 obsolete files)

patch v0 16 years ago Makoto Kato [:m_kato] 25.29 KB, patch		Details \| Diff \| Splinter Review
patch v1 16 years ago Makoto Kato [:m_kato] 25.60 KB, patch		Details \| Diff \| Splinter Review
patch v2 16 years ago Makoto Kato [:m_kato] 25.44 KB, patch	smontagu : review-	Details \| Diff \| Splinter Review
patch v2.1 16 years ago Makoto Kato [:m_kato] 26.07 KB, patch		Details \| Diff \| Splinter Review
patch v2.2 16 years ago Makoto Kato [:m_kato] 26.01 KB, patch		Details \| Diff \| Splinter Review
patch v2.3 15 years ago Makoto Kato [:m_kato] 26.74 KB, patch		Details \| Diff \| Splinter Review
patch v3 15 years ago Makoto Kato [:m_kato] 26.75 KB, patch		Details \| Diff \| Splinter Review
patch v4 15 years ago Makoto Kato [:m_kato] 42.70 KB, patch	dbaron : review+	Details \| Diff \| Splinter Review
Part 1 - Add word-break interface to nsILineBreaker 14 years ago Makoto Kato [:m_kato] 13.85 KB, patch	smontagu : review+	Details \| Diff \| Splinter Review
Part 2 - Implement word-break property 14 years ago Makoto Kato [:m_kato] 20.35 KB, patch		Details \| Diff \| Splinter Review
Part 2. Implement word-break property v2 14 years ago Makoto Kato [:m_kato] 19.45 KB, patch		Details \| Diff \| Splinter Review
Part 1 - Add word-break interface to nsILineBreaker v2 13 years ago Makoto Kato [:m_kato] 13.06 KB, patch	smontagu : review+	Details \| Diff \| Splinter Review
Part 2 - Implement word-break property v3 13 years ago Makoto Kato [:m_kato] 19.47 KB, patch		Details \| Diff \| Splinter Review
Some more reftests 13 years ago Simon Montagu :smontagu 7.03 KB, patch		Details \| Diff \| Splinter Review
Even more reftests 13 years ago Simon Montagu :smontagu 10.49 KB, patch	jfkthame : review+	Details \| Diff \| Splinter Review
Part 3 -- fix for mixed 8-bit and 16-bit text 13 years ago Simon Montagu :smontagu 1.09 KB, patch	jfkthame : review+	Details \| Diff \| Splinter Review
part 4 - harfbuzz shaper should not modify character-clustering info due to diacritic within ligature 13 years ago Jonathan Kew [:jfkthame] 2.57 KB, patch	smontagu : review+	Details \| Diff \| Splinter Review
modification to wordbreak-7 test 13 years ago Jonathan Kew [:jfkthame] 7.65 KB, patch	smontagu : review+	Details \| Diff \| Splinter Review
roll-up patch for Aurora 13 years ago Jonathan Kew [:jfkthame] 49.37 KB, patch	jfkthame : review+ jpr : approval-mozilla-aurora-	Details \| Diff \| Splinter Review

Jungshik Shin

Reporter

Description

•

21 years ago

http://www.w3.org/TR/2003/CR-css3-text-20030514/#wordbreak-props CSS3 text module defines word-break (word-break-cjk and word-break-inside) properties, but we haven't implemented them yet. Implementing them should be a part of overhauling/improving our line-breaking routines.

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 1

•

21 years ago

Is there a good reason for implementing these? Why would anyone want word-break-cjk? Isn't the default value "the right way" and everything else "the wrong way"? And I don't think we're about to implement dictionary-based hyphenation. I vote WONTFIX (just like most of css3-text).

Jungshik Shin

Reporter

Comment 2

•

21 years ago

In CJK typography, 'the wrong' way ('break-all') is sometimes (actually pretty often) used for a short chunk of Latin text. A Korean user asked for it (writing that it's possible in MS IE) at mozilla.or.kr and I filed this bug. As for the dictionary-based hyphenating, it's needless to say it's a lot of work and it won't happen anytime soon, but it doesn't have to be 'all or nothing', does it?

Anne (:annevk)

Updated

•

21 years ago

Blocks: css-text-3

Keywords: css3

Jungshik Shin

Reporter

Updated

•

20 years ago

Blocks: line-breaking

Xiaoyu Huang

Comment 3

•

20 years ago

It's really a bug. Though it's a small problem but there're many small problems like this in mozilla for CJK users. So they use IE not mozilla. I vote fix it.

fantasai

Comment 4

•

20 years ago

The word-breaking section of CSS3 Text is unstable and the definitions therein may be changed. Fixing this bug right now is therefore discouraged.

Jungshik Shin

Reporter

Comment 5

•

20 years ago

Don't worry :-) We don't have any resource to work on this even if we think we have to/want to. There are a lot more important bugs related to line breaking (see bug 206152) than this that we need to fix.

Thangaraj

Comment 6

•

18 years ago

Is there any Way or Workaround to do this in Firefox????

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Updated

•

18 years ago

Assignee: dbaron → nobody

QA Contact: ian → style-system

Arthit Suriyawongkul

Updated

•

17 years ago

Depends on: grapheme-breaker

Syophone

Comment 8

•

17 years ago

(In reply to comment #6) > Is there any Way or Workaround to do this in Firefox???? > NO,as bug 99457.

Makoto Kato [:m_kato]

Assignee

Comment 9

•

16 years ago

I look this. At latest draft (http://www.w3.org/TR/2007/WD-css3-text-20070306/), word-break-cjk and word-break-inside is removed.

Assignee: nobody → m_kato

URL: http://www.w3.org/TR/css3-text/#word-...

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 10

•

16 years ago

Before implementing, you should ask fantasai what is stable.

fantasai

Comment 11

•

16 years ago

The Editor's Draft is here: http://dev.w3.org/csswg/css3-text/#word-break The functionality described there should be stable, but the syntax is not. Microsoft has already implemented word-break: normal | break-all | keep-all without a vendor extension, so those values should be safe. The other values should be implemented with a -moz- prefix for now.

Makoto Kato [:m_kato]

Assignee

Comment 12

•

16 years ago

Attached patch patch v0 (obsolete) — Details — Splinter Review

Now I'm still testing this patch. After testing, I will send code-review.

Makoto Kato [:m_kato]

Assignee

Comment 13

•

16 years ago

Attached patch patch v1 (obsolete) — Details — Splinter Review

patch with test case

Attachment #338626 - Attachment is obsolete: true

Makoto Kato [:m_kato]

Assignee

Comment 14

•

16 years ago

fantasai, Do you have a good test case for this?

fantasai

Comment 15

•

16 years ago

No, I don't. If you make yours match the CSS Test Suite guidelines, though, I'd love to add them to the W3C's collection. :) http://wiki.csswg.org/test/css2.1/format

Makoto Kato [:m_kato]

Assignee

Comment 16

•

16 years ago

Attached patch patch v2 (obsolete) — Details — Splinter Review

Attachment #344881 - Attachment is obsolete: true

Makoto Kato [:m_kato]

Assignee

Updated

•

16 years ago

Attachment #355280 - Flags: review?(smontagu)

Masayuki Nakano [:masayuki] (he/him)(JST, +0900)

Comment 17

•

16 years ago

Kato-san, I think that you need to add word-break property to property-database.js too. http://mxr.mozilla.org/mozilla-central/source/layout/style/test/property_database.js Thank you for your work!

Simon Montagu :smontagu

Updated

•

16 years ago

Attachment #355280 - Flags: review?(smontagu) → review-

Simon Montagu :smontagu

Comment 18

•

16 years ago

Comment on attachment 355280 [details] [diff] [review] patch v2 I'm not qualified to review changes to style system, but I'm fairly sure that there are some parts missing here (apart from what Masayuki-san already mentioned). You should complete all the steps in https://developer.mozilla.org/En/Adding_a_new_style_property and make sure that mochitests pass in layout/style.

Makoto Kato [:m_kato]

Assignee

Comment 19

•

16 years ago

Attached patch patch v2.1 (obsolete) — Details — Splinter Review

Attachment #355280 - Attachment is obsolete: true

Makoto Kato [:m_kato]

Assignee

Comment 20

•

16 years ago

Comment on attachment 360925 [details] [diff] [review] patch v2.1 pass all test

Attachment #360925 - Flags: review?(smontagu)

Masayuki Nakano [:masayuki] (he/him)(JST, +0900)

Comment 21

•

16 years ago

Simon, would you continue to review?

Makoto Kato [:m_kato]

Assignee

Comment 22

•

16 years ago

Attached patch patch v2.2 (obsolete) — Details — Splinter Review

Attachment #360925 - Attachment is obsolete: true

Attachment #386789 - Flags: review?(smontagu)

Attachment #360925 - Flags: review?(smontagu)

Updated

•

16 years ago

Flags: wanted1.9.2?

Flags: blocking1.9.2?

Robert O'Callahan (:roc) (email my personal email if necessary)

Updated

•

15 years ago

Flags: wanted1.9.2?

Flags: wanted1.9.2-

Flags: blocking1.9.2?

Flags: blocking1.9.2-

Makoto Kato [:m_kato]

Assignee

Updated

•

15 years ago

Attachment #386789 - Flags: review?(smontagu)

Makoto Kato [:m_kato]

Assignee

Comment 24

•

15 years ago

Attached patch patch v2.3 (obsolete) — Details — Splinter Review

Update with latest. If you have no time to review this, please let me know.

Attachment #386789 - Attachment is obsolete: true

Attachment #412820 - Flags: review?(smontagu)

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 25

•

15 years ago

Comment on attachment 412820 [details] [diff] [review] patch v2.3 >diff --git a/layout/style/nsCSSPropList.h b/layout/style/nsCSSPropList.h > CSS_PROP_TEXT( >+ word-break, >+ word_break, >+ WordBreak, >+ 0, >+ Text, >+ mWordBreak, >+ eCSSType_Value, >+ kWordbreakKTable, >+ CSS_PROP_NO_OFFSET, >+ eStyleAnimType_None) You should make the last two lines offsetof(nsStyleText, mWordBreak) and eStyleAnimType_EnumU8. Then nsStyleAnimation will be able to animate the property (even if nothing takes advantage of it yet).

Makoto Kato [:m_kato]

Assignee

Comment 26

•

15 years ago

Attached patch patch v3 (obsolete) — Details — Splinter Review

Attachment #412820 - Attachment is obsolete: true

Attachment #412820 - Flags: review?(smontagu)

Makoto Kato [:m_kato]

Assignee

Comment 27

•

15 years ago

change title because current spec no longer uses word-break-cjk and word-break-inside. Also, WebKit has implemented "word-break: break-all", and IE has implemented "word-break: break-all" and "keep-all".

Summary: implement 'word-break' (word-break-cjk, word-break-inside) properties of CSS3 → implement 'word-break' properties of CSS3

Makoto Kato [:m_kato]

Assignee

Comment 28

•

15 years ago

Comment on attachment 415098 [details] [diff] [review] patch v3 modified by dbaron's comment.

Attachment #415098 - Flags: review?(smontagu)

Comment 30

•

15 years ago

How is progress on this?

Makoto Kato [:m_kato]

Assignee

Comment 31

•

15 years ago

(In reply to comment #30) > How is progress on this? Simon is no reply for review.

Simon Montagu :smontagu

Comment 32

•

15 years ago

Comment on attachment 415098 [details] [diff] [review] patch v3 >diff --git a/content/base/src/nsLineBreaker.cpp b/content/base/src>@@ -81,11 +83,13 @@ nsLineBreaker::FlushCurrentWord() > nsTArray<PRPackedBool> capitalizationState; > > if (!mCurrentWordContainsComplexChar) { >- // Just set everything internal to "no break"! >- memset(breakState.Elements(), PR_FALSE, length*sizeof(PRPackedBool)); >+ // Just set everything internal to "no break" except to break-strict! "For break-strict set everything internal to "break", otherwise to "no break" >@@ -417,3 +425,31 @@ nsLineBreaker::Reset(PRBool* aTrailingBr >+nsresult >+nsLineBreaker::SetBreakMode(PRUint8 aMode) >+{ >+ switch (aMode) >+ { ... >+ default: >+ return NS_ERROR_FAILURE; don't we want to default to "normal"? >diff --git a/intl/lwbrk/src/nsJISx4501LineBreaker.cpp b/intl/lwbrk >+const PRUint8 gLoose30[] = Can you document which characters this contains and why? Is it correct that it doesn't include the "small katakana" letters from U+31F0-U+31FF? In the test later for half-width katakana, what about U+FF70 HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK, which seems to be equivalent to U+30FC? >+static PRBool >+IS_LOOSE_CHAR(PRUnichar aChar) >+{ >+ PRUnichar hi = aChar & 0xff00; >+ PRUnichar lo = aChar & 0xff; >+ >+ if (hi == 0x3000) >+ { >+ return (gLoose30[lo / 8] >> (7 - (lo % 8))) & 0x01; >+ } >+ else if (hi == 0xff00) Don't do else after return >+// A helper function of word-break property >+static void >+ShouldBreakByBreakMode(PRUnichar aChar, PRUint8 aMode, PRBool *aAllow) >+{ >+ // Check word-break style >+ if (aMode == nsILineBreaker::kBreakMode_Normal) >+ return; >+ >+ if (IS_CJK_CHAR(aChar)) >+ { >+ if (aMode & nsILineBreaker::kBreakMode_CJK_Keep) >+ *aAllow = PR_FALSE; >+ else if (aMode & nsILineBreaker::kBreakMode_CJK_Loose && IS_LOOSE_CHAR(aChar)) >+ *aAllow = PR_TRUE; >+ } else { >+ if (aMode & nsILineBreaker::kBreakMode_BreakStrict) >+ *aAllow = PR_TRUE; >+ } >+} More documentation here would be good. I'm not sure what is happening in the cases not covered by the conditions. For future iterations, can you separate the style changes and the linebreaker changes into separate patches, and request review from dbaron for the style and from me for the linebreaker.

Makoto Kato [:m_kato]

Assignee

Comment 33

•

15 years ago

Attached patch patch v4 (obsolete) — Details — Splinter Review

Attachment #415098 - Attachment is obsolete: true

Attachment #415098 - Flags: review?(smontagu)

Makoto Kato [:m_kato]

Assignee

Updated

•

15 years ago

Attachment #453321 - Flags: review?(dbaron)

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 34

•

15 years ago

Comment on attachment 453321 [details] [diff] [review] patch v4 nsStyleConsts.h: >+// See nsStaleText s/Stale/Style/ >diff --git a/layout/html/tests/css/word-break.css b/layout/html/tests/css/word-break.css Don't add this file; it doesn't do anything. Would it make more sense for nsILineBreaker::SetBreakMode (and nsLineBreaker::mBreakMode) to be called SetWordBreak or SetWordBreakMode, since the property is called word-break? Did you check that all of the reftests fit within the 800x1000 tested area? (You can check by changing them to != tests and looking at the image.) In nsCSSProps, kWordbreakKTable should be kWordBreakKTable (with a capital B), to match surrounding style. If you don't mind, could you also make the same change to kWordwrapKTable -> kWordWrapKTable. In nsComputedDOMStyle.cpp, you don't need the >+ if (text->mWordBreak != NS_STYLE_WORDBREAK_NORMAL) { ... >+ } else { >+ val->SetIdent(eCSSKeyword_normal); >+ } (You only need the code where I put "...".) Have you tested that the implementation of the three unprefixed values is compatible with what IE implements? With those changes, r=dbaron on the dom/ and layout/ changes. smontagu should review the content/ and intl/ and layout/reftests/ changes.

Attachment #453321 - Flags: review?(smontagu)

Attachment #453321 - Flags: review?(dbaron)

Attachment #453321 - Flags: review+

j.j.

Updated

•

15 years ago

Keywords: dev-doc-needed

mani

Comment 35

•

14 years ago

can i know the status of this fix ? its very much needed for proper display of lengthy words. Except FF, all the other major browsers like IE, safari, chrome are supporting this word-break feature.

Makoto Kato [:m_kato]

Assignee

Comment 36

•

14 years ago

(In reply to comment #35) > can i know the status of this fix ? its very much needed for proper display of > lengthy words. Except FF, all the other major browsers like IE, safari, chrome > are supporting this word-break feature. This isn't blocker for 4.0 Also, since editor draft for work-break is modified, I will create new fix after post 4.0 tree is opened.

Makoto Kato [:m_kato]

Assignee

Comment 37

•

14 years ago

Attached patch Part 1 - Add word-break interface to nsILineBreaker (obsolete) — Details — Splinter Review

Attachment #453321 - Attachment is obsolete: true

Attachment #522332 - Flags: review?(smontagu)

Attachment #453321 - Flags: review?(smontagu)

Makoto Kato [:m_kato]

Assignee

Comment 38

•

14 years ago

Attached patch Part 2 - Implement word-break property (obsolete) — Details — Splinter Review

Makoto Kato [:m_kato]

Assignee

Updated

•

14 years ago

URL: http://www.w3.org/TR/css3-text/#word-... → http://dev.w3.org/csswg/css3-text/#wo...

Whiteboard: [parity-IE]

Simon Montagu :smontagu

Updated

•

14 years ago

Attachment #522332 - Flags: review?(smontagu) → review+

Makoto Kato [:m_kato]

Assignee

Comment 39

•

14 years ago

Attached patch Part 2. Implement word-break property v2 (obsolete) — Details — Splinter Review

Attachment #522333 - Attachment is obsolete: true

Makoto Kato [:m_kato]

Assignee

Updated

•

14 years ago

Attachment #540007 - Attachment is patch: true

Makoto Kato [:m_kato]

Assignee

Comment 40

•

14 years ago

(In reply to comment #34) > Have you tested that the implementation of the three unprefixed values > is compatible with what IE implements? IE9 supports both break-all and keep-all. About break-all, this is same implementation. But IE9's keep-all implementation is strange. Example, U+3001 is break character even if keep-all. (U+3002 doesn't break even if keep-all.) WebKit supports break-all. It is same implementation.

Makoto Kato [:m_kato]

Assignee

Comment 41

•

13 years ago

Attached patch Part 1 - Add word-break interface to nsILineBreaker v2 — Details — Splinter Review

rebase with latest

Attachment #522332 - Attachment is obsolete: true

Attachment #555997 - Flags: review?(smontagu)

Makoto Kato [:m_kato]

Assignee

Comment 42

•

13 years ago

Attached patch Part 2 - Implement word-break property v3 — Details — Splinter Review

rebase with the latest

Attachment #540007 - Attachment is obsolete: true

Attachment #555998 - Flags: review?(dbaron)

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 43

•

13 years ago

Comment on attachment 555998 [details] [diff] [review] Part 2 - Implement word-break property v3 Why are you requesting review again? I granted review in comment 34 (conditional on changes described there). If this is different in some way that it needs to be reviewed again, please explain how.

Attachment #555998 - Flags: review?(dbaron)

Mounir Lamouri (:mounir)

Updated

•

13 years ago

Whiteboard: [parity-IE] → [parity-IE][parity-webkit]

Mounir Lamouri (:mounir)

Updated

•

13 years ago

Blocks: ringmark-ring0

Jet Villegas (inactive)

Comment 44

•

13 years ago

smontagu: please review the r? items and land this patch.

Kang-Hao (Kenny) Lu [:kennyluck]

Comment 45

•

13 years ago

(In reply to Jet Villegas (:jet) from comment #44) > smontagu: please review the r? items and land this patch. To be clear, it seems what's left is the content/ and intl/ and layout/reftests/ parts in Part 2 that should be reviewed by smontagu. (Part 1 already got a r+ and Part 2 got partial r+ from Comment 34)

Status: NEW → ASSIGNED

Jet Villegas (inactive)

Comment 46

•

13 years ago

These patches have been updated and are winding their way through the try servers now...

Simon Montagu :smontagu

Comment 47

•

13 years ago

Attached patch Some more reftests (obsolete) — Details — Splinter Review

Attachment #620239 - Flags: review?(jfkthame)

Simon Montagu :smontagu

Comment 48

•

13 years ago

Attached patch Even more reftests — Details — Splinter Review

Attachment #620239 - Attachment is obsolete: true

Attachment #620311 - Flags: review?(jfkthame)

Attachment #620239 - Flags: review?(jfkthame)

Jonathan Kew [:jfkthame]

Comment 49

•

13 years ago

Comment on attachment 620311 [details] [diff] [review] Even more reftests Review of attachment 620311 [details] [diff] [review]: ----------------------------------------------------------------- The reftest files should all have <!DOCTYPE html> at the top, I think. ::: layout/reftests/text/wordbreak-7-ref.html @@ +12,5 @@ > + </style> > + <title>Test - word-break: break-all with rtl and diacritics</title> > + </head> > + <body> > + <div>وَ<br/>لَا<br/>ا<br/>ﻟ<br/>ﻀَّ<br/>ﺎ<br/>ﻟِّ<br/>ﻴ<br/>ﻦَ</div><br/> <br/>لَا<br/> This implies that we don't break between characters that form a ligature (lam-alef). That seems surprising; although it'll look bad, I'd expect this to be a legal break position. What about in Latin? Do we break between "f" and "i" in "fire" (in a typical serif font with standard ligatures)?

Simon Montagu :smontagu

Comment 50

•

13 years ago

(In reply to Jonathan Kew (:jfkthame) from comment #49) > This implies that we don't break between characters that form a ligature > (lam-alef). That seems surprising; although it'll look bad, I'd expect this > to be a legal break position. FWIW, IE doesn't break a lam-alef ligature, but Webkit does. Chrome on Windows (but not on Mac) also breaks between base letters and diacritics, which I'm sure is wrong; between Devanagari consonants and matras; and between every character in NFD Hangul. The last two I suppose are permissible by the letter of the spec, but (IMNSHO) that just means that the spec should be changed. > What about in Latin? Do we break between "f" > and "i" in "fire" (in a typical serif font with standard ligatures)? Answered on IRC -- summarizing here for the record: We don't (although we do exhibit behaviour similar to bug 479829)

Simon Montagu :smontagu

Comment 51

•

13 years ago

Sorry, we *do* break between "f" and "i" in "fire"

Jonathan Kew [:jfkthame]

Comment 52

•

13 years ago

(In reply to Simon Montagu from comment #50) > (In reply to Jonathan Kew (:jfkthame) from comment #49) > > This implies that we don't break between characters that form a ligature > > (lam-alef). That seems surprising; although it'll look bad, I'd expect this > > to be a legal break position. > > FWIW, IE doesn't break a lam-alef ligature, but Webkit does. Chrome on > Windows (but not on Mac) also breaks between base letters and diacritics, > which I'm sure is wrong; Indeed. > between Devanagari consonants and matras; and > between every character in NFD Hangul. The last two I suppose are > permissible by the letter of the spec, but (IMNSHO) that just means that the > spec should be changed. I don't think the Devanagari case is permissible per spec, because the matras are not "letters" as defined in CSS3-Text (characters with GC=L* or N*); they're combining marks (GC=Mc or Mn). The Hangul NFD breaks would be allowed, presumably, though they shouldn't be.

Jonathan Kew [:jfkthame]

Comment 53

•

13 years ago

The patches here result in an unwanted dependency on diacritics. In this example: data:text/html;charset=utf-8,<div style="width:0px; word-break:break-all">لَام السلَام the lam-alef ligatures (which have a diacritic on the lam) do not break; however, in data:text/html;charset=utf-8,<div style="width:0px; word-break:break-all">لام السلام where the diacritics are omitted, a break is allowed between the lam and alef.

Jonathan Kew [:jfkthame]

Comment 54

•

13 years ago

Another oddity, mentioned on IRC but adding it here so we don't lose it... data:text/html;charset=utf-8,<div style="width:0px; word-break:break-all">fire f̣ire The first "fire" (without the added combining mark under "f") doesn't break at all.

Jonathan Kew [:jfkthame]

Comment 55

•

13 years ago

And one more example: data:text/html;charset=utf-8,<div style="width:0px; word-break:break-all">لاَم (i.e. with a diacritic on the alef of the lam-alef ligature). In this case, the lam-alef stays together but then we break before the diacritic.

Jonathan Kew [:jfkthame]

Comment 56

•

13 years ago

data:text/html;charset=utf-8,<div style="width:0px; word-break:break-all">fi̥re (where the font has an "fi" ligature, but not an "i̥" composite).... renders the entire fi-ligature plus ̥ diacritic on the second line, with a blank line before it.

Simon Montagu :smontagu

Comment 57

•

13 years ago

Attached patch Part 3 -- fix for mixed 8-bit and 16-bit text — Details — Splinter Review

This fixes the issue described in commment 54

Attachment #620689 - Flags: review?(jfkthame)

Jonathan Kew [:jfkthame]

Comment 58

•

13 years ago

Here's another unexpected interaction - compare: data:text/html;charset=utf-8,<div style="width:0px; word-break:break-all">ab̥c (which works as expected) with: data:text/html;charset=utf-8,<div style="text-transform:lowercase; width:0px; word-break:break-all">ab̥c In the latter case, the ̥ gets split onto its own line.

Jonathan Kew [:jfkthame]

Updated

•

13 years ago

Attachment #620689 - Flags: review?(jfkthame) → review+

Jet Villegas (inactive)

Comment 59

•

13 years ago

Nominating for Fennec tracking...

blocking-fennec1.0: --- → ?

Jonathan Kew [:jfkthame]

Comment 60

•

13 years ago

Comment on attachment 620311 [details] [diff] [review] Even more reftests Given the odd behavior of the Arabic lam-alef ligature (whether we break depends on the presence/absence of a diacritic), I think we should drop that testcase for now. If we don't figure out and fix the behavior there before landing (which I don't think is critical), let's have a followup bug to deal with it. So r=me for the tests with #7 removed. And please add <!DOCTYPE html> to them, just because it's the Right Thing to avoid quirks mode, although I don't think it affects how they behave.

Attachment #620311 - Flags: review?(jfkthame) → review+

Jonathan Kew [:jfkthame]

Comment 61

•

13 years ago

I figured out the source of the issue described in comment #53; this is not an issue with the word-break patches, but a result of how our harfbuzz glue deals with character/glyph associations. In the simple lam-alef case, we have two characters that result in a single glyph, which is associated with the first character position; the second has no glyph, but is marked as a ligature continuation. But each character is still considered a valid cluster-start and so a break can happen (and we then draw half the ligature on each line, which is ugly, but that's a separate issue...) However, in the lam-fatha-alef case, we have three characters that map to two glyphs, but the first glyph is associated with the character offsets 0 and 2, and the second glyph is associated with the character offset 1. In gfxHarfBuzzShaper, we see the discontiguous character range for the first glyph, and extend our cluster to "fill in" the gap, so that we end up with a cluster that has three characters and two glyphs, both stored in a DetailedGlyphs list at the first character position, with the fatha and alef both marked as continuations of the cluster and ligature. It's not clear to me that gfxHarfBuzzShaper actually needs to modify the cluster information in this case, however. I think this was originally done so that Indic examples involving reordering (e.g. Devanagari short-i) would behave properly (if हि were not treated as a single cluster, the user would appear to be able to select the individual characters, but the actual underlying selection would not match the apparent highlighting). However, I don't think clustering based on detection of reordering in gfxHarfBuzzShaper is in fact needed for that, because the Indic vowel matras are already category M*, and so will be cluster-extending characters regardless of reordering; the standard SetupClusterBoundaries should have handled this adequately. Therefore, I think we may be able to remove this reordering-detection and resulting cluster-formation from gfxHarfBuzzShaper, and that will resolve the discrepancy in break-all behavior between lam-alef and lam-fatha-alef. (The result will be that in both cases, a break before the alef can occur, and they'll look pretty bad. But that's just the existing problem of line-breaking within ligatures, as in bug 479829; in certain cases, we need to re-shape at line breaks.) I'm experimenting with a patch to fix the clustering issue here, and will post it if it doesn't seem to cause other problems.

Jonathan Kew [:jfkthame]

Comment 62

•

13 years ago

Attached patch part 4 - harfbuzz shaper should not modify character-clustering info due to diacritic within ligature — Details — Splinter Review

This fixes the lam-fatha-alef clustering issue, so that the line-breaking behavior is not affected by the presence of diacritics. It will make the wordbreak-7 testcase fail, but as already noted, I think that test is wrong and should be dropped for now. We can't just "fix" the reference there because of the need to re-shape when line breaks occur within ligatures (which we don't support). (You could have an Arabic testcase that doesn't include lam-alef ligatures, I suppose; that might be worthwhile for now, so that we've at least got RTL test coverage here.)

Attachment #621070 - Flags: review?(smontagu)

Jonathan Kew [:jfkthame]

Updated

•

13 years ago

Attachment #621070 - Attachment is patch: true

JP Rosevear [:jpr]

Comment 63

•

13 years ago

Jet, we need a better rationale for why this is a blocker. Please renom with that.

blocking-fennec1.0: ? → +

Jonathan Kew [:jfkthame]

Comment 64

•

13 years ago

Attached patch modification to wordbreak-7 test — Details — Splinter Review

This modifies the wordbreak-7 test from "Even more reftests", splitting it into two testcases - 7a, without any lam-alef ligatures, which passes, and 7b that includes ligatures and is marked as known-fail due to bug 479829, but has the appropriate reference rendering in preparation for the day when we address that issue.

Attachment #621272 - Flags: review?(smontagu)

Simon Montagu :smontagu

Updated

•

13 years ago

Attachment #621070 - Flags: review?(smontagu) → review+

Simon Montagu :smontagu

Updated

•

13 years ago

Attachment #621272 - Flags: review?(smontagu) → review+

Simon Montagu :smontagu

Comment 65

•

13 years ago

Comment on attachment 555997 [details] [diff] [review] Part 1 - Add word-break interface to nsILineBreaker v2 Review of attachment 555997 [details] [diff] [review]: ----------------------------------------------------------------- r=me with a few changes. I'll check in the whole patchset from this bug including the changes. ::: content/base/public/nsLineBreaker.h @@ +205,5 @@ > + * Set word-break mode for linebreaker. This is set by word-break property. > + * @param aMode is nsILineBreaker::kBreakMode_* value. > + */ > + void SetBreakMode(PRUint8 aMode) { mBreakMode = aMode; } > + As dbaron already suggested, s/BreakMode/WordBreak/ throughout ::: content/base/src/nsLineBreaker.cpp @@ +87,5 @@ > nsTArray<PRPackedBool> capitalizationState; > > if (!mCurrentWordContainsComplexChar) { > + // For break-strict set everything internal to "break", otherwise to > + // just set everything internal to "no break"! // For break-strict set everything internal to "break", otherwise // to "no break"! @@ +407,5 @@ > PRBool isBreakableSpace = isSpace && !(aFlags & BREAK_SUPPRESS_INSIDE); > > if (aSink) { > + // Consider word-break style. Since the break position of CJK scripts > + // will set by nsILineBreaker, we don't consider CJK at this point. "will be set" ::: intl/lwbrk/public/nsILineBreaker.h @@ +47,4 @@ > #define NS_ILINEBREAKER_IID \ > { 0x5ae68851, 0xd9a3, 0x49fd, \ > { 0x93, 0x88, 0x58, 0x58, 0x6d, 0xad, 0x80, 0x44 } } > rev the interface. ::: intl/lwbrk/src/nsJISx4501LineBreaker.cpp @@ +553,5 @@ > +// as JIS X4051. > +// If break rule is changed by mode, set it to aAllow > + > +static PRBool > +ShouldBreakByBreakMode(PRUint8 aMode, PRBool aAllow) The helper function isn't necessary -- see below. @@ +877,1 @@ > PRBool allowBreak; This can be done more simply without the helper function, like this: - bool allowBreak; + bool allowBreak = false; if (cur > 0) { NS_ASSERTION(CLASS_COMPLEX != lastClass || CLASS_COMPLEX != cl, "Loop should have prevented adjacent complex chars here"); - if (state.UseConservativeBreaking()) - allowBreak = GetPairConservative(lastClass, cl); - else - allowBreak = GetPair(lastClass, cl); - } else { - allowBreak = false; + if (aWordBreak == nsILineBreaker::kWordBreak_Normal) { + allowBreak = (state.UseConservativeBreaking()) ? + GetPairConservative(lastClass, cl) : GetPair(lastClass, cl); + } else if (aWordBreak == nsILineBreaker::kWordBreak_BreakAll) { + allowBreak = true; + } @@ +905,5 @@ > + // We have to consider word-break value again for complext characters > + if (aBreakMode != nsILineBreaker::kBreakMode_Normal) { > + // Respect word-break property > + for (PRUint32 i = cur; i < end; i++) > + aBreakBefore[i] = ShouldBreakByBreakMode(aBreakMode, aBreakBefore[i]); This too can be simplified to aBreakBefore[i] = (aWordBreak == nsILineBreaker::kWordBreak_BreakAll) since we are already inside if (aBreakMode != nsILineBreaker::kBreakMode_Normal) {} @@ +944,3 @@ > } > > PRBool allowBreak; this can be simplified in the same way as the parallel method above

Attachment #555997 - Flags: review?(smontagu) → review+

Simon Montagu :smontagu

Comment 66

•

13 years ago

Comment on attachment 555998 [details] [diff] [review] Part 2 - Implement word-break property v3 Review of attachment 555998 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/base/src/nsLineBreaker.cpp @@ +430,1 @@ > breakState.Elements() + wordStart); I moved this hunk into the first patch so that each patch builds independently ::: layout/generic/nsTextFrameThebes.cpp @@ +1769,5 @@ > + if (wordBreak != NS_STYLE_WORDBREAK_NORMAL) { > + PRUint8 breakRule; > + switch (wordBreak) { > + case NS_STYLE_WORDBREAK_BREAK_ALL: > + breakRule = nsILineBreaker::kBreakMode_BreakAll; s/BreakMode/WordBreak/ in this patch too ::: layout/reftests/text/wordbreak-1-ref.html @@ +1,1 @@ > +<html> I added <!DOCTYPE html> to these tests per comment 60 ::: layout/style/nsCSSProps.cpp @@ +1343,1 @@ > const PRInt32 nsCSSProps::kWordwrapKTable[] = { I capitalized WordWrap as well per comment 34

Simon Montagu :smontagu

Comment 67

•

13 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/4acd1e285cc4 https://hg.mozilla.org/integration/mozilla-inbound/rev/2998b1105a11 https://hg.mozilla.org/integration/mozilla-inbound/rev/78c508bfee65 https://hg.mozilla.org/integration/mozilla-inbound/rev/7faad86fbada https://hg.mozilla.org/integration/mozilla-inbound/rev/863fe7093e30 https://hg.mozilla.org/integration/mozilla-inbound/rev/1dbfff7cf0ab

Flags: in-testsuite+

Target Milestone: --- → mozilla15

Ed Morley [:emorley]

Comment 68

•

13 years ago

https://hg.mozilla.org/mozilla-central/rev/4acd1e285cc4 https://hg.mozilla.org/mozilla-central/rev/2998b1105a11 https://hg.mozilla.org/mozilla-central/rev/78c508bfee65 https://hg.mozilla.org/mozilla-central/rev/7faad86fbada https://hg.mozilla.org/mozilla-central/rev/863fe7093e30 https://hg.mozilla.org/mozilla-central/rev/1dbfff7cf0ab

Status: ASSIGNED → RESOLVED

Closed: 13 years ago

Resolution: --- → FIXED

Joe Drew (not getting mail)

Comment 69

•

13 years ago

This is a fennec 1.0 blocker; can you either nom the relevant patches, or find some other way of fixing it?

Jet Villegas (inactive)

Comment 70

•

13 years ago

I don't think we'll have any other way to fix this, but to take the entire patch queue. Fennec Triagers: this bug implements a missing feature and is plenty of code. I recommend we let it into Fennec to avoid shipping with bug 748249.

Jean-Yves Perrier [:teoli]

Comment 71

•

13 years ago

Just curious, why is this a blocker? The comments on this bug asking for blocking and granting the block doesn't explain why. This is of interest for documenting the feature: it may help us find practical examples of use.

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 72

•

13 years ago

Because it's our only failure in Ring 0 of the Ringmark test suite, which has gotten a bit of attention lately.

Jonathan Kew [:jfkthame]

Comment 73

•

13 years ago

Attached patch roll-up patch for Aurora — Details — Splinter Review

This is a roll-up of the 6 patches landed above, rebased to current Aurora tip. (Just needed minor context change in nsStyleStruct.) Carrying forward r+ from the individual patches. I'm currently building aurora tip + this patch locally to check that it all transplanted ok, but don't expect any problems.

Attachment #623622 - Flags: review+

Jonathan Kew [:jfkthame]

Comment 74

•

13 years ago

Comment on attachment 623622 [details] [diff] [review] roll-up patch for Aurora [Approval Request Comment] Regression caused by (bug #): new feature wanted for fennec User impact if declined: fail to complete Ringmark ring 0 Testing completed (on m-c, etc.): landed with unit tests Risk to taking this patch (and alternatives if risky): Although the patch is quite big, the behavior change is pretty localized and should not affect pages that don't actually use the word-break feature, so I think the risk is reasonably low. If the patches regressed existing line-breaking behavior, unit tests on m-c would likely have detected this already. String changes made by this patch: none

Attachment #623622 - Flags: approval-mozilla-aurora?

JP Rosevear [:jpr]

Comment 75

•

13 years ago

This was accidently blocking +'ed by me last week. This patch affects desktop to, there is no need to rush this through into aurora, unless this fixes other blockers.

blocking-fennec1.0: + → -

JP Rosevear [:jpr]

Updated

•

13 years ago

Attachment #623622 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora-

Jean-Yves Perrier [:teoli]

Comment 76

•

13 years ago

Kato-san created: https://developer-new.mozilla.org/en-US/docs/CSS/word-break and https://developer-new.mozilla.org/en-US/docs/Firefox_15_for_developers is up-to-date. We are done here. (Slightly bettered on Kuma, soon on the regular MDN site)

Keywords: dev-doc-needed → dev-doc-complete

Daniel Holbert [:dholbert]

Updated

•

12 years ago

Depends on: 780680

Tooru Fujisawa [:arai]

Updated

•

8 years ago

Updated

•

8 years ago

Depends on: 1296042

You need to log in before you can comment on or make changes to this bug.