Closed Bug 217749 Opened 21 years ago Closed 20 years ago

a text is broken at a boundary of the end tag

Categories

(Core :: Layout: Text and Fonts, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: hsaito54, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: testcase, Whiteboard: [line-breaking])

Attachments

(6 files, 2 obsolete files)

A following text is broken at a boundary of the end tag of the 'I' element, if
the dimension of the 'I' element is larger than the width of the window.

<body>
<i>aaaaaaaaaaaaaaaaaaaaaaaaaaaaa</i>bbbbbbbbbb
</body>

The following patch may be able to solve this problem by skipping to count up
the number of frames of the leading whitespace, since the function of
nsLineLayout::LineIsBreakable() may return PR_FALSE and set the value of
aTextData.mCanBreakBefore of nsTextFrame::MeasureText() set to PR_FALSE.

--- ./layout/html/base/src/nsLineLayout.cpp2	Sat Jul  5 21:23:27 2003
+++ ./layout/html/base/src/nsLineLayout.cpp	Sat Aug 30 02:06:36 2003
@@ -1619,7 +1619,10 @@
   }
 
   // Count the number of frames on the line...
-  mTotalPlacedFrames++;
+  // Skip over the leading whitespace
+  if (psd->mX) {
+    mTotalPlacedFrames++;
+  }
   if (psd->mX != psd->mLeftEdge || pfd->mBounds.x != psd->mLeftEdge) {
     // As soon as a frame placed on the line advances an X coordinate
     // of any span we can no longer place a floater on the line.
Attached file testcase
You're saying that it's not a line break opportunity, right? I'm not sure of
that....
OS: Windows XP → All
Hardware: PC → All
for the following test case, the line should not be broken in the second case.

<body>
<i>aaaaaaaaaaaaaaaaaaaaaaaaaaaaa</i>bbbbbbbbbb
</body>

|<----- width of the window ----->|
aaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbb

|<- width of the window ->|
aaaaaaaaaaaaaaaaaaaaaaaaaaaaa
bbbbbbbbbb
That is what I'm not sure of. It seems to me that switching to <i> could be
regarded as a line break opportunity even in Latin script. Actually, I guess the
test case is not realistic in that it's hard to image someone switching to <i>
in the middle of a word in Latin-based text.
This is part of a DNA sequence with a highlighted part (this could be for
example the result page of a search, kind of what Google does in its cached
pages).
I have no problem imagining several other applications where styling
part of word would be useful and where it MUST NOT change line-breaking.

From http://www.w3.org/TR/css3-text/#line-breaking
---------
In documents written in Latin-based languages, where runs of characters make up
words and words are separated by spaces or hyphens, line breaking is relatively
simple. In the most general case, (assuming no hyphenation dictionary is
available to the UA), a line break can occur only at white space characters or
hyphens, including U+00AD SOFT HYPHEN.
---------

IMO, the current behaviour is clearly a bug.
Keywords: testcase
I'm familiar with what you quoted as well as with UAX #14 (Line breaking
properties). That's why I wrote 'even in Latin script' and 'could be'. Anyway,
your testcase  convinced me that we should not break lines there. It also
reminded me of cases where style-change is used in the middle of an English word
(not DNA sequence :-))
Please refer to the function of nsTextFrame::MeasureText() at the file of
./layout/html/base/src/nsTextFrame.cpp.

For the following sequence of program, in the case of seeing " A B", the value
of firstWordDone is set to PR_TRUE after read 'A', however the value of
firstWordDone should be set to PR_TRUE after next whitespace is read since the
leading whitespace is skipped over.

nsTextFrame::MeasureText()
{
  PRBool firstThing = PR_TRUE;

  for (;;firstThing = PR_FALSE) {

    bp2 = aTx.GetNextWord();

    if (!aTextData.mCanBreakBefore && !firstThing && !isWhitespace) {
      firstWordDone = PR_TRUE;
    }

    if (isWhitespace) {
    } else {
      if (aTextData.mWrapping && firstThing && (firstChar == ' ')) {
        textStartsWithNBSP = PR_TRUE;
      }

I think that this sequence of program is able to change as following.

nsTextFrame::MeasureText()
{
  PRInt32 firstThing = 0;

  for (;;) {

    bp2 = aTx.GetNextWord();

    if (!aTextData.mCanBreakBefore && firstThing >= 1 && !isWhitespace) {
      firstWordDone = PR_TRUE;
    }

    if (isWhitespace) {
    }
    else {
      firstThing++;

      if (aTextData.mWrapping && firstThing > 1 && (firstChar == ' ')) {
        textStartsWithNBSP = PR_TRUE;
      }
The changes of comment #8 is able to solve the problem of Bug 187899(table cells
overlap if wrapped before "&nbsp;").
The current behavior is definitely a bug.
Whiteboard: [line-breaking]
Attached patch patch (obsolete) — Splinter Review
The patch has two changes since a leading white space is skipped.
For the testcase2, firstThing is set to false in case of the firstChar is not
white space. The modification previents a leading white space setting
aTextData.mCanBreakBefore to true.
For the testcase, the modification previents adding the frame of a leading
white space to the number of frames on the line.
Attached file another testcase
Attached patch patch (obsolete) — Splinter Review
The patch has two parts, the changes for nsLineLayout.cpp and for
nsTextFrame.cpp. The former changes are for testcase and testcase #2, the
latter changes are for another testcase. I think that the both changes skip
over the leading whitespace.
Attachment #132674 - Attachment is obsolete: true
(In reply to comment #12)
> Created an attachment (id=160822)
> another testcase

This testcase is for windows only, because it is a condition that
measureTextRuns boolen is true.
Attached patch patchSplinter Review
As follows, there are two kinds of problems. It causes a line to break that a
leading whitespace sets true to aTextData.mCanBreakBefore.

Type1:
  <html><body>
  <i>aaaaaaaaaa</i>bbbbbbbbbb
  </body>

  first frame has a leading whitespace only.
  second frame has 'aaaaaaaaaa'.

  The number of frames is counted, although the leading whitespace is skipped.

Type2:
  <html><body>
  bbbbbbbbbb<i>aaaaaaaaaa</i>
  </body>

  first frame has a leading whitespace and 'bbbbbbbbbb'.
  second frame has 'aaaaaaaaaa'.

  The firstThing boolean is set to true, although the leading whitespace is
skipped.

In the both cases, the aTextData.mCanBreakBefore boolean is set to true.
If aTextData.mX is greater than maxWidth, the multiple word width can not be
computed.
Please refer to the following flow of nsTextFrame::MeasureText.

  if (!lineLayout.InWord()) {

    if (endsInWhitespace) {

    }
    else if ((aTextData.mOffset == contentLength) && (prevOffset >= 0)) {

      if (!aTextData.mWrapping) {
	aTextData.mCanBreakBefore = PR_FALSE;
      }

      if (!aTextData.mCanBreakBefore || (aTextData.mX <= maxWidth)) { <====
here!!!

	nsIFrame* next = lineLayout.FindNextText(aPresContext, this);
	if (nsnull != next) {
Attachment #160826 - Attachment is obsolete: true
Attachment #161495 - Flags: review?(rbs)
I don't think changing |firstThing| is what is intended. You are turning
|firstThing| into meaning |firstWord|.

-------------

There is a two-way dependency between |aTextData.mCanBreakBefore| and
|firstWordDone|.

    // We need to set aTextData.mCanBreakBefore to true after 1st word. But we
can't set 
    // aTextData.mCanBreakBefore without seeing the 2nd word. That's because
this frame 
    // may only contain part of one word, the other part is in next frame. 
    // we don't care if first word is whitespace, that will be addressed later. 
    if (!aTextData.mCanBreakBefore && !firstThing && !isWhitespace) {
      firstWordDone = PR_TRUE;
    }
[...]
      if (firstWordDone) {
        // The first word has been processed, and 2nd word is seen 
        // we can set it be breakable here after.
         aTextData.mCanBreakBefore = PR_TRUE;
      }


Getting this two-way dependency right has been the source of some troubles, but
the dependency seems unavoidable. So suppose that the logic is bug-free for a
moment.

--------------

What intrigues me is what MeasureText() is doing in the |if| you mentioned:
   if (!lineLayout.InWord()) {

With the fragment bbbbbbbbbb<i>aaaaaaaaaa</i>, why is the code reaching that
|if|, given that lineLayout should be InWord when processing <i>aaaaaaaaaa</i>.
With the fragment bbbbbbbbbb<i>aaaaaaaaaa</i>, I indicate the value of the each
boolean for the following code.

-start--------------------
  PRBool firstThing = PR_TRUE;

  for (;;firstThing = PR_FALSE) {

    bp2 = aTx.GetNextWord(

    if (!aTextData.mCanBreakBefore && !firstThing && !isWhitespace) {
      firstWordDone = PR_TRUE;
    }

    if (bp2) {
      if (firstWordDone) {
        // The first word has been processed, and 2nd word is seen 
        // we can set it be breakable here after.
         aTextData.mCanBreakBefore = PR_TRUE;
      }
-end-----------------------

    *bp2        firstThing   firstWordDone  aTextData.mCanBreakBefore
 white space      true           false              false
 bbbbbbbbbb       false          true               true

Although the first white space is skipped, when second processing bbbbbbbbbb,
both boolean firstWordDone and aTextData.mCanBreakBefore are set to true.
Then, the fragment of bbbbbbbbbb can not measure with <i>aaaaaaaaaa</i>.
Comment on attachment 161495 [details] [diff] [review]
patch

r=rbs

I wonder about |emptyFrame| vs. an "empty line". But this looks good enough to
me, and can be refined further if it is discovered that it doesn't fix all the
cases.
Attachment #161495 - Flags: review?(rbs) → review+
Attachment #161495 - Flags: superreview?(rbs)
Comment on attachment 161495 [details] [diff] [review]
patch

sr=rbs

At checkin time, it might be worth updating the comment to "number of non-empty
frames"

   // Count the number of frames on the line...
-  mTotalPlacedFrames++;
+  if (!emptyFrame) {
+    mTotalPlacedFrames++;
+  }
Attachment #161495 - Flags: superreview?(rbs) → superreview+
I corrected the comment. Could you please check into the trunk?
Checked in.
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: