Hang with long line of text and page break character

VERIFIED FIXED in mozilla1.9beta2

Status

()

defect
P3
critical
VERIFIED FIXED
12 years ago
6 years ago

People

(Reporter: martijn.martijn, Assigned: cpearce)

Tracking

({hang, regression, testcase})

Trunk
mozilla1.9beta2
x86
Windows XP
Points:
---
Dependency tree / graph
Bug Flags:
blocking1.9 +
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 7 obsolete attachments)

Reporter

Description

12 years ago
Posted file testcase
See testcase, which seems to hang in current trunk builds.
This seems to have regressed between 2007-06-20 and 2007-06-21, this is when the new textframe was turned on, bug 367177. So I guess it's a regression from that.

Note that the testcase has right at the end, just before the </sourcetext> tag, a page-break character. That seems to be necessary for the hang, apart from the long line of text.

Updated

12 years ago
Flags: blocking1.9?
Reporter

Comment 1

12 years ago
The testcase is still hanging, using:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a8pre) Gecko/2007090405 Minefield/3.0a8pre
This is after bug 388367 has been fixed.
Doesn't hang on Mac, Windows only?
Reporter

Comment 3

12 years ago
I guess so, it's still hanging with current trunk build on windowsXP.
Flags: blocking1.9? → blocking1.9+
Assignee

Comment 4

12 years ago
I can reproduce this on Vista. We're stuck in an infinite loop in gfxWindowsFontGroup::InitTextRunUniscribe(), gfxWindowsFonts.cpp:1594:

  while (FAILED(item->Shape())) {
      PR_LOG(gFontLog, PR_LOG_DEBUG, ("shaping failed"));
      // we know we have the glyphs to display this font already
      // so Uniscribe just doesn't know how to shape the script.
      // Render the glyphs without shaping.
      item->DisableShaping();
  }

item->Shape() is always failing, and we're stuck in a loop.

[TAKING]
Status: NEW → ASSIGNED
We should figure out why that's failing.

But I think we should also get rid of this loop here. Instead of looping, we should just disable shaping, then retry item->Shape() once. If the retry fails, we should warn and fill in the textrun with missing glyph boxes or something like that.
Component: Layout → GFX: Thebes
QA Contact: layout → thebes
Assignee: roc → chris
Status: ASSIGNED → NEW
Assignee

Comment 6

12 years ago
Ok, the reason that item->Shape() is failing, is that it's calling the Win32 Function ScriptShape(), which is returning E_INVALIDARG. It does this when the value of mMaxGlyphs which is passed into it is greater than 65535. So the solution is to change Itemize() such that it breaks up items which are longer than 65535 characters.
Status: NEW → ASSIGNED
Assignee

Comment 7

12 years ago
Patch changes Itemize() to detect uniscribe items which are too long and which would cause ScriptShape() to fail. It breaks items on cluster boundaries.
> NUM_MAX_GLYPHS

ESTIMATE_MAX_GLYPHS?

Anyway, I think there's a simpler way to do all this. Look at what gfxAtsuiFonts does. The basic idea would be to make InitTextRunUniscribe able to fill in just part of a textrun, i.e., you pass in an extra aStart parameter that tells you where to start filling in the textrun. Then you write a wrapper InitTextRun method that breaks the string up into chunks that aren't too long, and pass each chunk to InitTextRunUniscribe. gfxAtsuiFonts even has a FindTextRunSegmentLength function that tries to break the string up at whitespace boundaries, because that won't damage kerning or ligatures; you probably want to do the same. I think this approach would be simpler than messing with the itemization.

See
http://mxr.mozilla.org/seamonkey/source/gfx/thebes/src/gfxAtsuiFonts.cpp#512
Assignee

Comment 9

12 years ago
This patch follows ROC's suggestion...
Attachment #285024 - Attachment is obsolete: true

Comment 10

12 years ago
Comment on attachment 285038 [details] [diff] [review]
Patch v2 - wrap InitTextRunUniscribe()

I'm not sure I follow your ESTIMATE_MAX_GLYPHS code -- I measured a lot of content to come up with 40 as the right number for the average item length -- your code will result in those being 76

You also have mismatched SaveDC/RestoreDC calls here which could cause all sorts of problems.  I don't think you want to move the RestoreDC
Patch v2 has a problem --- the code that finds where to break up the textrun depends on cluster information, which we don't have yet (we do have it by then on Mac where this code was copied from). So we have do things differently. I think in the end patch v1 was closer to the mark, although it can be simplified a bit. Sorry for leading you down a blind alley Chris...
(In reply to comment #10)
> (From update of attachment 285038 [details] [diff] [review])
> I'm not sure I follow your ESTIMATE_MAX_GLYPHS code -- I measured a lot of
> content to come up with 40 as the right number for the average item length --
> your code will result in those being 76

Thanks Stuart, the ESTIMATE_MAX_GLYPHS is just me wrapping up a lot of (1.5*x)+16 into a macro, so I'm not changing any behaviour there. Remember 3*x/2 = 1.5*x. It's recommended here: http://www.microsoft.com/typography/developers/uniscribe/uniscribe.htm


> You also have mismatched SaveDC/RestoreDC calls here which could cause all
> sorts of problems.  I don't think you want to move the RestoreDC

Yikes, you're right, thanks for pointing that out! 

Patch v1 cleaned up a bit.
Attachment #285038 - Attachment is obsolete: true
+        if (mLength > 43680) {

ESTIMATE_MAX_GLYPHS(mLength) > 65535?

+            // Count number of items that are too long.
+            int count = TotalOversizeItems();
+            if (count > 0) {

I wouldn't bother with this check and TotalOversizeItems. Just go straight into the splitting-up loop. Don't need newNumItems either, or the SetCapacity call, it's OK to just add items to 'items' incrementally.

CopyItemSplitOversize can be simplified:

+        if (ESTIMATE_MAX_GLYPHS(itemLength) > 65535) {

I wouldn't bother with this.

Instead of computing the number of new items up front, do what gfxAtsuiFontGroup does; just repeatedly find the longest string that makes an acceptable item, add an item to aDest for that string, and then advance to the end of that string. Make sense? You don't need to bother with midpoints that way. You should do what FindTextRunSegmentLength does to prefer whitespace boundaries. In fact you can copy FindTextRunSegmentLength here, call it FindItemSegmentLength or something.
Posted patch Patch v4 - patch_v3++ (obsolete) — Splinter Review
Attachment #285168 - Attachment is obsolete: true
Posted patch Patch v5 - patch_v4++ (obsolete) — Splinter Review
Attachment #285194 - Attachment is obsolete: true

Updated

12 years ago
Priority: -- → P3
Assignee

Updated

12 years ago
Attachment #285196 - Flags: review?(pavlov)
Comment on attachment 285196 [details] [diff] [review]
Patch v5 - patch_v4++

Review cancelled... Roc pointed out an issue...
Attachment #285196 - Flags: review?(pavlov)
Posted patch Patch v6 (obsolete) — Splinter Review
Patch made more fail safe.
Attachment #285196 - Attachment is obsolete: true
Assignee

Updated

12 years ago
Attachment #285788 - Flags: review?(pavlov)

Updated

12 years ago
Target Milestone: --- → mozilla1.9 M10

Comment 19

12 years ago
Comment on attachment 285788 [details] [diff] [review]
Patch v6

+
+static PRUint32 FindNextItemStart(int aOffset, int aLimit,
+    nsTArray<SCRIPT_LOGATTR> &aLogAttr, const PRUnichar *aString)

nit: please align the params on the second line with the ones above.


+    // Try to start the next item before or after a space, since spaces
+    // don't kern or ligate.
+    PRUint32 off;
+    for (off = MAX_ITEM_LENGTH; off > 1; --off) {
+        if (aLogAttr[off].fCharStop &&
+            (aString[aOffset+off] == ' '
+            || aString[aOffset+off - 1] == ' '))
+            return aOffset+off;
+    }
+
+    // Try to start the next item at the last cluster boundary in the range.
+    for (off = MAX_ITEM_LENGTH; off > 1; --off) {
+        if (aLogAttr[off].fCharStop)
+            return aOffset+off;
+    }
+

you probably want to just cache the offset of the first fCharStop you encounter and get rid of the second loop.
Attachment #285788 - Flags: review?(pavlov) → review-
Attachment #285788 - Attachment is obsolete: true
Attachment #286739 - Flags: review?(pavlov)
This is patch v7 with the return value of the (boundary > 0) condition corrected.
Attachment #286739 - Attachment is obsolete: true
Attachment #286756 - Flags: review?(pavlov)
Attachment #286739 - Flags: review?(pavlov)

Updated

12 years ago
Attachment #286756 - Flags: review?(pavlov) → review+

Updated

12 years ago
Keywords: checkin-needed

Comment 22

12 years ago
checked in
Status: ASSIGNED → RESOLVED
Last Resolved: 12 years ago
Resolution: --- → FIXED

Updated

12 years ago
Flags: in-testsuite?
Reporter

Comment 23

12 years ago
Wonderful, no hang anymore with current trunk build, thanks Chris.

Verified fixed, using:

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a9pre) Gecko/2007110705 Minefield/3.0a9pre
Status: RESOLVED → VERIFIED
(In reply to comment #12)
> ESTIMATE_MAX_GLYPHS is just me wrapping up a lot of
> (1.5*x)+16 into a macro, [...]. It's recommended here:
> http://www.microsoft.com/typography/developers/uniscribe/uniscribe.htm

Looks like that documentation has moved.  It seems the formula is recommended for an initial guess for the  cMaxGlyphs parameter of Uniscribe's ScriptShape:

http://msdn.microsoft.com/en-us/library/dd368564%28v=VS.85%29.aspx

"Maximum number of glyphs to generate, and the length of pwOutGlyphs. A reasonable value is (1.5 * cChars + 16), but this value might be insufficient in some circumstances. For more information, see the Remarks section."

"If this function returns E_OUTOFMEMORY, the application might call
ScriptShape repeatedly, with successively larger output buffers, until a large
enough buffer is provided. The number of glyphs generated by a code point
varies according to the script and the font. For a simple script, a Unicode
code point might generate a single glyph. However, a complex script font might
construct characters from components, and thus generate several times as many
glyphs as characters. Also, there are special cases, such as invalid character
representations, in which extra glyphs are added to represent the invalid
sequence. Therefore, a reasonable guess for the size of the buffer indicated
by pwOutGlyphs is 1.5 times the length of the character buffer, plus an
additional 16 glyphs for rare cases, for example, invalid sequence
representation."
crashtest:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b3276034cc07
Flags: in-testsuite? → in-testsuite+
You need to log in before you can comment on or make changes to this bug.