Closed Bug 585975 Opened 14 years ago Closed 6 years ago

Crash in UniscribeItem::Shape @ ClientData::GetOtlTable on Windows XP SP2

Categories

(Core :: Graphics, defect, P3)

x86
Windows XP
defect

Tracking

RESOLVED WONTFIX
Tracking Status
firefox16 - affected
blocking2.0 --- -

People

(Reporter: chofmann, Unassigned)

Details

(Keywords: crash, regression, reproducible, Whiteboard: [gfx-noted])

Crash Data

This has been around at low volume on previous releases, but volume on this signature looks to have increased in the Firefox 4.0 betas. It appears to be all on Windows XP.

currently ranked #40 overall and #34 in non-plugin crashes

checking --- ClientData::GetOtlTable.long 20100809-crashdata.csv
found in: 4.0b2 3.6.8 4.0b1 3.0b1 3.6.6 3.0b4 4.0b3pre
release     total-crashes   ClientData::GetOtlTable.long crashes   pct.
all         286214          51                                     0.000178188
4.0b2       15087           32                                     0.00212103
3.6.8       178205          7                                      3.92806e-05
4.0b1       2672            4                                      0.00149701
3.0b1       429             3                                      0.00699301
3.6.6       13699           2                                      0.000145996
3.0b4       187             2                                      0.0106952
4.0b3pre    649             1                                      0.00154083
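
For reference, the pct. column is the signature's crash count divided by the release's total crash count (a fraction, not a percentage). A minimal Python sketch of that arithmetic, using a few rows copied from the table; the function names and the choice of 3.6.8 as a baseline are illustrative assumptions, not part of the original report script:

```python
# Rows copied from the table above: release -> (total crashes, GetOtlTable crashes).
ROWS = {
    "all": (286214, 51),
    "4.0b2": (15087, 32),
    "3.6.8": (178205, 7),
    "4.0b1": (2672, 4),
}


def crash_fraction(total: int, signature_crashes: int) -> float:
    """Fraction of a release's crashes that carry this signature."""
    if total <= 0:
        raise ValueError("total must be positive")
    return signature_crashes / total


def relative_rate(release: str, baseline: str = "3.6.8") -> float:
    """How many times more frequent the signature is than in the baseline release."""
    return crash_fraction(*ROWS[release]) / crash_fraction(*ROWS[baseline])


if __name__ == "__main__":
    for name, (total, crashes) in ROWS.items():
        print(f"{name:8s} {crash_fraction(total, crashes):.6g}")
    print(f"4.0b2 vs 3.6.8: about {relative_rate('4.0b2'):.0f}x")
```

Comparing fractions rather than raw counts is what makes the 4.0 beta increase visible: 3.6.8 has far more total crashes, but this signature is a much smaller share of them.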

os breakdown
ClientData::GetOtlTable.long Total 51
Win5.1  1.00
Win6.0  0.00
Win6.1  0.00

http://crash-stats.mozilla.com/report/index/b3432483-3d26-4720-b530-c5c712100730

Frame  	Module  	Signature  	Source
0 	usp10.dll 	ClientData::GetOtlTable 	
1 	usp10.dll 	otlResourceMgr::getOtlTable 	
2 	usp10.dll 	SubstituteOtlChars 	
3 	usp10.dll 	OtlShape 	
4 	xul.dll 	UniscribeItem::Shape 	gfx/thebes/gfxUniscribeShaper.cpp:121
5 	xul.dll 	gfxUniscribeShaper::InitTextRun 	gfx/thebes/gfxUniscribeShaper.cpp:610
6 	xul.dll 	gfxGDIFont::InitTextRun 	gfx/thebes/gfxGDIFont.cpp:179
7 	xul.dll 	gfxTextRun::AddGlyphRun 	gfx/thebes/gfxFont.cpp:3665
8 		@0xf 	
9 		@0x730072 	
10 	nss3.dll 	CERT_MapStanError 	security/nss/lib/certdb/stanpcertdb.c:210
11 	nss3.dll 	pkix_pl_String_Comparator 	security/nss/lib/libpkix/pkix_pl_nss/system/pkix_pl_string.c:79
12 	xul.dll 	nsLineLayout::ReflowFrame 	layout/generic/nsLineLayout.cpp:820
13 	xul.dll 	nsRuleNode::GetStyleData 	layout/style/nsStyleStructList.h:145
14 	xul.dll 	MakeTextRun 	layout/generic/nsTextFrameThebes.cpp:470
15 	xul.dll 	nsTArray<unsigned int>::AppendElement<unsigned int> 	obj-firefox/dist/include/nsTArray.h:671
16 		@0x1265c3 	
17 		@0x11 	
18 	xul.dll 	nsRuleNode::GetStyleData 	layout/style/nsStyleStructList.h:145
19 	xul.dll 	nsCSSSVG::~nsCSSSVG 	layout/style/nsCSSStruct.cpp:467
20 		@0x13

more at

http://crash-stats.mozilla.com/report/list?range_value=2&range_unit=weeks&date=2010-08-10%2006%3A00%3A00&signature=ClientData%3A%3AGetOtlTable%28long%2C%20unsigned%20char%20const**%2C%20unsigned%20long*%29&version=Firefox%3A4.0b2

one user commented they thought they hit the crash when visiting https://addons.mozilla.org/

some other test urls are

   3 http://www.stargazete.com/
   3 http://www.naver.com/
   2 http://www.google.ro/imgres?imgurl=http://media.timisoreni.ro/poze-upload/small/DJ_Andi_si_Stella_in_Heaven_Studio_065.jpg&imgrefurl=http://www.dinclub.ro/club-heaven-studio-timisoara/&usg=__SwtfqhS0vosZW5falNp6VpOVfvU=&h=352&w=527&sz=18&hl=ro&start=191&t
   2 http://www.facebook.com/?ref=home
   2 http://vkontakte.ru/wilizard
   2 http://taringa.net/posts/juegos/4015306/Querias-mas!-PS2---Coleccion-Guitar-Hero-Equot;SilverEquot.html#
   2 http://members.brazzers.com/members/index.php?action=live
   2 \N
is there a better component for this?
Severity: normal → critical
Component: GFX: Color Management → Graphics
QA Contact: color-management → thebes
Assignee: nobody → jdaggett
It's currently rank 18 in the top crash list for Beta 5.
blocking2.0: --- → ?
This seems to be a dupe of, or related to, Bug 558925.
See Bug 558925 Comment 4 for the common factor of these crashes:
USP10.dll Version 1.420.2600.2180
topcrash implies blocks final.
blocking2.0: ? → final+
Not in the top 300. Does this still happen?
blocking2.0: final+ → -
(In reply to comment #6)
> Not in the top 300. Does this still happen?

Yes, still around in beta7 but at lower volume, and seen once in a while on mozilla-central b8, but not enough to block.


         ClientData::GetOtlTable.long,.unsigned.char.const..,.unsigned.long..
date     total    breakdown by build
         crashes  count build, count build, ...

20101210 54  	21 4.0b22010072019, 
        		12 4.0b72010110414, 	10 4.0b62010091408, 
        		5 4.0b42010081813, 	2 4.0b52010083108, 
        		1 4.0b32010080519, 	1 4.0b12010063014, 
        		1 3.6.132010120307, 	1 3.6.122010102621, 
20101211 27  	7 4.0b62010091408, 
        		6 4.0b32010080519, 	4 4.0b72010110414, 
        		4 4.0b12010063014, 	2 4.0b42010081813, 
        		1 4.0b22010072019, 	1 3.6.132010120307, 
        		1 3.5.152010102620, 	1 3.0b42008030714, 
20101212 24  	6 4.0b72010110414, 
        		5 4.0b62010091408, 	3 3.6.132010120307, 
        		2 4.0b52010083108, 	2 4.0b42010081813, 
        		2 3.6.122010102621, 	1 4.0b32010080519, 
        		1 4.0b12010062821, 	1 3.6.82010072215, 
        		1 3.5.162010113007,
If Bug 607161 is indeed a duplicate of this, there is a test case (on that bug) to reliably replicate the crash.
Crash Signature: [@ ClientData::GetOtlTable(long, unsigned char const**, unsigned long*) ]
Comment copied from bug 607161...

SeaMonkey 2.1 crashes instantly when trying to load pages with some .ttf files (which worked fine in 1.0->2.0.x). Most notably this seems to have affected Nanum Myeongjo (나눔명주), a popular and free Korean font which can e.g. be downloaded here: http://hangeul.naver.com/download.nhn (http://cdn.naver.com/naver/NanumFont/fontfiles/NanumFont_TTF_ALL.zip)

The crash does not happen with every page using the font (appears to be dependent on the glyphs used) but pages with any significant amount of text will crash the browser on loading. The font is called locally (not as a webfont)

Only the .ttf versions of the font (Nanum Myeongjo) are affected.

Trying to narrow the problem down, it seems to be caused by a regression in handling nonstandard .ttf data: opening and resaving the font in FontForge solved the problem and gave me the following warning, which may be related to the crash:

"Glyph 20248 is called ".notdef", a singularly inept choice of name (only glyph 0 may be called .notdef)"

(this warning was absent in fonts that did not cause a crash)

Reproducible: Always

attachment 485950 [details]
Keywords: testcase
It's #72 top browser crasher in 14.0.1, #51 in 15.0b2, and #86 in 16.0a2.
Summary: Firefox 4.0b2 Crash Report [@ ClientData::GetOtlTable(long, unsigned char const**, unsigned long*) ] → Crash in UniscribeItem::Shape @ ClientData::GetOtlTable
It's #57 top browser crasher in 14.0.1, #20 in 15.0b6, #37 in 16.0a2, and #137 in 17.0a1.
Keywords: topcrash
It's #43 top browser crasher in 15.0.1, #13 in 16.0b2, #48 in 17.0a2, and #79 in 18.0a1.
This signature has been rising in FF16, so it makes sense to investigate further (although likely not a blocker). Let's see if comment 9 still reproduces, and let's also check crashing URLs.
Mozilla/5.0 (Windows NT 5.1; rv:16.0) Gecko/20100101 Firefox/16.0

Installed the font from comment 9. Loaded sites from comment 9, 14 and comments related to this crash (mostly Korean sites). Unable to reproduce on Windows XP.
I suspect that the URLs in comment 14 might help make this reproducible. In particular, southmp3.org may be using a font (either via web fonts or picking up a system font) which is either directly causing the crash or is statistically related. I suspect that valgrind/ASAN runs browsing on this site might turn up something, as well as manually inspecting the CSS to get potential fonts...
The bug is actually a crash in Uniscribe code under XP.  By default we
use HarfBuzz for a select set of scripts on XP; for all others we use
Uniscribe.  We've seen this crash in the past and worked around it by
forcing GDI to be used instead for certain font types.

Analyzed the roughly 1300 crash reports with this signature for
Firefox 15.  For almost all of these the Uniscribe DLL version is
1.420.2600.2180.  This is the version that shipped with Windows XP SP2
(SP3 ships with 1.420.2600.5512 which we never see in the crash
reports for this bug).

So I think the first step is to try these steps to see if we can come
up with a reproducible test case:

1. Install Windows XP SP2
2. Disable HarfBuzz shaping completely by setting 'gfx.font_rendering.harfbuzz.scripts' to 0 (this will force all text to use Uniscribe shaping rather than HarfBuzz)
3. Try the various URL's in comment 14

Once we have a reproducible testcase we can begin to figure out how to
work around the problem fonts, if that's what's causing the problem.
Keywords: testcase
Summary: Crash in UniscribeItem::Shape @ ClientData::GetOtlTable → Crash in UniscribeItem::Shape @ ClientData::GetOtlTable on Windows XP SP2
QA Contact: jbecerra
Given that we only send (some) non-Latin scripts through Uniscribe, it seems likely that this is being triggered by a particular font, perhaps for an Indic or Southeast Asian language, that makes Uniscribe fail (perhaps only with certain text sequences). The URLs with "telugu" in them, plus the presence of some .in domains, suggest that a Telugu font might be involved. Quite likely, though, it won't be a standard Windows font but a third-party one that some users have installed.

One solution here would be to turn on HarfBuzz shaping for all text, now that the HB engine has good support for Indic scripts. However, I'd really like us to have more in-depth testing deployed (bug 770591) before we make that switch on Windows, to minimize the risk of regressions in the rendering of these more complex scripts.
Able to reproduce reliably with instructions in comment #17 (XP SP2) and visiting cinespice.in from comment #14 using a 15.0.1 build: bp-0c7d041a-24ab-4d64-b19c-d9dec2120918
Keywords: reproducible
Keywords: verifyme
16.0b3 is also affected under these conditions.
That's great - I was concerned this might be really hard to reproduce. Does it also reproduce with current Nightly? (I'm guessing it will.)

And if you set gfx.font_rendering.harfbuzz.scripts = -1 in about:config, does that prevent the crash?
Reduced testcase (based on the problem font on cinespice.in):

1. Install Windows XP SP2 (*don't* update!)
2. Disable HarfBuzz shaping completely by setting 'gfx.font_rendering.harfbuzz.scripts' to 0 (this will force all text to use Uniscribe shaping rather than HarfBuzz)
3. Click on URL

Result: crash in Uniscribe code but clearly some sort of heap corruption

Note: by specifying 'text-rendering: optimizeLegibility;' Chrome on SP2 also crashes, so this really is a Uniscribe problem that we're hitting, not just some Gecko-related memory corruption that trips up Uniscribe.  

Just before we crash, Uniscribe reports a shaping error with the font, so the question is whether we can catch the error condition and avoid Uniscribe for that font (which is what the mForceGDI flag was meant to enforce).  Or we could sniff the version of Uniscribe and, for the suspect versions, avoid Uniscribe except for complex scripts.
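
The version-sniffing idea can be sketched roughly as follows. This is a minimal Python illustration, not Gecko code: the two version strings are the usp10.dll versions quoted in this bug, while the function names, the "anything older than the SP3 build is suspect" threshold, and the returned labels are all assumptions made for the example. Reading the version out of the DLL itself is left out.

```python
from typing import Tuple

# usp10.dll versions quoted in this bug: XP SP2 (seen in crash reports)
# and XP SP3 (not seen in reports for this signature).
SP2_USP10 = "1.420.2600.2180"
SP3_USP10 = "1.420.2600.5512"


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string into a tuple of ints so it compares numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def is_suspect_uniscribe(version: str, first_good: str = SP3_USP10) -> bool:
    """Assumed policy: treat anything older than the SP3 build as suspect."""
    return parse_version(version) < parse_version(first_good)


def choose_shaper(usp10_version: str, script_is_complex: bool) -> str:
    """Hypothetical decision: on suspect versions, keep Uniscribe only for complex scripts."""
    if is_suspect_uniscribe(usp10_version) and not script_is_complex:
        return "gdi"
    return "uniscribe"
```

Comparing integer tuples rather than strings matters here: as plain strings, "1.420.2600.10000" would sort before "1.420.2600.2180".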

Users really, really, really should be upgrading, SP2 was released in 2004(!!!).
(In reply to Jonathan Kew (:jfkthame) from comment #18)

> One solution here would be to turn on HarfBuzz shaping for all text, now
> that the HB engine has good support for Indic scripts. However, I'd really
> like us to have more in-depth testing deployed (bug 770591) before we make
> that switch on Windows, to minimize the risk of regressions in the rendering
> of these more complex scripts.

One slight variant of this would be to enable Harfbuzz shaping for all scripts on machines with older versions of Uniscribe.  That way we could eliminate the crash without affecting a wide swath of users just yet.  I forgot to add that the offending font on the cinespice.in page is provided via the Google Fonts service (!!!), it's not just some random crap font.
I went through the URLs in comment 14 again; most of them will cause the crash with the steps given in comment 17.  *ALL* of the URLs that result in a crash use the version of Oswald served by Google Fonts.

I dumped out the font that Google serves up and can reproduce the problem locally using that font.  If I strip out the GPOS table, Uniscribe shaping errors occur but no crash occurs.  The most recent released version of the font does not cause errors or crashes:

https://github.com/vernnobile/OswaldFont/blob/master/1.0/Regular/FINAL/Oswald-Regular.ttf

While I think we should see if there's some way to work around the problem here, I think for now the best thing to do is ask the Google Fonts folks to update the version of Oswald they're serving.
(In reply to John Daggett (:jtd) from comment #23)
> (In reply to Jonathan Kew (:jfkthame) from comment #18)
> 
> > One solution here would be to turn on HarfBuzz shaping for all text, now
> > that the HB engine has good support for Indic scripts. However, I'd really
> > like us to have more in-depth testing deployed (bug 770591) before we make
> > that switch on Windows, to minimize the risk of regressions in the rendering
> > of these more complex scripts.
> 
> One slight variant of this would be to enable Harfbuzz shaping for all
> scripts on machines with older versions of Uniscribe.  That way we could
> eliminate the crash without affecting a wide swath of users just yet. 

I'm afraid that won't help after all (unless we were to "lock" the choice so that people can't mess with it in about:config). It looks like Oswald is a purely Latin-script font, which means the people hitting this crash must be people who have explicitly turned off harfbuzz shaping even for Latin.
I searched the addons MXR for "harfbuzz.scripts" and didn't find any hits. I'm at something of a loss to explain why this would show up in the topcrash report if it required manual pref flips; normally something like this would be an addon performing the pref flip, perhaps to solve country-specific rendering issues?

This is not just a few users reporting lots of crashes: the install times reported for recent 15.0.1/16/17 crashes show a reasonably large distribution of installs.

kev, can you work with google on getting the updated font pushed out?
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #26)
 
> kev, can you work with google on getting the updated font pushed out?

I already contacted them, the new version of Oswald is slated to go into production sometime next week.
(In reply to John Daggett (:jtd) from comment #27)
> (In reply to Benjamin Smedberg  [:bsmedberg] from comment #26)
>  
> > kev, can you work with google on getting the updated font pushed out?
> 
> I already contacted them, the new version of Oswald is slated to go into
> production sometime next week.

No need to track for release in that case.
According to the Google Font folks, an update to Oswald was pushed out 9/26 so most sites would have picked it up within a day of that.

From the crash stat reports on Windows with a signature containing "GetOtlTable", number of reports for the week before the date listed:

  10/10  1495
  10/03  1605
  09/26  6455
  09/19  9379
  09/12 10561
  09/05  8786

So the number of crashes has been reduced but there are still crashes occurring.  We need to investigate the fonts involved further, especially since a number of these seem to include Facebook URLs.
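
To put a rough number on that reduction, here is a small Python sketch using the weekly counts listed above. Treating the 09/26 row as a transition week and leaving it out of both averages is an assumption of this sketch, as are the function names.

```python
# Weekly report counts copied from the list above, oldest first.
WEEKLY = [
    ("09/05", 8786),
    ("09/12", 10561),
    ("09/19", 9379),
    ("09/26", 6455),
    ("10/03", 1605),
    ("10/10", 1495),
]


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def drop_after_update(weekly, update_week="09/26"):
    """Return (mean before, mean after, fractional reduction), skipping the update week."""
    dates = [date for date, _ in weekly]
    i = dates.index(update_week)
    before = mean(count for _, count in weekly[:i])
    after = mean(count for _, count in weekly[i + 1:])
    return before, after, 1 - after / before


if __name__ == "__main__":
    before, after, reduction = drop_after_update(WEEKLY)
    print(f"before: {before:.0f}/week, after: {after:.0f}/week, reduction: {reduction:.0%}")
```

The remaining weekly volume after the update is the part that cannot be explained by the old Oswald build alone.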

One potential "fix" would be to force GDI usage for downloadable fonts when running on SP2 machines.  Since this is a Uniscribe bug we can only work around the problem.
This is a near-null read deref, so I don't think we're exposing a security issue here.

How close are we to turning on harfbuzz by default? Would it be better to focus on that instead of doing this workaround? Or even just implementing a warning for people with XP2 that they really need to upgrade since their OS is insecure?
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #30)

> How close are we to turning on harfbuzz by default? Would it be better to
> focus on that instead of doing this workaround?

We're just now enabling it for all scripts by default on Linux/OSX, so I think Windows would be possible in the near-term future.  I wasn't advocating the workaround per se, just listing it as an option.

>Or even just implementing a
> warning for people with XP2 that they really need to upgrade since their OS
> is insecure?

Regardless of this bug, I think that's a good idea, SP2 was originally released in 2004!!
In Firefox 28, we've switched to harfbuzz shaping for almost all text on Windows; the one exception that remains is hangul (Korean). So the frequency of this may drop in FF28.

However, it looks like some of the crashes here have specifically involved Korean sites/fonts, so it won't be completely resolved until we get the harfbuzz hangul shaper done.
The switch to harfbuzz shaping on Windows (except for Korean) was bug 797405, btw.
Crash Signature: [@ ClientData::GetOtlTable(long, unsigned char const**, unsigned long*) ] → [@ ClientData::GetOtlTable(long, unsigned char const**, unsigned long*) ] [@ ClientData::GetOtlTable ]
Assignee: jd.bugzilla → nobody
This definitely still reproduces but is far less frequent in current builds (as comment 34 alluded to).
> Firefox >=46: 11
> Firefox <=28: 26

Keeping this open for now but I'm not sure it's realistic to think this will ever get fixed.
Whiteboard: [gfx-noted]
What crash reports are you seeing in Firefox >= 46? The code involved in the original report here (UniscribeItem::Shape, etc) was removed from gecko in bug 985220....
Flags: needinfo?(anthony.s.hughes)
As an example see https://crash-stats.mozilla.com/report/index/b1198eee-db20-4c81-a504-b1eb02160725.
Flags: needinfo?(anthony.s.hughes)
OK, but that's quite different from the original crash here. It's deep inside Uniscribe code that's being called from within cairo, whereas the original report here was about Gecko's Uniscribe-based shaping backend (which no longer exists).

Looking at the stack in that crash, I'm suspicious of the Screen Scraper SDK that's apparently installed and is intercepting the Windows ExtTextOut API. From the pile of Uniscribe functions that it then calls into, it looks like it's trying to re-do the text shaping, etc., which has already been done before we ever called cairo_show_glyphs. That's doomed to fail because we're not passing character data, we're passing already-shaped glyphs. If the screen scraper stuff (in agtpchnt.dll) isn't properly respecting the ETO_GLYPH_INDEX flag passed to ExtTextOut but treats the glyph indexes as character data, that could easily explain the mess here.

It'd be interesting to know whether agtpchnt.dll is a frequent contributor to these crashes...
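
To illustrate the ETO_GLYPH_INDEX point above: a hook that intercepts ExtTextOut has to look at the options flags before deciding how to read the 16-bit buffer. The Python sketch below is purely illustrative; the function name and return labels are hypothetical, and nothing here reflects how agtpchnt.dll is actually written. The flag value is the one defined in the Windows SDK headers.

```python
# ETO_GLYPH_INDEX as defined in the Windows SDK (wingdi.h).
ETO_GLYPH_INDEX = 0x0010


def classify_ext_text_out_buffer(options: int) -> str:
    """Hypothetical hook helper: say how the buffer passed to ExtTextOut should be read."""
    if options & ETO_GLYPH_INDEX:
        # The buffer holds already-shaped, font-specific glyph indexes.
        # Running them through shaping again as if they were characters is wrong.
        return "glyphs"
    # Without the flag, the buffer holds character codes that still need shaping.
    return "characters"
```

A hook that skips this check would hand glyph indexes to Uniscribe as if they were text, which is the failure mode suspected in the comment above.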
Only two crashes (only on Windows XP) in the last 6 months, so I think we can close it.
Status: NEW → RESOLVED
Closed: 6 years ago
Keywords: verifyme
Resolution: --- → WONTFIX