Closed
Bug 732925
Opened 12 years ago
Closed 11 years ago
Crash in gfxGDIFont::InitTextRun
Categories
(Core :: Graphics, defect)
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: scoobidiver, Unassigned)
References
Details
(Keywords: crash)
Crash Data
Attachments
(1 file)
731 bytes,
patch
|
benjamin
:
review-
|
Details | Diff | Splinter Review |
It happens mainly on Windows XP (90% of crashes). Signature _alloca_probe | gfxGDIFont::InitTextRun(gfxContext*, gfxTextRun*, wchar_t const*, unsigned int, unsigned int, int, bool) More Reports Search UUID 66c0d5b5-07af-4e91-a193-bfee62120305 Date Processed 2012-03-05 12:10:28 Uptime 599 Last Crash 20.3 hours before submission Install Age 20.2 hours since version was first installed. Install Time 2012-03-04 15:53:56 Product Firefox Version 11.0 Build ID 20120228210006 Release Channel beta OS Windows NT OS Version 5.1.2600 Service Pack 3 Build Architecture x86 Build Architecture Info GenuineIntel family 6 model 28 stepping 10 Crash Reason EXCEPTION_ACCESS_VIOLATION_READ Crash Address 0x10f000 App Notes AdapterVendorID: 0x8086, AdapterDeviceID: 0xa011, AdapterSubsysID: 03491025, AdapterDriverVersion: 6.14.10.5260 D3D10 Layers? D3D10 Layers- D3D9 Layers? D3D9 Layers- Processor Notes EMCheckCompatibility True Total Virtual Memory 2147352576 Available Virtual Memory 1920266240 System Memory Use Percentage 62 Frame Module Signature [Expand] Source 0 xul.dll _alloca_probe chkstk.asm:99 1 xul.dll gfxGDIFont::InitTextRun gfx/thebes/gfxGDIFont.cpp:181 2 xul.dll gfxFont::SplitAndInitTextRun gfx/thebes/gfxFont.cpp:1594 3 xul.dll gfxFontGroup::InitScriptRun gfx/thebes/gfxFont.cpp:2624 4 xul.dll gfxFontGroup::InitTextRun gfx/thebes/gfxFont.cpp:2573 5 xul.dll gfxFontGroup::MakeTextRun gfx/thebes/gfxFont.cpp:2507 6 xul.dll TextRunWordCache::MakeTextRun gfx/thebes/gfxTextRunWordCache.cpp:859 7 xul.dll MakeTextRun layout/generic/nsTextFrameThebes.cpp:594 8 xul.dll ParseManifest xpcom/components/ManifestParser.cpp:476 9 @0x0 10 @0x80000009 11 xul.dll nsRuleNode::ComputeFontData layout/style/nsRuleNode.cpp:3162 12 xul.dll nsRuleNode::WalkRuleTree layout/style/nsStyleStructList.h:78 13 xul.dll nsRuleNode::WalkRuleTree layout/style/nsRuleNode.cpp:1944 14 xul.dll nsTextFrame::EnsureTextRun layout/generic/nsTextFrameThebes.cpp:2279 15 xul.dll nsTextFrame::ReflowText layout/generic/nsTextFrameThebes.cpp:7260 More reports at: https://crash-stats.mozilla.com/report/list?signature=_chkstk%20|%20gfxGDIFont%3A%3AInitTextRun%28gfxContext*%2C%20gfxTextRun*%2C%20wchar_t%20const*%2C%20unsigned%20int%2C%20unsigned%20int%2C%20int%2C%20bool%29 https://crash-stats.mozilla.com/report/list?signature=_alloca_probe%20|%20gfxGDIFont%3A%3AInitTextRun%28gfxContext*%2C%20gfxTextRun*%2C%20wchar_t%20const*%2C%20unsigned%20int%2C%20unsigned%20int%2C%20int%2C%20bool%29
Comment 1•12 years ago
|
||
It sounds to me (_chkstk, _alloca_probe) like we're running out of stack space. Might be worth trying to increase the stack size and see if that reduces these kind of crashes? It makes sense that this would happen primarily on XP: first, XP always uses the GDI font backend, whereas Vista and Win7 may be using DirectWrite instead (so the stacks would be different even if we ran into similar issues), and second, I wouldn't be surprised if later Windows versions allocate a larger stack in the first place.
Comment 2•12 years ago
|
||
On the assumption that these reports may suggest we're running too close to the limit of stack space on WinXP, what would you think of doing something like this to bump up the space a bit and see if that helps? I would think most XP users could afford to sacrifice an extra 0.5MB to the stack if it'll provide stability.
Attachment #602927 -
Flags: review?(dbaron)
Presumably this affects only the main thread and not other threads? What's using alloca, for what, and why? I don't really feel like I'm the right reviewer for this. Maybe bsmedberg?
Comment 4•12 years ago
|
||
(In reply to David Baron [:dbaron] from comment #3) > Presumably this affects only the main thread and not other threads? I believe so. > What's using alloca, for what, and why? I don't think it's called directly by our code - my guess is that it's used internally by Windows APIs (uniscribe?) that we call. > I don't really feel like I'm the right reviewer for this. Maybe bsmedberg? OK. (I picked you because you'd reviewed bug 582910, that's all. Redirecting to bsmedberg...)
Updated•12 years ago
|
Attachment #602927 -
Flags: review?(dbaron) → review?(benjamin)
Comment 5•12 years ago
|
||
See also bug 733372, where it looks like we're intermittently hitting a stack overflow during GC on Win32 debug builds, suggesting that perhaps we are routinely running a bit too close to the limit for comfort.
Comment 6•12 years ago
|
||
<jfkthame> no * askalski has quit (Quit: Wychodzi) <jfkthame> hmm, SetGlyphsFromRun doesn't seem to have as many autoarrays as i was thinking
Comment 7•12 years ago
|
||
The actual overflowing function is gfxHarfBuzzShaper::InitTextRun at http://hg.mozilla.org/releases/mozilla-beta/annotate/8c9e4873d419/gfx/thebes/gfxHarfBuzzShaper.cpp#l716 _chkstk is a compiler-generated call for any frame which is larger than one OS page. This frame appears to be 0x517C (20k) large, so _chkstk goes through and hits each page in order to trigger Windows guard page behavior. I'm worried that what we're actually seeing here is an infinite-recursion problem in code below the stack (the stack is equally or even more busted in MSVC than in crash-stats), and that increasing the main-thread stack size is not a good solution.
Comment 8•12 years ago
|
||
It looks like the compiler has inlined some of the harfbuzz functions that are called from gfxHarfBuzzShaper::InitTextRun, and so their stack frames get merged with this one. In current trunk code, I don't think this should be a major issue - it looks like the frame would add up to around 1.5K, provided I've found the main arrays involved: hb_shape > hb_shape_full > > hb_ot_shape: - hb_ot_shape_plan_t - hb_ot_map_t - hb_tag_t (4) * 2 8 - feature_map_t (64) * 8 512 - lookup_map_t (8) * 32 256 - pause_map_t (12) * 1 12 > > > hb_ot_shape_plan_internal - hb_ot_shape_planner_t - hb_ot_map_builder_t - hb_tag_t (4) * 2 8 - feature_info_t (28) * 16 448 - pause_info_t (12) * 1 * 2 24 plus various miscellaneous variables, pointers, etc. However, an earlier version of the HB code (as found in current mozilla-beta) had much larger stack allocations for the feature and lookup arrays, which explains the big frame we're seeing in the beta crashes. Still, if attempting to allocate a 20K stack frame (in a non-recursive function) can cause a crash, I think that could suggest we're running dangerously close to the edge. Of course, it's possible we have an infinite-recursion problem somewhere below this, in which case enlarging the stack will only delay the crash and not prevent it. But IMO it'd be worth trying, to see if it eliminates these reports.
Comment 9•12 years ago
|
||
There are some fairly large stack frames in here, but after examining the minidump more carefully, we're only using about 110k of stack total: At the access violation at 0x10f000 according to the stack scan, ESP of XRE_main is 0x0012f108 That's 131336 bytes. Unless we are mistakenly setting the stack size to *less* than 1MB, I don't think that's the problem here, although I don't know what the problem actually is. I would suspect mandatory ASLR if that were actually enabled, but that was backed out. Scoobidiver, is there a way to see if aggregate chkstk/alloca_probe crashes went up on any particular nightly?
Reporter | ||
Comment 10•12 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #9) > Scoobidiver, is there a way to see if aggregate chkstk/alloca_probe crashes > went up on any particular nightly? You need to file a bug to reprocess those crash reports from a date before bug 727963 landed.
Updated•12 years ago
|
Attachment #602927 -
Flags: review?(benjamin) → review-
Comment 11•12 years ago
|
||
There's some interesting discussion here: http://apocryph.org/2008/06/01/visual_c_apps_crashing_chkstk_under_load/ which mentions that one possible failure mode here is that _chkstk tries to commit previously reserved stack memory and fails because the system is out of physical memory. Looking at a few of these crash reports, they don't seem to be in that situation, though (judging from the reported numbers). It might be useful to run a report and check the system memory usage % numbers. There's also a MSDN article that describes the behavior of _chkstk: http://support.microsoft.com/kb/100775
Reporter | ||
Comment 12•11 years ago
|
||
There have been no crashes for the last four weeks after 11.0.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WORKSFORME
You need to log in
before you can comment on or make changes to this bug.
Description
•