eliminate top font chrome-hangs

NEW
Unassigned

Status

Core
Graphics: Text
4 years ago
3 years ago

People

(Reporter: vladan, Unassigned)

Tracking

(Depends on: 3 bugs, {meta})

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [Snappy:P1])

Attachments

(3 attachments, 1 obsolete attachment)

(Reporter)

Description

4 years ago
These are some top font chrome-hangs from Nightly 22 on Windows. See also bug 699331 and bug 734308

I am certain the attached hangs represent multiple bugs but I don't know the fonts codebase. John, can you help me split these up into individual bugs or point me to existing bugs?

Thanks
(Reporter)

Comment 1

4 years ago
Created attachment 734896 [details]
Frequent font chrome-hangs

Comment 2

4 years ago
The hangs listed here are neither bug 699331 nor bug 734308. Those are hangs within *our* code; these are hangs purely within OS code.

FontFileAnalyzer::IsPfm - this is the DirectWrite GetSystemFontCollection font system initialization routine.  To do any text drawing you need to get a font from the system, so there's no way around this call.  There was a bug in early versions of Windows 7 in the way this communicated with the FontCache service: in some cases that communication would fail and the client code (i.e. code within our process) would decide to enumerate all fonts, causing a hang.  Microsoft claims this was fixed but I'm not so sure, since we still see it show up in Telemetry metrics.  Not a "frequent" hang but one that's sucky just the same.

NtGdiAddFontMemResourceEx - this is loading a single font.  No idea why this takes so long.  Please see if you can come up with a testcase for this.

OpenTypeNameTable::OpenTypeNameTable - this is the same as the first case, GetSystemFontCollection.

DWriteFactory::GetSharedFactory - another DirectWrite initialization call, no idea why this would take this long.  Need a testcase.

Just to emphasize the point here, these are either system initialization calls or places where the contents somehow cause a long delay within system library code.  There's no enumeration of resources involved here.

What sort of environment are you running these tests in?  Windows 7 with latest updates?  On a real machine?  Or is this provided via an automatic feedback mechanism?
(Reporter)

Comment 3

4 years ago
(In reply to John Daggett (:jtd) from comment #2)
> What sort of environment are you running these tests in?  Windows 7 with
> latest updates?  On a real machine?  Or is this provided via an automatic
> feedback mechanism?

These are real-world hangs, automatically reported by Nightly on Windows. There is a variety of Windows versions, hardware, and configurations in these reports. I can send you the data about the affected users' environments if you think it would be useful.

Comment 4

4 years ago
Based on data Vladan produced of chrome hangs within font/text code, I classified the hangs and summed the total number of occurrences for each:

names 22547 (bug 752394)
cmaps 13245 (bug 734308)
GetSystemFontCollection 7404 (bug 705287)
MakeTextRun 4161
MakePlatformFont 3510
gfxDWriteFontEntry::CreateFontFace 593
nsTextFrame::ReflowText 508
gfxFontFamily::ReadOtherFamilyNames 291 (bug 752394?)
DrawTextRun 285
gfxDWriteFontEntry::ReadCMAP 157 (bug 734308)
gfxDWriteFont::ComputeMetrics 116

The names and cmaps we can do something about.  The hangs in MakeTextRun must be on large pages where we spend a lot of time laying out text.  Ditto for the ReflowText, ComputeMetrics and DrawTextRun cases.  I'm a little concerned about why MakePlatformFont takes so long; that's the routine for instantiating downloadable fonts (as is CreateFontFace).

The DirectWrite GetSystemFontCollection calls are basically something we can't do anything about, other than put in logic to automatically downgrade users to GDI when GetSystemFontCollection is slow.  There were also bugs in the Microsoft implementation in early versions of Windows 7 and Vista such that it could take a really long time (more than 10 seconds).  But I'm pretty sure all of these cases are under cold startup shortly after a reboot.  Once the system font cache service has initialized, this call takes less than 1 ms the vast majority of the time (and we have Telemetry data to back that up).  IE9+ suffers from the exact same problem.
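One way the "downgrade to GDI when GetSystemFontCollection is slow" idea could look is sketched below. This is only an illustration, not Gecko code: FontBackend, InitTimingPolicy, TimeInit and ChooseNextBackend are hypothetical names, the 200 ms threshold is an arbitrary assumption, and requiring two consecutive slow startups is an assumed way to avoid penalizing a single cold start after a reboot.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// All names here are hypothetical; none of this is existing Gecko API.
enum class FontBackend { DirectWrite, GDI };

struct InitTimingPolicy {
    std::chrono::milliseconds slowThreshold{200};  // assumed cutoff for "slow"
    int maxSlowStartups = 2;                       // assumed tolerance before downgrading
};

// Times a single call to `initFontCollection`, which stands in for the
// code path that ends up in GetSystemFontCollection.
inline std::chrono::milliseconds
TimeInit(const std::function<void()>& initFontCollection) {
    const auto start = std::chrono::steady_clock::now();
    initFontCollection();
    return std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - start);
}

// Updates a counter of consecutive slow startups (assumed to be persisted
// between sessions by the caller) and picks the backend for the next startup.
inline FontBackend
ChooseNextBackend(std::chrono::milliseconds elapsed,
                  int& slowStartupCount,
                  const InitTimingPolicy& policy = InitTimingPolicy()) {
    assert(slowStartupCount >= 0);
    if (elapsed > policy.slowThreshold) {
        ++slowStartupCount;    // another slow startup in a row
    } else {
        slowStartupCount = 0;  // a fast startup resets the streak
    }
    return slowStartupCount >= policy.maxSlowStartups ? FontBackend::GDI
                                                      : FontBackend::DirectWrite;
}
```

A caller would measure the initialization once per startup with TimeInit and feed the result to ChooseNextBackend. Whether a consecutive-startup rule is the right trigger is an open question; the Telemetry data mentioned above would be the place to check.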
Component: Graphics → Graphics: Text
Depends on: 705287, 752394, 734308, 699331
Keywords: meta
Summary: Top font chrome-hangs → eliminate top font chrome-hangs

Updated

4 years ago
Depends on: 860492
Depends on: 864445

Comment 5

3 years ago
Created attachment 8336575 [details]
chrome hang stacks, sorted by category

Based on Vladan's latest chrome hang data:

http://people.mozilla.org/~vdjeric/fontHangs_Nov_19_2013.txt

Similar to comment 4:
names 51047 (bug 752394)
cmaps 21170 (bug 734308)
GetSystemFontCollection 13068 (bug 705287)

The stacks below are all related to long-running reflow cycles. I'm assuming these are all within DoReflow calls (the stacks are limited so I can't confirm this).

BuildTextRunForFrames 14931
gfxFont::SplitAndInitTextRun 1090 
DWriteFont::CreateFontFace 1575
CanvasRenderingContext2D::SetFont 491
MakePlatformFont 109

non-reflow related:
paint 969
restyle 445 

not sure (need more info):
unknown 3553 
gfxUserFontSet::OnLoadComplete 4261
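
For anyone wanting to reproduce this kind of tally, a rough sketch of the bucketing follows. It is not the script used for the numbers above. HangReport, Categorize and Tally are hypothetical names, and the rule table (which frame substring maps to which category, and in what order) is an assumption based on the bug associations in comment 4.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record: one distinct hang stack and how often it was reported.
struct HangReport {
    std::vector<std::string> frames;
    unsigned count;
};

// Returns the category of the first rule whose substring appears in any frame.
// Rule order matters: earlier rules win. The table itself is an assumption.
inline std::string Categorize(const HangReport& report) {
    static const std::vector<std::pair<std::string, std::string>> kRules = {
        {"GetSystemFontCollection", "GetSystemFontCollection"},
        {"ReadOtherFamilyNames", "names"},
        {"ReadCMAP", "cmaps"},
        {"MakePlatformFont", "MakePlatformFont"},
    };
    for (const auto& rule : kRules) {
        for (const auto& frame : report.frames) {
            if (frame.find(rule.first) != std::string::npos) {
                return rule.second;
            }
        }
    }
    return "unknown";
}

// Sums the report counts per category.
inline std::map<std::string, unsigned long>
Tally(const std::vector<HangReport>& reports) {
    std::map<std::string, unsigned long> totals;
    for (const auto& report : reports) {
        totals[Categorize(report)] += report.count;
    }
    return totals;
}
```

Stacks that match no rule land in "unknown", which mirrors the "need more info" bucket above.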

Comment 6

3 years ago
Created attachment 8343431 [details]
chrome hang stacks, sorted by category
Attachment #8336575 - Attachment is obsolete: true

Updated

3 years ago
Depends on: 947025

Updated

3 years ago
Depends on: 947812

Comment 7

3 years ago
Created attachment 8345188 [details]
fonthangs that occur outside of DoReflow

Data as of 9 Dec from Vladan, categorized by type.  Total counts for each hang:

21170 gfxPlatformFontList::RunLoader
 -- 10093 GDI
 -- 11077 DirectWrite
5192 gfxUserFontSet::OnLoadComplete
3032 unknown
1832 InitFaceNameLists
969 paint
542 GetSystemFontCollection
445 restyle
142 gfxPlatformFontList::GlobalFontFallback

Not sure why InitFaceNameLists has dropped down the list.  The RunLoader numbers should drop with the landing of bug 947025, which limits the timeslice of one pass to 100 ms.  And bug 947812 will eliminate the need to do table reads on machines that support it (i.e. those with a recent version of DirectWrite).
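
To illustrate what limiting one loader pass to a fixed timeslice means, here is a minimal sketch. It is not the patch from bug 947025 and not the real gfxPlatformFontList loader: SlicedLoader and its members are hypothetical, and the per-family work is modelled as opaque callbacks.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical time-sliced loader; each task stands in for loading the
// data of one font family.
class SlicedLoader {
public:
    using Clock = std::chrono::steady_clock;

    SlicedLoader(std::vector<std::function<void()>> tasks,
                 std::chrono::milliseconds budget)
        : mTasks(std::move(tasks)), mBudget(budget) {
        assert(budget.count() >= 0);
    }

    // Runs pending tasks until the budget is used up or nothing is left.
    // At least one task runs per call, so a tiny budget still makes progress.
    // Returns true once every task has completed.
    bool RunSlice() {
        const auto deadline = Clock::now() + mBudget;
        do {
            if (mNext == mTasks.size()) {
                return true;
            }
            mTasks[mNext++]();
        } while (Clock::now() < deadline);
        return mNext == mTasks.size();
    }

    std::size_t Completed() const { return mNext; }

private:
    std::vector<std::function<void()>> mTasks;
    std::chrono::milliseconds mBudget;
    std::size_t mNext = 0;
};
```

The caller would invoke RunSlice() from a timer or idle callback with a 100 ms budget and reschedule while it returns false. Note that the budget is only checked between tasks, so a single slow task can still overrun the slice.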

Updated

3 years ago
Depends on: 962440