Closed Bug 1368531 Opened 7 years ago Closed 7 years ago

gfxPlatformFontList::InitOtherFamilyNames() seems to time out almost always

Categories

(Core :: Graphics: Text, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
mozilla57
Performance Impact high
Tracking Status
firefox57 --- fixed

People

(Reporter: mstange, Assigned: jerry)

References

Details

(Whiteboard: [gfx-noted])

Attachments

(1 file, 4 obsolete files)

The loop in gfxPlatformFontList::InitOtherFamilyNames() has a timeout of 200ms. It also reports telemetry for the time that this loop takes in the real world.

The data from that telemetry is here: https://mzl.la/2rdmK2F

The reported median time is 194.79ms, which looks a lot like we're hitting this timeout all the time in practice.

When does this code run? During startup, or asynchronously? Does it run on the parent process or just in the content process?

Can we give it a shorter timeout, and slice up the work it does? It currently doesn't seem to have a way to continue the loop after a timeout. Could we use the idle queue for this work?
Flags: needinfo?(jfkthame)
Blocks: 752394
Whiteboard: [qf]
Whiteboard: [qf] → [qf][gfx-note]
Whiteboard: [qf][gfx-note] → [qf][gfx-noted]
(In reply to Markus Stange [:mstange] from comment #0)
> The loop in gfxPlatformFontList::InitOtherFamilyNames() has a timeout of
> 200ms. It also reports telemetry for the time that this loop takes in the
> real world.
> 
> The data from that telemetry is here: https://mzl.la/2rdmK2F
> 
> The reported median time is 194.79ms, which looks a lot like we're hitting
> this timeout all the time in practice.
> 
> When does this code run? During startup, or asynchronously? Does it run on
> the parent process or just in the content process?
> 
> Can we give it a shorter timeout, and slice up the work it does? It
> currently doesn't seem to have a way to continue the loop after a timeout.
> Could we use the idle queue for this work?

IIUC, this shouldn't be happening in the startup path, it would be in response to a web page requesting a font by localized name, before the (delayed, async) font info loader has completed. In that case, the code here will kick in to try and load the localized font names in order to support the font lookup. It has a timeout because that can sometimes be slow, but if it hits the timeout, then it will continue next time such a request happens (or in any case, the delayed font loader will eventually fill things in anyhow).

So AFAICT this is unlikely to be happening in the chrome process in most cases, as chrome is unlikely to be requesting fonts via their localized names; it's probably a content-process thing. But I don't think we currently have telemetry to show that for sure.

What might be most interesting here would be to see if we can rework the whole async font loader thing to work on the idle queue, instead of (as it currently does) at a specific interval after startup; ideally, by using idle we might be able to get it to complete more quickly in most cases, at the same time as avoiding contention with more important work in the case where startup is slow and expensive for other reasons.
Flags: needinfo?(jfkthame)
Whiteboard: [qf][gfx-noted] → [qf:p1][gfx-noted]
Adding to qf-bugs-upforgrabs assuming this bug only involves moving the operation to idle queue.
Assignee: nobody → hshih
Status: NEW → ASSIGNED
I still can't find a example to reach the call gfxPlatformFontList::InitOtherFamilyNames().
Is this call running at main thread?

The gfxPlatformFontList inherits from gfxFontInfoLoader[1]. And gfxFontInfoLoader has the ability for off-main-thread font loading.

[1]
https://dxr.mozilla.org/mozilla-central/rev/7d2e89fb92331d7e4296391213c1e63db628e046/gfx/thebes/gfxPlatformFontList.h#91

I could just create a runnable object which contains the task for InitOtherFamilyNames() and use NS_IdleDispatchToCurrentThread() to dispatch to the idle queue. But if the font loading is already off-main, we could skip this task.
Flags: needinfo?(jfkthame)
InitOtherFamilyNames() may be called from gfxPlatformFontList::FindAndAddFamilies[1] if a non-ASCII font-family name (e.g. a localized Japanese font name) is used in CSS.  If we encounter a non-ASCII font-family name early in the session (e.g. when restoring the current tab from the user's previous session), the off-main-thread loader probably won't have done its work yet and so we'll hit this code path.

[1] https://dxr.mozilla.org/mozilla-central/rev/7d2e89fb92331d7e4296391213c1e63db628e046/gfx/thebes/gfxPlatformFontList.cpp#725
Flags: needinfo?(jfkthame)
Add a boolean value in FindFamily() to check if gecko needs to defer the
InitOtherFamilyNames() task into idle queue.

Add a new telemetry data FONTLIST_INITOTHERFAMILYNAMES_NO_DEFERRING for the
execution time of InitOtherFamilyNames() running regularly at main thread
instead of in idle queue.

MozReview-Commit-ID: A3YPDcHtXrX
Attachment #8890916 - Flags: review?(jfkthame)
MozReview-Commit-ID: GjIEhUSpwKt
Attachment #8890917 - Flags: review?(jfkthame)
Comment on attachment 8890916 [details] [diff] [review]
P1: put the task gfxPlatformFontList::InitOtherFamilyNames() into idle queue.

Review of attachment 8890916 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/thebes/gfxPlatformFontList.cpp
@@ +311,5 @@
>      if (mOtherFamilyNamesInitialized) {
>          return;
>      }
>  
> +    if (aDeferOtherFamilyNamesLoading) {

Post a task to idle queue instead of just execute with timeout setting.

@@ +312,5 @@
>          return;
>      }
>  
> +    if (aDeferOtherFamilyNamesLoading) {
> +        if (!mPendingOtherFamilyNameTask) {

If we already post the pending loading task, we don't need to post the same task again.

@@ +1507,5 @@
>      }
>  
>      if (mOtherNamesMissed) {
>          for (auto it = mOtherNamesMissed->Iter(); !it.Done(); it.Next()) {
> +            if (FindFamily(it.Get()->GetKey(), false)) {

This is the final position for all unloaded fonts. So, we pass "DeferOtherFamilyNamesLoading=false" here.

@@ +1683,5 @@
> +    if (mOtherFamilyNamesInitialized) {
> +        return;
> +    }
> +
> +    if (aDeferOtherFamilyNamesLoading) {

This code section is running in the idle queue, but we still use the original timeout setting to yield the cpu time for the long font loading.

@@ +1712,5 @@
> +                          elapsed.ToMilliseconds(),
> +                          (timedOut ? "timeout" : "")));
> +        }
> +    } else {
> +        TimeStamp start = TimeStamp::Now();

This code block is used in gfxPlatformFontList::CleanupLoader(). That's the final chance to load the unloaded fonts. So, there is no timeout setting here. Just load all remaining fonts.
Comment on attachment 8890916 [details] [diff] [review]
P1: put the task gfxPlatformFontList::InitOtherFamilyNames() into idle queue.

Review of attachment 8890916 [details] [diff] [review]:
-----------------------------------------------------------------

r=me, just with a few nits/questions.

::: gfx/thebes/gfxPlatformFontList.cpp
@@ +181,5 @@
>        mStartIndex(0), mIncrement(1), mNumFamilies(0), mFontlistInitCount(0),
>        mFontFamilyWhitelistActive(false)
>  {
>      mOtherFamilyNamesInitialized = false;
> +    CancelInitOtherFamilyNamesTask();

Is this needed? I don't see how there could be a task already in existence when we're running the gfxPlatformFontList constructor...

@@ +269,5 @@
>  
>      mFontFamilies.Clear();
>      mOtherFamilyNames.Clear();
>      mOtherFamilyNamesInitialized = false;
> +    CancelInitOtherFamilyNamesTask();

Logically, I think the task should be cancelled before the list is cleared here. (In practice it shouldn't matter, as I presume the idle task can't suddenly run in between the statements here, but it seems more correct that way.)

::: gfx/thebes/gfxPlatformFontList.h
@@ +138,5 @@
>      // Return true if any match was found and appended, false if none.
>      virtual bool
>      FindAndAddFamilies(const nsAString& aFamily,
>                         nsTArray<gfxFontFamily*>* aOutput,
> +                       bool aDeferOtherFamilyNamesLoading = true,

Let's not give this param a default value. In this case I think it's straightforward to have all callsites explicitly pass the value they want to use.

@@ +271,5 @@
> +        InitOtherFamilyNamesRunnable()
> +            : CancelableRunnable("gfxPlatformFontList::InitOtherFamilyNamesRunnable")
> +            , mIsCanceled(false)
> +        {
> +

no need for a blank line here

@@ +277,5 @@
> +
> +        NS_IMETHOD Run() override
> +        {
> +            if (mIsCanceled) {
> +              return NS_OK;

nit: indentation in this file is 4-space

@@ +358,5 @@
>      // Convenience method to return the first matching family (if any) as found
>      // by FindAndAddFamilies().
>      gfxFontFamily*
> +    FindFamily(const nsAString& aFamily,
> +               bool aDeferOtherFamilyNamesLoading = true,

FWIW, I'm OK with having the default value here, as we do have a bunch of callsites that want to do a simple FindFamily(<name>) and not think about anything more complex.
Attachment #8890916 - Flags: review?(jfkthame) → review+
Comment on attachment 8890917 [details] [diff] [review]
P2: update FindAndAddFamilies() usage for all platforms.

Review of attachment 8890917 [details] [diff] [review]:
-----------------------------------------------------------------

r=me. I think you should fold the two patches together for landing, as it looks like the intermediate step (only patch 1) would fail to compile, and we prefer not to have unbuildable revisions in the tree if we can avoid it.

::: gfx/thebes/gfxDWriteFontList.h
@@ +378,5 @@
>      bool UseGDIFontTableAccess() { return mGDIFontTableAccess; }
>  
>      bool FindAndAddFamilies(const nsAString& aFamily,
>                              nsTArray<gfxFontFamily*>* aOutput,
> +                            bool aDeferOtherFamilyNamesLoading = true,

Like in the previous patch, please drop the default "= true" from this param (in all the overrides).
Attachment #8890917 - Flags: review?(jfkthame) → review+
Comment on attachment 8890916 [details] [diff] [review]
P1: put the task gfxPlatformFontList::InitOtherFamilyNames() into idle queue.

Review of attachment 8890916 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/thebes/gfxPlatformFontList.cpp
@@ +181,5 @@
>        mStartIndex(0), mIncrement(1), mNumFamilies(0), mFontlistInitCount(0),
>        mFontFamilyWhitelistActive(false)
>  {
>      mOtherFamilyNamesInitialized = false;
> +    CancelInitOtherFamilyNamesTask();

No, there is no need to have this one in constructor.

@@ +269,5 @@
>  
>      mFontFamilies.Clear();
>      mOtherFamilyNames.Clear();
>      mOtherFamilyNamesInitialized = false;
> +    CancelInitOtherFamilyNamesTask();

I will reorder this call.

::: gfx/thebes/gfxPlatformFontList.h
@@ +138,5 @@
>      // Return true if any match was found and appended, false if none.
>      virtual bool
>      FindAndAddFamilies(const nsAString& aFamily,
>                         nsTArray<gfxFontFamily*>* aOutput,
> +                       bool aDeferOtherFamilyNamesLoading = true,

I will update the function prototypes.
Attachment #8890916 - Attachment is obsolete: true
Attachment #8890917 - Attachment is obsolete: true
Attachment #8892334 - Attachment is obsolete: true
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e1d538cd1cf8
Put the task gfxPlatformFontList::InitOtherFamilyNames() into idle queue. r=jfkthame
Keywords: checkin-needed
Backed out for bustage on Windows: 'gfxPlatformFontList::InitOtherFamilyNamesRunnable::Cancel': overriding virtual function differs from 'mozilla::CancelableRunnable::Cancel' only by calling convention:

https://hg.mozilla.org/integration/mozilla-inbound/rev/d170cb10e24c5d5892e7dd937dadfa52e12f89cb

Push with bustage: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=e1d538cd1cf8a123a051224f6d73449a523cf99b&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel
Build log: https://treeherder.mozilla.org/logviewer.html#?job_id=120465923&repo=mozilla-inbound

18:39:56     INFO -  Unified_cpp_layout_style3.cpp
18:39:56     INFO -  z:\build\build\src\obj-firefox\dist\include\gfxPlatformFontList.h(293): error C2695: 'gfxPlatformFontList::InitOtherFamilyNamesRunnable::Cancel': overriding virtual function differs from 'mozilla::CancelableRunnable::Cancel' only by calling convention
18:39:56     INFO -  z:\build\build\src\obj-firefox\dist\include\nsThreadUtils.h(455): note: see declaration of 'mozilla::CancelableRunnable::Cancel'
18:39:56     INFO -  z:/build/build/src/config/rules.mk:1050: recipe for target 'Unified_cpp_layout_style3.obj' failed
18:39:56     INFO -  mozmake.EXE[5]: *** [Unified_cpp_layout_style3.obj] Error 2
Flags: needinfo?(hshih)
Attachment #8892342 - Attachment is obsolete: true
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a9acfba1bc3c
Put the task gfxPlatformFontList::InitOtherFamilyNames() into idle queue. r=jfkthame
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/a9acfba1bc3c
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Performance Impact: --- → P1
Whiteboard: [qf:p1][gfx-noted] → [gfx-noted]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: