Closed Bug 1694123 Opened 5 years ago Closed 5 years ago

The font will fallback to Segoe UI with src:local() when the platform font list is reinitialized

Categories

(Core :: Layout: Text and Fonts, defect)

Firefox 87
x86_64
Windows 10
defect

Tracking

()

VERIFIED FIXED
89 Branch
Fission Milestone M7a
Tracking Status
firefox-esr78 --- wontfix
firefox86 --- wontfix
firefox87 --- wontfix
firefox88 --- wontfix
firefox89 --- verified
firefox90 --- verified

People

(Reporter: over68, Assigned: jfkthame)

References

(Regression)

Details

(Keywords: regression)

Attachments

(2 files)

Steps to reproduce:

  1. Download Font Loader.
  2. Download and extracts the archive.
  3. Open this page.
  4. Open the Font Loader, Click on the Add Fonts button, Select the extracted font files then click Open.
  5. Click on the Load button.

See https://youtu.be/5A0oqFNDCSc

Actual results:

The font will fallback to Segoe UI with src:local(...) when the platform font list is reinitialized, see screenshot.

I think this is a regression from bug 1545177 or bug 1580690.

Blocks: 1533462
Keywords: regression

:over68, since this bug is a regression, could you fill (if possible) the regressed_by field?
For more information, please visit auto_nag documentation.

Flags: needinfo?(over68)
Blocks: 1694174

OK, I finally understand what's happening here. When a new font is activated by Font Loader, we get a WM_FONTCHANGED notification from Windows, and that triggers rebuilding the platform font list. When the parent process has rebuilt its list, it notifies content processes to reflow their content using the new font collection.

The problem here arises because multiple fonts are being activated in quick succession, resulting in a series of WM_FONTCHANGED notifications. (They don't get batched.) And the content is very long, such that its reflow takes a substantial amount of time. So while the content process is reflowing in response to the first update, a new font-list update happens. This invalidates the old copy of the list that the content process is using, and its src:local() lookups start failing. (Specifically, the content process tries to send InitializeFamily messages to the parent for the families it wants to look at, but the generation field in the message doesn't match and so the parent discards them, and the content process sees an unusable font so it ignores it.)

Eventually, the last WM_FONTCHANGED notification will be received and handled, and content processes will again be told to reflow, so things should settle down to the correct state. But unfortunately, the gfxUserFontEntry records in the content proc's user font set have already tried to find their src:local() font, failed, and moved on to the next entry in their source list. (There isn't one, in this case, so the @font-face rules become effectively "dead"; they have no usable source.)

We try to ensure that src:local() fonts will be re-evaluated after a font list update; the bug is that in this case, where the rule did not load a local font (because of the transient mismatch in font-list generations), we fail to reset its state and so from then on, the rule won't work. We need to reset the loading state of all user font records in gfxUserFontSet::ForgetLocalFaces(): not only the ones that did load a local face but also other rules that may have tried and failed to do so.

Assignee: nobody → jfkthame
Severity: -- → S3
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(over68)
Regressed by: 1580690
Has Regression Range: --- → yes
Fission Milestone: --- → M7a
Pushed by jkew@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/bd78ccc10d70 Ensure user fonts with src:local() are refreshed after a font-list update, even if the local font lookup previously failed. r=jwatt

Backed out for causing browsertime failures

backout: https://hg.mozilla.org/integration/autoland/rev/abe438fff5bab3037733a133e1c6f7809541449e

push: https://treeherder.mozilla.org/jobs?repo=autoland&group_state=expanded&searchStr=linux%2C18.04%2Cx64%2Cwebrender%2Cshippable%2Copt%2Cbrowsertime%2Cperformance%2Ctests%2Con%2Cfirefox%2Ctest-linux1804-64-shippable-qr%2Fopt-browsertime-tp6-essential-firefox-google-mail-e10s%2Cgmail&revision=bd78ccc10d704a29bccc83a3edd977cd99f28027

failure log: https://treeherder.mozilla.org/logviewer?job_id=334602664&repo=autoland&lineNumber=1023

[task 2021-03-26T22:05:38.852Z] 22:05:38 INFO - raptor-browsertime Info: Start firefox window recorder.
[task 2021-03-26T22:05:44.340Z] 22:05:44 ERROR - raptor-browsertime Error: Browsertime failed to run
[task 2021-03-26T22:05:44.361Z] 22:05:44 INFO - raptor-browsertime Info: b' at Object.throwDecodedError (/home/cltbld/tasks/task_161679610263557/fetches/browsertime/node_modules/selenium-webdriver/lib/error.js:517:15)'
[task 2021-03-26T22:05:44.361Z] 22:05:44 INFO - raptor-browsertime Info: b' at parseHttpResponse (/home/cltbld/tasks/task_161679610263557/fetches/browsertime/node_modules/selenium-webdriver/lib/http.js:655:13)'
[task 2021-03-26T22:05:44.362Z] 22:05:44 INFO - raptor-browsertime Info: b' at Executor.execute (/home/cltbld/tasks/task_161679610263557/fetches/browsertime/node_modules/selenium-webdriver/lib/http.js:581:28)'
[task 2021-03-26T22:05:44.362Z] 22:05:44 INFO - raptor-browsertime Info: b' at process._tickCallback (internal/process/next_tick.js:68:7)'
[task 2021-03-26T22:05:44.362Z] 22:05:44 INFO - raptor-browsertime Info: URL failed to load, trying 5 more time(s): Failed waiting on page https://mail.google.com/ to finished loading, timed out after 300000 ms
[task 2021-03-26T22:05:44.363Z] 22:05:44 CRITICAL - raptor-browsertime Critical: Failed waiting on page https://mail.google.com/ to finished loading, timed out after 300000 ms WebDriverError: Failed to decode response from marionette
[task 2021-03-26T22:05:44.363Z] 22:05:44 INFO - raptor-mitmproxy Info: MitmproxyDesktop stop!!
[task 2021-03-26T22:05:44.363Z] 22:05:44 INFO - raptor-mitmproxy Info: Mitmproxy stop!!
[task 2021-03-26T22:05:44.363Z] 22:05:44 INFO - raptor-mitmproxy Info: Stopping mitmproxy playback, killing process 2407
[task 2021-03-26T22:05:44.489Z] 22:05:44 INFO - raptor-mitmproxy Info: Successfully killed the mitmproxy playback process
[task 2021-03-26T22:05:44.490Z] 22:05:44 INFO - raptor-mitmproxy Info: Turning off the browser proxy
[task 2021-03-26T22:05:44.491Z] 22:05:44 INFO - raptor-mitmproxy Info: writing: /home/cltbld/tasks/task_161679610263557/build/application/firefox/distribution/policies.json

Firstly seen as bulk failures which led to backfills stopping on ^

Flags: needinfo?(jfkthame)

Some browsertime failures showed up as a result of LoadCanceled() racing with the off-main-thread font sanitization, which could result in mSrcIndex being wrong by the time the sanitization finished and we go to cache the resulting font entry. Fixed by not resetting mSrcIndex until the start of the next loading request, instead of during LoadCanceled.

Flags: needinfo?(jfkthame)
Pushed by jkew@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/97c97bbc0691 Ensure user fonts with src:local() are refreshed after a font-list update, even if the local font lookup previously failed. r=jwatt
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → 89 Branch

Since the status are different for nightly and release, what's the status for beta?
For more information, please visit auto_nag documentation.

This is preffed-off for late beta/release, so not critical to fix there.

Regressions: 1702865
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: 89 Branch → ---
Flags: needinfo?(jfkthame)

Steps to reproduce bug 1702865:

  1. Download Font Loader.
  2. Download and extracts the archive.
  3. Open this page.
  4. Open the Font Loader, Click on the Add Fonts button, Select the extracted font files then click Open.
  5. Click on the Load button then Unload.

Actual results:

The tab crashed.

Crash report: bp-4d19a79b-0b1d-4202-818e-a0d5d0210406

Reason: EXCEPTION_ACCESS_VIOLATION_READ

Top 10 frames of crashing thread:

0 xul.dll static gfxUserFontSet::UserFontCache::CacheFont gfx/thebes/gfxUserFontSet.cpp:1215
1 xul.dll gfxUserFontEntry::LoadPlatformFont gfx/thebes/gfxUserFontSet.cpp:768
2 xul.dll gfxUserFontEntry::ContinuePlatformFontLoadOnMainThread gfx/thebes/gfxUserFontSet.cpp:879
3 xul.dll static mozilla::detail::RunnableMethodArguments<const unsigned char*, unsigned int, gfxUserFontType, const unsigned char*, unsigned int, nsTArray<gfxUserFontEntry::OTSMessage>&&, nsMainThreadPtrHandle<nsIFontLoadCompleteCallback> >::applyImpl<gfxUserFontEntry, void  xpcom/threads/nsThreadUtils.h:1148
4 xul.dll mozilla::detail::RunnableMethodImpl<gfxUserFontEntry*, void  xpcom/threads/nsThreadUtils.h:1201
5 xul.dll mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal xpcom/threads/TaskController.cpp:754
6 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1155
7 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:87
8 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:328
9 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:310

This is a preliminary step that is needed in order for the following patch to be safe. (It's
mostly just plumbing, although the patch looks big because we need to pass the index through
so many functions.)

The issue is that loading a font resource involves a couple of tasks that happen asynchronously -
fetching the data from the network, and decoding/sanitizing the font data. We have an implicit
assumption that once a font load has been initiated, the state of the associated gfxUserFontEntry
will not change; in particular, we depend on mSrcIndex remaining constant, because we may use it
to index into the mSrcList array and access the gfxFontFaceSrc record again in the various
completion tasks that are executed after the async steps finish.

But the following patch breaks this assumption, because it may reset mSrcIndex at arbitrary times
when we discover that we need to re-resolve the @font-face to potentially recognize a src:local()
resource that was earlier in the list, but was previously unavailable. If this happens while
a font-load is doing its off-main-thread work, then when it completes, it will end up accessing
the wrong gfxFontFaceSrc record, which would result at least in incorrect behavior, but can also
result in a crash (e.g. if the record is of the wrong type altogether, such as trying to use the
principal or URI fields from a record of type src:local).

To avoid this problem, we should pass the source index at the time the font load is initiated
through to the OMT tasks and back to their main-thread completion routines, so that we do not
depend on the field in the gfxUserFontEntry remaining frozen. This makes the loading process
safe even if the source index in the entry gets reset while loading tasks are in progress.

Pushed by jkew@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/43764851d067 Pass userfont source index to font loader and decoder tasks, to avoid potential race with the field in the gfxUserFontEntry being reset on the main thread. r=layout-reviewers,emilio https://hg.mozilla.org/integration/autoland/rev/4f9d4d1aaa3e Ensure user fonts with src:local() are refreshed after a font-list update, even if the local font lookup previously failed. r=jwatt
Status: REOPENED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → FIXED
Target Milestone: --- → 89 Branch
Depends on: 1703614
Regressions: 1703614
QA Whiteboard: [qa-89b-p2]
Flags: needinfo?(jfkthame)

Hello,
I have reproduced the issue using STR from comment 0, on an affected Nightly build 2021-03-21.

The fix was verified using the latest Nightly 90.0a1 (2021-05-11) and Firefox 89.0b11 (2021-05-11) on Windows 10x64.
The issue is not reproducing anymore.

Thanks!

Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: