Closed Bug 1786170 Opened 2 years ago Closed 2 years ago

"font-info-updated" notifications can cause an unexpected reframe during mochitests, which can cause intermittent test failures due to reframe-dependent state getting clobbered

Categories

(Core :: Graphics: Text, defect)

defect

Tracking

()

RESOLVED DUPLICATE of bug 1786088

People

(Reporter: dholbert, Unassigned)

References

Details

Attachments

(2 files)

STR:

  1. Run the included mochitest (essentially an extended variant of test_bug1785739.html ) on macOS (or Windows I think).

ACTUAL RESULTS:
The test fails at some arbitrary point.

EXPECTED RESULTS:
Test should pass.

I think the test is getting at the root issue in bug 1786088 (but in a way that's easier to repro).

:krosylight, maybe you can take a look? I suspect my test here (and bug 1786088) are exposing a real Gecko race condition of some sort. Hopefully you can repro fairly easily if you've got mac or windows available. (I've only tested locally on mac, and it's quite reliably reproducible there for me; and based on Try results, it seems to reproduce pretty reliably on Windows as well. But I haven't gotten it fail on Linux yet, so I'm not sure we'll be able to catch it in rr.)

Flags: needinfo?(krosylight)

Here's the patch with the new test that I mentioned in the STR.

It populates an input element with 1000 iterations of "abc" and then attempts to find each of them in order.

(This test tries to be a bit more graceful at handling errors than test_bug1785739.html -- if our position gets reset for some reason, this new test resets its expectations based on that new position. Otherwise every subtest after the first failure is trivially also a failure, and that just adds noise.)

Also, I should say: once we fix this bug and we're ready to land this test (or a version of it), we can probably lower its repeat-count, maybe to 20 or 50 or something like that.

I'm using 1000 repetitions right now since that makes the bug pretty-likely to occur, but it also makes the test's runtime ~tens of seconds, which would be wasteful to use in an actual landed test. (The actual count doesn't matter, I think; a higher-count just gives the bug a higher likelihood of being observed.)

Here's a Try run with the attached patch:
https://treeherder.mozilla.org/jobs?repo=try&revision=6c8097864e95082ce594483146850a8c6d395c7a

The TV oranges are failures in this included test, and so are these various regular M-[bucketNumber] oranges for macOS opt tasks:
https://treeherder.mozilla.org/logviewer?job_id=387980091&repo=try&lineNumber=2978
https://treeherder.mozilla.org/logviewer?job_id=387980097&repo=try&lineNumber=3195
https://treeherder.mozilla.org/logviewer?job_id=387980092&repo=try&lineNumber=3306

(I think there'll be failures on the equivalent windows mochitest tasks, too, though those haven't completed yet.)

(In the original patch-with-a-test that I posted, the new test is created as a hg-cp-with-modifications.

That might be troublesome if the test gets any edits/changes, though. To reduce hypothetical headache around that, here's a version with the test as a brand-new file instead.

Folks can use either of the attached patches to repro this bug; it doesn't matter which one, though this one might be more "portable" forward & backward in hg history.)

Attachment #9290771 - Attachment description: patch with mochitest to demonstrate bug → patch with mochitest to demonstrate bug (creating the mochitest as a "hg cp" with modifications)
Attachment #9290774 - Attachment description: alternate version of patch, with the new test as a brand-new file instead of "hg cp" → alternate patch with mochitest to demonstrate bug (creating the mochitest as a brand new file instead of a copy)

Looked at this on my own a bit, to satisfy my curiosity, and I think I figured it out.

It looks like shortly after the document loads (while the test is partway through its iterative traversal of all the matches), we receive a "font-info-updated" notification via the observer service, and that forces us to reframe (the whole document I think?) via this code:
https://searchfox.org/mozilla-central/rev/aeddc3f568de22ad445b4999713ab3a28c3b7a93/layout/base/PresShell.cpp#9904-9907

if (!nsCRT::strcmp(aTopic, "font-info-updated")) {
  // See how gfxPlatform::ForceGlobalReflow encodes this.
  bool needsReframe = aData && !!aData[0];
  mPresContext->ForceReflowForFontInfoUpdate(needsReframe);

It's to-be-expected (though unfortunate) that reframing causes us to reset some of the page-state. And the precise find-in-page location is one of the things that we reset, I think.

So: immediately after we receive this notification, the next iteration of the testcase's loop gets its position reset to the start of the textfield, which causes a test failure.

Flags: needinfo?(krosylight)

emilio/jfkthame, see comment 5 (I think you both know a bit about this signal).

So the good news here is that this probably has little-to-no user impact. But it is a potential source of intermittent-failures (e.g. bug 1786088), so we should figure out some strategy to avoid it. (Maybe we already do?)

e.g. can mochitests suppress/delay this signal in some way, or detect the fact that it's been it's sent and gracefully handle that condition (e.g. restart/adjust their testing) to avoid intermittently failing?

Component: DOM: Selection → Graphics: Text
Summary: Mochitest test_bug1785739.html reliably fails on macOS and Windows if it's extended/hardened → If "font-info-updated" notifications cause random reframes during mochitests, which can cause intermittent test failures due to reframe-dependent state getting clobbered
Summary: If "font-info-updated" notifications cause random reframes during mochitests, which can cause intermittent test failures due to reframe-dependent state getting clobbered → "font-info-updated" notifications can cause an unexpected reframe during mochitests, which can cause intermittent test failures due to reframe-dependent state getting clobbered

If you set gfx.font_loader.delay to zero, does that affect the reproducibility?

Yes, that seems to avoid the issue - thanks!

I see we do that in https://searchfox.org/mozilla-central/rev/5f1fc7348a9fb1189fe1c2139d68f928f8cfe510/editor/spellchecker/tests/mochitest.ini#1-4 - probably we need that in this directory as well then, and that should address bug 1786088.

In particular, maybe copypasting this snippet would be the right / most-targeted thing to do here -- I'll try that:
https://searchfox.org/mozilla-central/rev/5f1fc7348a9fb1189fe1c2139d68f928f8cfe510/dom/events/test/test_bug1710509.html#33-41

Actually it looks like changing the pref value manually in the test is forbidden -- it triggers:

Assertion failure: staticPrefValue == preferenceValue (Preference 'gfx.font_loader.delay' got modified since StaticPrefs::gfx_font_loader_delay_At
Startup was initialized. Consider using an always mirror kind instead), obj-debug-opt/dist/include/mozilla/StaticPrefList_gfx.h:400

We have two tests that attempt to do that (the one I found in comment 9 and one other), but it's not a problem for them because the pref is also set in their mochitest.ini directory, so they're not actually changing it.

I'll spin off a separate bug to remove that from those tests, since it has no effect and it mistakenly gives the impact that it's a valid way to set this pref.

I'm going to just dupe this over to bug 1786088 and post a patch (setting the prefs) over there.

(I don't think there's any reason to land a version of my posted STR-test-patch here at this point. My STR-test-patch here is essentially just good at sniffing out this "font-info-updated" event, and the test doesn't serve much purpose if we nerf/eagerly-flush that event.)

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → DUPLICATE

(In reply to Daniel Holbert [:dholbert] from comment #10)

Actually it looks like changing the pref value manually in the test is forbidden [...]
We have two tests that attempt to do that [...]
I'll spin off a separate bug to remove that from those tests

Spun off bug 1786209

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: