"font-info-updated" notifications can cause an unexpected reframe during mochitests, which can cause intermittent test failures due to reframe-dependent state getting clobbered
Categories
(Core :: Graphics: Text, defect)
Tracking
()
People
(Reporter: dholbert, Unassigned)
References
Details
Attachments
(2 files)
3.00 KB,
patch
|
Details | Diff | Splinter Review | |
2.67 KB,
text/plain
|
Details |
STR:
- Run the included mochitest (essentially an extended variant of test_bug1785739.html ) on macOS (or Windows I think).
ACTUAL RESULTS:
The test fails at some arbitrary point.
EXPECTED RESULTS:
Test should pass.
I think the test is getting at the root issue in bug 1786088 (but in a way that's easier to repro).
:krosylight, maybe you can take a look? I suspect my test here (and bug 1786088) are exposing a real Gecko race condition of some sort. Hopefully you can repro fairly easily if you've got mac or windows available. (I've only tested locally on mac, and it's quite reliably reproducible there for me; and based on Try results, it seems to reproduce pretty reliably on Windows as well. But I haven't gotten it fail on Linux yet, so I'm not sure we'll be able to catch it in rr
.)
Reporter | ||
Comment 1•2 years ago
•
|
||
Here's the patch with the new test that I mentioned in the STR.
It populates an input element with 1000 iterations of "abc" and then attempts to find each of them in order.
(This test tries to be a bit more graceful at handling errors than test_bug1785739.html
-- if our position gets reset for some reason, this new test resets its expectations based on that new position. Otherwise every subtest after the first failure is trivially also a failure, and that just adds noise.)
Reporter | ||
Comment 2•2 years ago
|
||
Also, I should say: once we fix this bug and we're ready to land this test (or a version of it), we can probably lower its repeat-count, maybe to 20 or 50 or something like that.
I'm using 1000 repetitions right now since that makes the bug pretty-likely to occur, but it also makes the test's runtime ~tens of seconds, which would be wasteful to use in an actual landed test. (The actual count doesn't matter, I think; a higher-count just gives the bug a higher likelihood of being observed.)
Reporter | ||
Comment 3•2 years ago
|
||
Here's a Try run with the attached patch:
https://treeherder.mozilla.org/jobs?repo=try&revision=6c8097864e95082ce594483146850a8c6d395c7a
The TV
oranges are failures in this included test, and so are these various regular M-[bucketNumber]
oranges for macOS opt tasks:
https://treeherder.mozilla.org/logviewer?job_id=387980091&repo=try&lineNumber=2978
https://treeherder.mozilla.org/logviewer?job_id=387980097&repo=try&lineNumber=3195
https://treeherder.mozilla.org/logviewer?job_id=387980092&repo=try&lineNumber=3306
(I think there'll be failures on the equivalent windows mochitest tasks, too, though those haven't completed yet.)
Reporter | ||
Comment 4•2 years ago
|
||
(In the original patch-with-a-test that I posted, the new test is created as a hg-cp-with-modifications.
That might be troublesome if the test gets any edits/changes, though. To reduce hypothetical headache around that, here's a version with the test as a brand-new file instead.
Folks can use either of the attached patches to repro this bug; it doesn't matter which one, though this one might be more "portable" forward & backward in hg history.)
Reporter | ||
Updated•2 years ago
|
Reporter | ||
Updated•2 years ago
|
Reporter | ||
Comment 5•2 years ago
|
||
Looked at this on my own a bit, to satisfy my curiosity, and I think I figured it out.
It looks like shortly after the document loads (while the test is partway through its iterative traversal of all the matches), we receive a "font-info-updated" notification via the observer service, and that forces us to reframe (the whole document I think?) via this code:
https://searchfox.org/mozilla-central/rev/aeddc3f568de22ad445b4999713ab3a28c3b7a93/layout/base/PresShell.cpp#9904-9907
if (!nsCRT::strcmp(aTopic, "font-info-updated")) {
// See how gfxPlatform::ForceGlobalReflow encodes this.
bool needsReframe = aData && !!aData[0];
mPresContext->ForceReflowForFontInfoUpdate(needsReframe);
It's to-be-expected (though unfortunate) that reframing causes us to reset some of the page-state. And the precise find-in-page location is one of the things that we reset, I think.
So: immediately after we receive this notification, the next iteration of the testcase's loop gets its position reset to the start of the textfield, which causes a test failure.
Reporter | ||
Comment 6•2 years ago
|
||
emilio/jfkthame, see comment 5 (I think you both know a bit about this signal).
So the good news here is that this probably has little-to-no user impact. But it is a potential source of intermittent-failures (e.g. bug 1786088), so we should figure out some strategy to avoid it. (Maybe we already do?)
e.g. can mochitests suppress/delay this signal in some way, or detect the fact that it's been it's sent and gracefully handle that condition (e.g. restart/adjust their testing) to avoid intermittently failing?
Reporter | ||
Updated•2 years ago
|
Comment 7•2 years ago
|
||
If you set gfx.font_loader.delay to zero, does that affect the reproducibility?
Reporter | ||
Comment 8•2 years ago
•
|
||
Yes, that seems to avoid the issue - thanks!
I see we do that in https://searchfox.org/mozilla-central/rev/5f1fc7348a9fb1189fe1c2139d68f928f8cfe510/editor/spellchecker/tests/mochitest.ini#1-4 - probably we need that in this directory as well then, and that should address bug 1786088.
Reporter | ||
Comment 9•2 years ago
•
|
||
In particular, maybe copypasting this snippet would be the right / most-targeted thing to do here -- I'll try that:
https://searchfox.org/mozilla-central/rev/5f1fc7348a9fb1189fe1c2139d68f928f8cfe510/dom/events/test/test_bug1710509.html#33-41
Reporter | ||
Comment 10•2 years ago
|
||
Actually it looks like changing the pref value manually in the test is forbidden -- it triggers:
Assertion failure: staticPrefValue == preferenceValue (Preference 'gfx.font_loader.delay' got modified since StaticPrefs::gfx_font_loader_delay_At
Startup was initialized. Consider using analways
mirror kind instead), obj-debug-opt/dist/include/mozilla/StaticPrefList_gfx.h:400
We have two tests that attempt to do that (the one I found in comment 9 and one other), but it's not a problem for them because the pref is also set in their mochitest.ini directory, so they're not actually changing it.
I'll spin off a separate bug to remove that from those tests, since it has no effect and it mistakenly gives the impact that it's a valid way to set this pref.
Reporter | ||
Comment 11•2 years ago
|
||
I'm going to just dupe this over to bug 1786088 and post a patch (setting the prefs) over there.
(I don't think there's any reason to land a version of my posted STR-test-patch here at this point. My STR-test-patch here is essentially just good at sniffing out this "font-info-updated" event, and the test doesn't serve much purpose if we nerf/eagerly-flush that event.)
Reporter | ||
Comment 12•2 years ago
|
||
(In reply to Daniel Holbert [:dholbert] from comment #10)
Actually it looks like changing the pref value manually in the test is forbidden [...]
We have two tests that attempt to do that [...]
I'll spin off a separate bug to remove that from those tests
Spun off bug 1786209
Description
•