Fast repeated spellcheck dictionary updates fail assertions in `SetFallbackDictionary`
Categories
(Core :: Spelling checker, defect, P3)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox108 | --- | affected |
People
(Reporter: Gijs, Assigned: aiunusov)
References
Details
SetFallbackDictionary asserts that mUpdateDictionaryRunning. This assumes that only 1 update can run at a time. This assertion fails if two updates are made to run: https://treeherder.mozilla.org/jobs?repo=try&revision=b714c24881ec2c1791bfa5d31b713ee19a15f625&selectedTaskRun=J_dd3AIJRyGQV6ohzom4xQ.0 .
Logging in BeginUpdateDictionary and EndUpdateDictionary produces:
0:31.19 GECKO(4140) Started updating dict.
0:31.19 GECKO(4140) Started updating dict.
0:31.19 GECKO(4140) Finished updating dict.
After which we assert and thus crash the debug build.
I am not very familiar with this code, but from a brief look it would seem that:
- https://searchfox.org/mozilla-central/rev/73ff5f1ac45202c4621a1bcee4b35d20027e71ad/editor/spellchecker/EditorSpellCheck.cpp#728-729,770-771 is an interface-exposed method that doesn't protect against re-entry
- that method will initialize a
DictionaryFetcherwith an async completion callback - once the fetcher completes we end up in the
DictionaryFetchedcallback, which does have a re-entry protection: https://searchfox.org/mozilla-central/rev/73ff5f1ac45202c4621a1bcee4b35d20027e71ad/editor/spellchecker/EditorSpellCheck.cpp#827-835 which says:
if (aFetcher->mGroup < mDictionaryFetcherGroup) {
// SetCurrentDictionary was called after the fetch started. Don't overwrite
// that dictionary with the fetched one.
EndUpdateDictionary();
if (aFetcher->mCallback) {
aFetcher->mCallback->EditorSpellCheckDone();
}
return NS_OK;
}
- if the second attempt to update the dictionary (which also hits this code but where
mGroup < mDictionaryFetcherGroupis not the case) then continues down this function and fails, it will invokeSetFallbackDictionaryhere and that will assert.
It's not 100% clear to me what the correct solution is supposed to be here, because although there is a comment about the point of mUpdateDictionaryRunning at https://searchfox.org/mozilla-central/rev/73ff5f1ac45202c4621a1bcee4b35d20027e71ad/editor/spellchecker/EditorSpellCheck.cpp#595-599 , I can't say I really follow it. So I can't tell which of the following (non-exhaustive) list of options we should pursue:
- get rid of the assert
- make
mUpdateDictionaryRunninga nesting level counter instead of a bool (so we validate we have an equal number of entries and exits) - avoid starting a second parallel call to
UpdateCurrentDictionary
:dminor or :adw, can you help?
| Reporter | ||
Comment 1•3 years ago
|
||
Meant to needinfo for comment 0.
| Reporter | ||
Comment 2•3 years ago
|
||
(note that I'm not 100% sure that the scenario/steps in comment 0 are what is causing the bug, because attempting to do stack logging from the begin/end update calls slowed things down sufficiently to stop reproducing the TV bug on my machine. 🙃)
Comment 3•3 years ago
|
||
It's been many years since I worked on this and I don't have the context anymore, so I can't be much help without spending a lot of time on it. I don't remember anything about who calls what and so forth. I did write the first sentence of that comment though, which I guess means if DictionaryFetched() calls that method, then mUpdateDictionaryRunning is true, so we can avoid doing a bunch of work. It sounds like an optimization. I don't know about the second sentence.
Comment 4•2 years ago
|
||
Although I made some changes around this code for the multiple dictionary support, I didn't look at the logic here in detail, so I don't think I can be much help on this either.
Updated•2 years ago
|
Updated•2 years ago
|
| Assignee | ||
Updated•2 years ago
|
| Assignee | ||
Comment 5•2 years ago
|
||
I could not reproduce this issue,
but suppose we can simply check if mUpdateDictionaryRunning at top level method UpdateCurrentDictionary(), and return NS_OK (to prevent calling it twice):
https://searchfox.org/mozilla-central/source/editor/spellchecker/EditorSpellCheck.cpp#729
I'm not sure about the consistency of the mUpdateDictionaryRunning variable, I see at least an attempt to do it correctly, but this class does not seem to be used (dead code): https://searchfox.org/mozilla-central/source/editor/spellchecker/EditorSpellCheck.cpp#58
| Assignee | ||
Comment 6•2 years ago
|
||
I guess I know ho to fix it. But still could not reproduce it.
If you remember how to reproduce the issue, I would appreciate it.
| Reporter | ||
Comment 7•2 years ago
|
||
(In reply to Artur Iunusov from comment #6)
I guess I know ho to fix it. But still could not reproduce it.
If you remember how to reproduce the issue, I would appreciate it.
Remove the skip-if at https://searchfox.org/mozilla-central/rev/9de332d5c8faac58dc1232b8a6383ce6cb1400f4/browser/base/content/test/contextMenu/browser.ini#50-52 and run ./mach test browser_contextmenu_spellcheck.js on a debug build should do it, if nothing has changed since October. Does that reproduce for you or not? :-)
Comment 8•2 years ago
|
||
I can't really remember this code. What would it mean if we returned early in UpdateCurrentDictionary() ? What would happen to that callback parameter? Someone needs to call it at some point if the method returns NS_OK.
Could we somehow serialize the calls? Could EditorSpellCheck have a list/array for pending updates and process those one by one?
Or could the method return a Promise? And it would return the same promise until a fetch is done. One would just call .then() on it to get the callback.
I don't quite understand the need for multiple call to the method though.
(Feel free to file a new bug about that dead code ;) )
| Assignee | ||
Comment 9•2 years ago
|
||
(In reply to :Gijs (he/him) from comment #7)
(In reply to Artur Iunusov from comment #6)
I guess I know ho to fix it. But still could not reproduce it.
If you remember how to reproduce the issue, I would appreciate it.Remove the skip-if at https://searchfox.org/mozilla-central/rev/9de332d5c8faac58dc1232b8a6383ce6cb1400f4/browser/base/content/test/contextMenu/browser.ini#50-52 and run
./mach test browser_contextmenu_spellcheck.json a debug build should do it, if nothing has changed since October. Does that reproduce for you or not? :-)
Thanks for the hint! I guess this test should help to reproduce.
| Assignee | ||
Comment 10•2 years ago
•
|
||
Update: after multiple test runs, still could not reproduce assert.
(Just double checked MOZ_ASSERT(false) leads to failure, so, I have a proper debug build)
| Reporter | ||
Comment 11•2 years ago
|
||
(In reply to Artur Iunusov from comment #10)
Update: after multiple test runs, still could not reproduce assert.
(Just double checked MOZ_ASSERT(false) leads to failure, so, I have a proper debug build)
Huh, yeah. Looks like try is happy too: https://treeherder.mozilla.org/jobs?repo=try&revision=e555604427a9beecd417369fe0641128036bf676&selectedTaskRun=K8P3oyuITgiB3UjfaFum2Q.0
Maybe the debug skip-if can just be removed at this point? Part of me would like to understand what fixed it, but doing the bisection seems potentially not worth it.
| Assignee | ||
Comment 12•9 months ago
|
||
need to remove debug skip-if and check again
| Assignee | ||
Comment 13•7 months ago
•
|
||
./mach test ./browser/base/content/test/contextMenu/browser_contextmenu_spellcheck.js
mochitest-browser
Ran 689 checks (688 subtests, 1 tests)
Expected results: 689
Unexpected results: 0
OK
cat mozconfig
ac_add_options --enable-debug
| Assignee | ||
Comment 14•7 months ago
|
||
Results on my machine (which is Windows 11 on core i9)
But according to the code I understand why it could fail after enabling it.
So, I agree to remove an assert and enable the test or use int counter insted of bool
Description
•