Closed Bug 1687263 Opened 8 months ago Closed 4 months ago

spelling checker underline flipping/blinking, 1 core at 100%

Categories

(Core :: Spelling checker, defect, P2)

defect

Tracking

()

VERIFIED FIXED
90 Branch
Tracking Status
firefox90 --- verified

People

(Reporter: aryx, Assigned: mbrodesser)

References

Details

Attachments

(7 files)

Firefox 86.0a1 20210117214903 on Windows 8.1
Not a new regression (could also reproduce with a Nightly from 2020-07-10).

Under some circumstances, the spelling checker underline blinks instead of resting a a solid waved red underline. This causes the thread to use 1 cpu core at 100%.
Profile recording: https://share.firefox.dev/3sAucUY

Steps to reproduce:

  1. Open https://crash-stats.mozilla.org/report/index/6727ca76-934b-457c-b575-10c490210116#tab-bugzilla
  2. Click on "Bugzilla" tab.
  3. Click "Core".
  4. Click into the description area.

The blinking underline slowly moves from underlined text to the next one but can keep a single underline blinking for several seconds

hmm, I can't reproduce.
I wonder what is happening here.

Can you reproduce this reliably?

Flags: needinfo?(aryx.bugmail)

Yes. Also seeing it basically daily when I report new crashes or create bugs for intermittent failures.

Flags: needinfo?(aryx.bugmail)

Do you have any addons which might somehow tweak selection?

This reproduces with new profiles on Windows 8.1 (tested 32-bit and 64-bit Nightly builds) and on Windows 10 (tested only 64-bit). For the latter, reproducing took more attempts with changing focus away from the textarea and back to it (6-10) while Windows 8.1 reproduces it very quickly (1-2 attempts; I edit bug 1692160 comment 0 and click left of the textarea and back into it). The frequency of the underline blinking looks slower on Windows 10 but this might just be the refresh rate and the spellcheck status update being more aligned.

Mirko, can you reproduce this? And if yes, could you perhaps check what changes some selection or range object
(I think that might be happening).

Flags: needinfo?(mbrodesser)

Can reproduce the flickering, it happens after the first click into the textarea, perhaps multiple clicks prevent this from happening. But no CPU Core is at 100%.

Used Ubuntu 18.04.

I'll first finish the other tickets I'm currently working on. Can have a look again afterwards.

Flags: needinfo?(mbrodesser)
Flags: needinfo?(mbrodesser)
Assignee: nobody → mbrodesser
Flags: needinfo?(mbrodesser)
Attached file x.html

Smaller example.

See Also: → 1502661
Depends on: 1698786
Severity: -- → S3
Priority: -- → P2

The issue occurs with and without Fission on Nightly. I can't reproduce it with Firefox 86.0 (release).

If this is a regression, bi-secting it seems not promising, because the issue doesn't occur reliably.

Trying to find the regression range might then be rather useful still.

With a debug build, mozInlineSpellChecker sometimes can just check five words per time-slice:

[Child 20327: Main Thread]: D/InlineSpellChecker BuildSoftText: got DOM string: _yuv gfx/wr/swgl/src/composite.h:8531 xul.dll CompositeYUV gfx/wr/swgl/src/composite.h:9152 xul.dll webrender_bindings::swgl_bindings::SwCompositeThread::process_job gfx/webrender_bindings/src/swgl_bindings.rs:7493 xul.dll std::sys_common::backtrace::__rust_begin_short_backtrace<closure-0, tuple<>> ../e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:1314 xul.dll core::ops::function::FnOnce::call_once<closure-0, tuple<>> ../e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/ops/function.rs:2275 xul.dll std::sys::windows::thread::{{impl}}::new::thread_start ../e1884a8e3c3e813aada8254edfa120e85bf5ffca//library/std/src/sys/windows/thread.rs:566 kernel32.dll BaseThreadInitThunk7 mozglue.dll patched_BaseThreadInitThunk mozglue/dllservices/WindowsDllBlocklist.cpp:5918 ntdll.dll RtlUserThreadStart9 kernelbase.dll TerminateProcessOnMemoryExhaustion```  
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: mNextWordIndex=0
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: returning: yuv (skip=0)
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: got word "yuv"
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: removing ranges for some interval.
[Child 20327: Main Thread]: D/InlineSpellChecker RemoveRange
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: mNextWordIndex=1
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: returning: gfx (skip=0)
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: got word "gfx"
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: removing ranges for some interval.
[Child 20327: Main Thread]: D/InlineSpellChecker RemoveRange
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: mNextWordIndex=2
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: returning: wr (skip=0)
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: got word "wr"
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: removing ranges for some interval.
[Child 20327: Main Thread]: D/InlineSpellChecker RemoveRange
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: mNextWordIndex=3
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: returning: swgl (skip=0)
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: got word "swgl"
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: removing ranges for some interval.
[Child 20327: Main Thread]: D/InlineSpellChecker RemoveRange
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: mNextWordIndex=4
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: returning: src (skip=0)
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: got word "src"
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: removing ranges for some interval.
[Child 20327: Main Thread]: D/InlineSpellChecker RemoveRange
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: mNextWordIndex=5
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: returning: composite (skip=0)
[Child 20327: Main Thread]: V/InlineSpellChecker DoSpellCheck: we have run out of time, schedule next round.
[Child 20327: Main Thread]: D/InlineSpellChecker ScheduleSpellCheck
[Child 20327: Main Thread]: D/InlineSpellChecker ResumeCheck
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck
[Child 20327: Main Thread]: D/InlineSpellChecker BuildSoftText: got DOM string: _yuv gfx/wr/swgl/src/composite.h:8531 xul.dll CompositeYUV gfx/wr/swgl/src/composite.h:9152 xul.dll webrender_bindings::swgl_bindings::SwCompositeThread::process_job gfx/webrender_bindings/src/swgl_bindings.rs:7493 xul.dll std::sys_common::backtrace::__rust_begin_short_backtrace<closure-0, tuple<>> ../e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:1314 xul.dll core::ops::function::FnOnce::call_once<closure-0, tuple<>> ../e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/ops/function.rs:2275 xul.dll std::sys::windows::thread::{{impl}}::new::thread_start ../e1884a8e3c3e813aada8254edfa120e85bf5ffca//library/std/src/sys/windows/thread.rs:566 kernel32.dll BaseThreadInitThunk7 mozglue.dll patched_BaseThreadInitThunk mozglue/dllservices/WindowsDllBlocklist.cpp:5918 ntdll.dll RtlUserThreadStart9 kernelbase.dll TerminateProcessOnMemoryExhaustion```  
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: mNextWordIndex=0
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: returning: yuv (skip=0)
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: got word "yuv"
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: removing ranges for some interval.
[Child 20327: Main Thread]: D/InlineSpellChecker RemoveRange
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: mNextWordIndex=1
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: returning: gfx (skip=0)
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: got word "gfx"
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: removing ranges for some interval.
[Child 20327: Main Thread]: D/InlineSpellChecker RemoveRange
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: mNextWordIndex=2
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: returning: wr (skip=0)
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: got word "wr"
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: removing ranges for some interval.
[Child 20327: Main Thread]: D/InlineSpellChecker RemoveRange
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: mNextWordIndex=3
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: returning: swgl (skip=0)
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: got word "swgl"
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: removing ranges for some interval.
[Child 20327: Main Thread]: D/InlineSpellChecker RemoveRange
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: mNextWordIndex=4
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: returning: src (skip=0)
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: got word "src"
[Child 20327: Main Thread]: D/InlineSpellChecker DoSpellCheck: removing ranges for some interval.
[Child 20327: Main Thread]: D/InlineSpellChecker RemoveRange
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: mNextWordIndex=5
[Child 20327: Main Thread]: D/InlineSpellChecker GetNextWord: returning: composite (skip=0)
[Child 20327: Main Thread]: V/InlineSpellChecker DoSpellCheck: we have run out of time, schedule next round.

As above log shows, when resuming the check, it starts at the first word again. This seems to lead to the flickering.

I don't know why:
1. only so few checks are performed during one time-slice.
2. when resuming the check, not the next word is chosen.

To fix this issue, both questions deserve investigation.

This seems related to bug 1502661.

Reproducible with current Firefox beta too. Will bisect it there.

Edit: unfortunately not reproducible when using mozregression-gui.

See Also: → 1354641
See Also: → 1362858

What's described in c12 seems similar to what's described in bug 1362858#c11.

Attached file x2.html

More concise example.

I couldn't minimize the example, but a necessary factor seems to be: misspelled words (e.g. "gfx", "wr", "swgl", "src") separated by slashes.

The issue doesn't always occur, but most of the times when alt-tabbing back to the opened example.

Despite being 15 years old, bug 345112#c1 describes some of the foundational ideas of the spelling checker, which could still be relevant for this issue.

Current guess: the bug is around https://searchfox.org/mozilla-central/rev/f07a609a76136ef779c65185165ff5ac513cc172/extensions/spellcheck/src/mozInlineSpellChecker.cpp#1301.

The root-cause is that when resuming the check via mozInlineSpellChecker::DoSpellCheck, the "soft text" is rebuilt.
This leads to restarting the spellchecking at the previous DOM word separator.

Edit: not restarting the spellchecking in case it's resumed could work. I guess it would have to be ensured that the DOM didn't change meanwhile. That is, restart only if not resumed because of observing any DOM events.

Depends on: 1700051
See Also: → 1699403

It's suspicious that eOpResume is never used. This seems unintended.

Fixing this requires are clearer understanding of mozInlineSpellChecker::ResumeCheck, mozInlineSpellStatus and mozInlineSpellResume.

Depends on: 1706613

Removing the call to mSoftText.Invalidate() doesn't fix the issue, because a new mozInlineSpellWordUtil instance is created every time mozInlineSpellChecker::Resume is called. That one can be invalid already when SetPositionAndEnd is called.

Depends on: 1706894

For the record: the most promising, pragmatic way forward seems to be to understand why using the whole idle period time (bug 1699403) apparently doesn't suffice to mitigate the issue.

Relevant profile: https://share.firefox.dev/3tZv0Dh.

It indicates removing ranges is slow. It's because calls of nsTextFrame::SelectionStateChanged are expensive.

nsTextFrame::SelectionStateChanged(unsigned int, unsigned int, bool, mozilla::SelectionType)
mozilla::dom::Selection::SelectFrames(nsPresContext*, nsRange*, bool) const
mozilla::dom::Selection::RemoveRangeAndUnselectFramesAndNotifyListeners(nsRange&, mozilla::ErrorResult&)
mozInlineSpellChecker::RemoveRange(mozilla::dom::Selection*, nsRange*)

This issue occurs, when the spellchecker runs out of time when checking "foo/bar/x.html". When "bar" is the last checked word, in the next time slice, it starts checking at "foo" again, removing the existing ranges for "foo" and "bar" and again adding them. Jumping back to "foo" seems necessary, because the spellchecker treats "/" differently than a blank, because the former might appear as part of a URL.

The right fix here seems to be: when jumping back, don't remove the existing ranges, but only remove them if they wouldn't be added immediately afterwards. In practice, there should always be time slices when the text doesn't change, resulting in no removing and adding of existing ranges. Therefore always proceeding in the words to check, hence not endlessly scheduling new spellcheckings.

Depends on: 1708422

I see flickering while filing a (long) GitHub issue today.

This might be related to the very odd behavior I see with spell checking temporarily underlining all the wrong words I reported in bug 1704786.

(In reply to Brian Birtles (:birtles) from comment #23)

Created attachment 9219412 [details]
Flickering while filing GitHub issue

I see flickering while filing a (long) GitHub issue today.

This might be related to the very odd behavior I see with spell checking temporarily underlining all the wrong words I reported in bug 1704786.

Thanks for mentioning that other bug, however, I guess it's not directly related to this issue.

The fix of bug 1687263 will require determining which nsRanges of a
previous time slice to keep and for which NodeOffsetRanges to create
new nsRanges. The operator will be used in a following review.

Depends on D116093

This might lead to increased exceeding of the time slice, because the
ranges are removed after the time limit was checked. Therefore, this
change should only be landed together with the following reviews which
will remove unnecessary remove- and add-operations for ranges. It's a
separate review only to simplify reviewing and potential debugging.

Depends on D116094

Removing ranges is expensive. This reduces flickering of the spelling
mistakes and prevents 100% CPU usage of 1 core on slower machines.

The essence of this patch is, that when text doesn't change, all
existing spellchecking ranges are reused.

Before this patch, removing ranges was done as part of the time slice in
mozInlineSpellChecker::SpellCheckSlice. That is, slow removals of
ranges contributed to the amount of words to be spellchecked
asynchronously.
Therefore, the amount of words to be spellchecked in one chunk could
increase from the minimum, INLINESPELL_MINIMUM_WORDS_BEFORE_TIMEOUT to
the maxium INLINESPELL_MAXIMUM_CHUNKED_WORDS_PER_TASK.
Consequently, the asynchronous checking might take longer. If that turns
out to be problematic, reducing
INLINESPELL_MAXIMUM_CHUNKED_WORDS_PER_TASK could be a fix.

Depends on D116095

Attachment #9223740 - Attachment description: WIP: Bug 1687263: part 1) Add `NodeOffset::operator==(const RangeBoundary)` → Bug 1687263: part 1) Add `NodeOffset::operator==(const RangeBoundary)`. r=smaug
Attachment #9223741 - Attachment description: WIP: Bug 1687263: part 2) Add `NodeOffsetRange::operator==(const nsRange& aRange)` → Bug 1687263: part 2) Add `NodeOffsetRange::operator==(const nsRange& aRange)`. r=smaug
Attachment #9223742 - Attachment description: WIP: Bug 1687263: part 3) Move removing ranges into `CheckWordsAndUpdateRangesForMisspellings` → Bug 1687263: part 3) Move removing ranges into `CheckWordsAndUpdateRangesForMisspellings`. r=smaug
Attachment #9223743 - Attachment description: WIP: Bug 1687263: part 4) Defer and in some cases avoid removing spellchecking-ranges → Bug 1687263: part 4) Defer and in some cases avoid removing spellchecking-ranges. r=smaug
Pushed by mbrodesser@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/904c7216cff6
part 1) Add `NodeOffset::operator==(const RangeBoundary)`. r=smaug
https://hg.mozilla.org/integration/autoland/rev/27f027e4b2f3
part 2) Add `NodeOffsetRange::operator==(const nsRange& aRange)`. r=smaug
https://hg.mozilla.org/integration/autoland/rev/e51c8fa04be5
part 3) Move removing ranges into `CheckWordsAndUpdateRangesForMisspellings`. r=smaug
https://hg.mozilla.org/integration/autoland/rev/36c7ce4d1747
part 4) Defer and in some cases avoid removing spellchecking-ranges. r=smaug
Flags: qe-verify+

Reproduced using the attachment from comment 15 with 88.0a1 (2021-03-18) on Windows 10.
Verified fixed with 91.0a1 ((2021-06-03) and 90.0b3 on Windows 10 and Ubuntu 18.04.

Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.