Closed Bug 1363176 Opened 4 years ago Closed 4 years ago

Possible infinite spell checking scheduling loop regression

Categories

(Core :: Spelling checker, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox-esr52 --- unaffected
firefox53 --- unaffected
firefox54 --- unaffected
firefox55 blocking fixed

People

(Reporter: ehsan.akhgari, Assigned: jj.evelyn)

References

Details

(Keywords: regression)

See this profile: https://perfht.ml/2qKQ6I3

Trevor hit this today, possibly on a Facebook tab from the best we could gather from his browser running.  It's hard to know what's going on here but this seems to me like an infinite scheduling loop regression that may be due to bug 1354641.  I'm specifically concerned about the changes about detecting overlapping words there.

Evelyn, can you please have a look?
okay, ni myself for tracking.
Flags: needinfo?(ehung)
I talked to Ehsan in person, and here is more information:

1. They tried to gdb and break in the spell check function, and then see it was called repeatedly. That's where the "infinite loop" speculation came from. It's also possible that in each specll check round we don't have any word to check, so (from the profile) the time was all spending on EnsureWords() instead of sending IPC for word checking.

2. At the time of problem happening, the CPU usage was in 100% high.

This issue looks similar to the test case failure in bug 1361820. I will try to reproduce the problem there and see if it has the same symptoms as this one.
See Also: → 1363391
Duplicate of this bug: 1363580
I remember that the chat window was open in the Facebook tab on Trevor's browser!  That must be the key to reproducing this!
Yes, indeed.  Here is the STR:

1. Load Facebook.
2. Click on a contact to open the chat window.
3. The CPU usage goes to 100%.

Should be simple to debug in rr.  :-)
Duplicate of this bug: 1363391
Yes, I can reproduce on Mac too. I'm debugging. Thanks.
Seems like a severe regression, marking as blocker for 55.
I ran into this bug too or a variation of it. See Bug 1363559. In my case, on Facebook, I don't need to have the chat window open. The mere presence of a facebook tab spikes the CPU usage from ~6% to 40% on my win10 laptop.
As far as I know, the problem seems to be the wordUtil used by mozInlineSpellCheck calculates the node and the position wrong when we call SetEnd and SetPosition. The 1ms timeout makes it not proceeding too far and then it falls to a trap that keeps moving its mSoftBegin/mSoftEnd back to the same node and position. I guess this will also happen when the timeout was 50ms but in a lower probability.

I need more time to look into the logic of wordUtil, so let's firstly backout my patch on bug 1354641 so people can use Nightly normally.
Flags: needinfo?(ehung)
Bug 1354641 has been backed out from mozilla-central. It will be in the next nightlies from there.
Assignee: nobody → ehung
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Duplicate of this bug: 1362256
On similar bug 1362858 I noticed this bug triggering on inline script inside a <script> tag.  Do we expect to spellcheck the contents of script tags?  If so, then that bug should be duped to this one, if not, then maybe it should be changed to be just about spellchecking scripts?
(In reply to Andrew Sutherland [:asuth] from comment #14)
> On similar bug 1362858 I noticed this bug triggering on inline script inside
> a <script> tag.  Do we expect to spellcheck the contents of script tags?  If
> so, then that bug should be duped to this one, if not, then maybe it should
> be changed to be just about spellchecking scripts?

IMO it shouldn't go into checking scripts. I've commented on bug 1362858 and let's use that bug for fixing the problem. Thanks!
See Also: → 1362858
Duplicate of this bug: 1362486
You need to log in before you can comment on or make changes to this bug.