Gloda is getting in the way of typing messages, slow because of jank from synchronous GC
Categories
(MailNews Core :: Database, defect)
Tracking
(thunderbird_esr68 affected, thunderbird75 wontfix)
People
(Reporter: florian, Assigned: mkmelin)
References
Details
(Keywords: perf)
Attachments
(1 file, 1 obsolete file)
9.30 KB,
patch
|
florian
:
review+
|
Details | Diff | Splinter Review |
Here is a profile I just captured: https://perfht.ml/30l4oxv
I was attempting to type IRC messages and the characters weren't appearing at the rate at which I was typing them.
The profile shows at least:
- 378ms spent in do_createDocumentEncoder. Bug 1535133 covers this already.
- gloda triggers a forced GC at https://searchfox.org/comm-central/rev/d55fb7f2405fd4e07cb267851340626a63a8c270/mailnews/db/gloda/modules/utils.js#144 causing a 667ms synchronous GC (and I strongly suspect this triggers the 500+ms janky cycle collector activity a little later.
Comment 1•5 years ago
|
||
I've also noticed this - can be nasty at times. And your profile shows a total of almost 1 second of jank.
Is Bug 1535133 nearing completion?
Reporter | ||
Comment 2•5 years ago
|
||
(In reply to Florian Quèze [:florian] (PTO until September 2nd) from comment #0)
- gloda triggers a forced GC at https://searchfox.org/comm-central/rev/d55fb7f2405fd4e07cb267851340626a63a8c270/mailnews/db/gloda/modules/utils.js#144 causing a 667ms synchronous GC (and I strongly suspect this triggers the 500+ms janky cycle collector activity a little later.
I saw this again, but this time I wouldn't say it's just jank, gloda's forced GC caused a 27.7s hang, followed by 9.2s of cycle collection: https://perfht.ml/30SRh6W And this was followed by another slow GC just a few seconds later: https://perfht.ml/30RTRdt
My Thunderbird instance isn't using a crazy large amount of memory (about 2GB). It's larger than I would like, but smaller than other Firefox processes that remain responsive while GC'ing.
Comment 3•5 years ago
|
||
I saw this again, but this time I wouldn't say it's just jank, gloda's forced GC caused a 27.7s hang, followed by 9.2s of cycle collection: https://perfht.ml/30SRh6W And this was followed by another slow GC just a few seconds later: https://perfht.ml/30RTRdt
That's pretty bad. Any indicatation this behavior new to version 68? What is next step?
Assignee | ||
Comment 4•5 years ago
|
||
Maybe we should follow up with a Cu.forceCC(); too after Cu.forceGC(); Perhaps Cu.forceShrinkingGC(); too?
Seems like the usual pattern is to call Cu.forceGC(); Cu.forceCC(); (and sometimes Cu.forceShrinkingGC();)
Reporter | ||
Comment 5•5 years ago
|
||
Except when doing automated tests related to memory use or that may fail if there's a leak, we should never force a GC. The current behavior blocks the UI for long periods of time; not forcing the GC would let the GC do its work in smaller chunks and shouldn't block the UI much (if at all) unless we have a significant memory leak.
Comment 6•5 years ago
|
||
FWIW, GC code is constantly improving, if slowly - https://mzl.la/2DQpaeR
Reporter | ||
Comment 7•5 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #6)
FWIW, GC code is constantly improving, if slowly - https://mzl.la/2DQpaeR
This sounds like you agree that a hack introduced in 2009 to workaround a GC bug is probably no longer needed, right?
Reporter | ||
Comment 8•5 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #3)
Any indicatation this behavior new to version 68?
It's not a recent regression, no.
What is next step?
Someone deciding if it's fine to entirely remove the forced garbage collections in Gloda, or if we need to invest time into making the hack not suck so much for chat messages. The hack is meant to force a GC every 4096 indexed emails, but is also forcing a GC whenever indexing finishes. When using chat, each new chat message triggers a need for re-indexing the current conversation, which happens quickly, and finishes, causing the forced GC to potentially happen (in the worse case) for every chat message exchanged.
Comment 9•5 years ago
|
||
(In reply to Florian Quèze [:florian] from comment #7)
(In reply to Wayne Mery (:wsmwk) from comment #6)
FWIW, GC code is constantly improving, if slowly - https://mzl.la/2DQpaeR
This sounds like you agree that a hack introduced in 2009 to workaround a GC bug is probably no longer needed, right?
Note, the kludge is discussed in bug 465618 comment 23, and implemented in that bug. And 2 months prior a performance patch landed in Bug 465353 - general sluggish, lack of responsiveness during indexing of gloda.
"Probably" may be too strong a word because skepticism is warranted - Core folks don't explicitly benchmark our usecase. I'd use the word hope, because GC is surely better on average today than it was 12 years ago. The theory of whether it is "better enough" can only be tested by removing the code and running some benchmarks (do we still have them?) and having some users test.
Magnus, which do you want to go for first - your comment 4, removing the kludge done in bug 465618, or bug 1535133?
Assignee | ||
Comment 10•5 years ago
|
||
Let's remove the manual gc calls and hope other work has made the calls unnecessary.
Florian, do you want to review?
Reporter | ||
Comment 11•5 years ago
|
||
Assignee | ||
Comment 12•5 years ago
|
||
Thanks, removed the last caller.
Reporter | ||
Comment 13•5 years ago
|
||
Comment 14•5 years ago
|
||
Great progress. This is a major backend change, so we should not uplift to beta but rather let it ride the train for longer testing with fewer users.
Perhaps have some chat and regular mail user specifically test a nightly, both with existing and new profile (to download and index lots of mail), to see to what extent it helps/hurts bug 1535133 and "normal usage".
Assignee | ||
Updated•5 years ago
|
Comment 15•5 years ago
|
||
Pushed by thunderbird@calypsoblue.org:
https://hg.mozilla.org/comm-central/rev/58f8cf2d8089
remove gloda manual calls to garbage collection, which were causing lockup a times. r=florian
Updated•5 years ago
|
Comment 16•5 years ago
|
||
Description
•