Hangs on pages with huge (many thousands) number of links, mostly due to building large IPC messages and HashSets
Categories
(Toolkit :: Places, defect, P3)
Tracking
()
Performance Impact | low |
People
(Reporter: mayankleoboy1, Unassigned, NeedInfo)
References
()
Details
(Keywords: perf, Whiteboard: [sng-scrubbed])
Attachments
(2 files)
See profile : https://share.firefox.dev/3Rtd9mJ , https://share.firefox.dev/3ZxdfLV
I was doing some testing on a demo-type page.
Go to https://random-items.erosson.org/
Enter 1000000 as the number.
Click on roll
I get this 42 second hang (once the js stuff is over and the content is generated) . The browser froze, and I got the "Mouse cursor is spinning as the program is not responding" error message on Windows.
My guess is that because the demo page generates a lot of URLs, the browser takes a long time to check if these URLs have been visited before.
Reporter | ||
Comment 1•2 years ago
|
||
Reporter | ||
Comment 2•2 years ago
•
|
||
Actually, the parent process jank also occurs for smaller values of inputs.
Enter 100000
https://share.firefox.dev/3LAPVY2
Reporter | ||
Comment 3•2 years ago
•
|
||
And a profile with all threads and File I/O : https://share.firefox.dev/463GBE4
Shows some File I/O activity on on sqlitedb:places:sqlite
Comment 4•2 years ago
|
||
Afaict this seems due to a large number of runnables dispatched across threads.
Comment 5•2 years ago
|
||
(In reply to Marco Bonardo [:mak] from comment #4)
Afaict this seems due to a large number of runnables dispatched across threads.
In particular, I see Document::FlushPendingLinkUpdates doing a lot of work, and if we look at a diagram of calls to BaseHistory::ScheduleVisitedQuery that's on the places stack we can see that FlushPendingLinkUpdates can be causally responsible for that.
It seems like DOM and/or Places should handle this better, if only by entering an overload mode that stops doing link visited requests for a page.
Given the amount of time spent in layout, I wonder if part of the problem could be massive churn in the DOM such that a VDOM implementation is not providing continuity to DOM elements and so there might be an additional multiplicative factor.
Comment 6•2 years ago
|
||
This will probably need input from Emilio and the DOM team.
Updated•2 years ago
|
Updated•2 years ago
|
Comment 7•2 years ago
|
||
cannot reproduce on linux, 118. (granted 64GB ram). But similar configs show on mac.
Updated•2 years ago
|
Reporter | ||
Comment 8•1 year ago
•
|
||
I am able to reproduce similar hang on an hg.m.o URL as well.
URL to repro: https://hg.mozilla.org/try/rev/ec8f980f95c16f401eb49f12d315d4795190d452
Corresponding profile: https://share.firefox.dev/47qE8oc
Comment 9•1 year ago
|
||
(In reply to Mayank Bansal from comment #8)
I am able to reproduce similar hang on an hg.m.o URL as well.
URL to repro: https://hg.mozilla.org/try/rev/ec8f980f95c16f401eb49f12d315d4795190d452
Corresponding profile: https://share.firefox.dev/47qE8oc
This shows a 20s hang but almost all the samples are in profiler code itself. Apparently https://share.firefox.dev/3QMQJe7 has profiler-only code that tracks mDispatchTimes
but calling Count()
appears to be slow. Though I also don't understand why there are only ~1400 samples when the sample interval is 1ms and there's a 23 second hang. Florian, who could look into this / is there some way to get a more useful profile out of this?
(Reporter, I assume the hang happens even if you disable the profiler?)
Reporter | ||
Comment 10•1 year ago
•
|
||
(In reply to :Gijs (he/him) from comment #9)
(Reporter, I assume the hang happens even if you disable the profiler?)
Now that I try again, with the profiler disabled, the hang is much less compared to when the profiler is enabled.
Though I also don't understand why there are only ~1400 samples when the sample interval is 1ms and there's a 23 second hang
AFAIK, on Windows the profiler is actually sampling at 2ms frequency when it is set at 1ms in the Profiler UI. So the number of samples are half of actual time spent.
Another profile if the previous one was not useful : https://share.firefox.dev/47DeZ9g
Comment 11•1 year ago
|
||
(In reply to :Gijs (he/him) from comment #9)
This shows a 20s hang but almost all the samples are in profiler code itself.
This is the event delay code that have been planning to remove for a while. Moving the needinfo to Markus to clarify what the next steps are to go ahead and remove this code.
Though I also don't understand why there are only ~1400 samples when the sample interval is 1ms and there's a 23 second hang.
On Windows the default timer resolution is 1/64s (ie 15.6ms). If we request high precision timers (which the profiler does, when the requested sampling interval is smaller than 10ms), we can sample up to every 2ms. In https://share.firefox.dev/3QSkudC we have high resolution timers for the profiler sampler until 30.18s, and after that it goes back to low resolution timers. It's not the first time I see this happening in a profile, but I don't have an explanation for why it happens. Maybe some other code does mismatched timeBeginPeriod/timeEndPeriod calls.
Updated•1 year ago
|
Reporter | ||
Comment 12•9 months ago
|
||
The link no longer loads.
Reporter | ||
Comment 13•9 months ago
|
||
This testcase will reproduce the hang. The duration of hang increases with the increase in the loopcount.
Reporter | ||
Comment 14•9 months ago
•
|
||
Profile with the attached testcase: https://share.firefox.dev/3yfFB47
Edit: On second look, the profile in parent-process looks to be heavily skewed due to the profiler issue described above.
Reporter | ||
Comment 15•8 months ago
|
||
An alternate msthod would be to go to : https://wolcendb.erosson.org/loot?tier=0
Let it load once, then close the tab. Then go to the link again.
Profile: https://share.firefox.dev/3MzVoyt
Reporter | ||
Updated•8 months ago
|
Reporter | ||
Comment 16•6 months ago
•
|
||
Massive improvements.
- Profile from latest Nightly (containing the fix from bug 1594368)
- Profile of the attached testcase: https://share.firefox.dev/3YHsWBC
- Profile of opening the URL in comment #15: https://share.firefox.dev/3YnQTfK
- Profile of opening the hg.mo link from comment #8: https://share.firefox.dev/3YjuTCK
Comment 17•6 months ago
|
||
Based on a profile, I think the remaining cost is mostly about preparing the IPC messages.
I wonder if we could in the return message just include URIs whose state changed, to build a much smaller one.
That said, I think the most compelling part is done.
Leaving this open to evaluate shrinking the IPC data exchange in the future.
Description
•