Facebook spends most of its time in `get PerformanceResourceTiming` functions.
Categories
(Core :: Security, defect, P2)
People
(Reporter: nbp, Unassigned)
References
(Depends on 1 open bug, Blocks 1 open bug)
Attachments
(3 obsolete files)
While investigating Bug 1674295, which can be reproduced by:
- Create an Ubuntu Server 20.10 VM
- Install lightdm, i3, and Firefox Nightly from the Mozilla website.
- Start Firefox.
- Log in to Facebook.
- Scroll far down.
- Check about:performance to see a high CPU impact, with a slow memory increase.
I was able to collect the following profile: https://share.firefox.dev/2MZ85qQ
This profile mostly reports what seems to be an infinite loop calling get PerformanceResourceTiming.responseEnd
and get PerformanceResourceTiming.requestStart,
each of which seems to take a few milliseconds (likely amplified by some Gecko profiler instrumentation).
This investigation is in the context of a slow and steady memory leak (Bug 1674295).
Comment 1•4 years ago
This is basically https://bugzilla.mozilla.org/show_bug.cgi?id=1655683 I think.
Comment 2•4 years ago
(In reply to Nicolas B. Pierron [:nbp] from comment #0)
I was able to collect the following profile: https://share.firefox.dev/2MZ85qQ
This profile mostly reports what seems to be an infinite loop calling
get PerformanceResourceTiming.responseEnd
and get PerformanceResourceTiming.requestStart,
each of which seems to take a few milliseconds (likely amplified by some Gecko profiler instrumentation)
Most of the time there is actually overhead from the profiler's native allocation recording. https://share.firefox.dev/35Hzjc2 shows what remains (705 samples out of 3355 samples that call PerformanceResourceTiming.*) when excluding the profiler functions related to allocation tracking.
Comment 3 (Reporter)•4 years ago
The same happens when I disable the profiler's native allocation recording; the functions still take 1 ms or more each:
https://share.firefox.dev/3nOecv1
Comment 4•4 years ago
I've seen a similar report internally, triggered by the same code. We're experimenting with reducing the cap on performance entries, which should help. How many performance entries do you have when this problem triggers? performance.getEntries().length
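As a sketch of that diagnostic (the entry counts and the mock performance object below are made up, since this report has no live page attached), buffer occupancy can be checked from script like this:

```javascript
// Sketch: count total performance entries and resource entries, as asked
// above. In a page one would pass the real `performance` object.
function resourceTimingStats(perf) {
  const entries = perf.getEntries ? perf.getEntries() : [];
  const resources = entries.filter(e => e.entryType === "resource");
  return { total: entries.length, resources: resources.length };
}

// Mocked performance-like object standing in for a live page.
const mockPerf = {
  getEntries: () => [
    { entryType: "navigation" },
    { entryType: "resource" },
    { entryType: "resource" },
  ],
};
console.log(resourceTimingStats(mockPerf)); // { total: 3, resources: 2 }
```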
Comment 5•4 years ago
The code in question does performance.getEntriesByName(url, 'resource').reduce(...)
and it looks for the max entry by responseEnd that matches a url. getEntriesByName isn't spec'ed to be sorted by time so an exhaustive linear search is correct. Looks like it's used by the video and/or live streaming code.
It looks like work is done lazily on every access of these fields. Is that an appropriate implementation?
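The exhaustive linear search can be illustrated with a mocked entry buffer (the names, counts, and the 250-entry cap below are illustrative; 250 is the spec's recommended default buffer size): every getEntriesByName call walks the whole buffer, and the reduce then touches the timing getters of every match.

```javascript
// Mocked resource-timing buffer with 250 entries across 10 distinct URLs.
const buffer = [];
for (let i = 0; i < 250; i++) {
  buffer.push({ name: `https://cdn.example/chunk${i % 10}`, entryType: "resource", responseEnd: i });
}

// What getEntriesByName(name, type) conceptually does: an O(buffer) filter.
function getEntriesByName(name, type) {
  return buffer.filter(e => e.name === name && e.entryType === type);
}

// The page's reduce then reads responseEnd on every returned entry.
const matches = getEntriesByName("https://cdn.example/chunk3", "resource");
const latest = matches.reduce((a, b) => (a && a.responseEnd >= b.responseEnd ? a : b), null);
console.log(matches.length, latest.responseEnd); // 25 243
```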
Comment 6•4 years ago
The implementation isn't that unreasonable when the default buffer size is used.
One could also keep separate lists of entries in a hashtable keyed on the name, and then the filtering would happen on type only,
or vice versa. But there would still be O(n) behavior.
Caching the most recent result, and possibly explicitly updating it when new entries match the filters it uses, might work in some cases, but not in general.
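A minimal sketch of the name-keyed hashtable idea (illustrative only; the real Gecko data structures differ): lookups become proportional to the number of matches rather than the buffer size, at the cost of maintaining the extra index on insertion.

```javascript
// Index entries by name so getEntriesByName only scans the matching bucket.
class EntryIndex {
  constructor() { this.byName = new Map(); }
  add(entry) {
    let bucket = this.byName.get(entry.name);
    if (!bucket) { bucket = []; this.byName.set(entry.name, bucket); }
    bucket.push(entry);
  }
  getEntriesByName(name, type) {
    const bucket = this.byName.get(name) || [];
    // Only the type filter remains: O(matches) instead of O(buffer).
    return type ? bucket.filter(e => e.entryType === type) : bucket.slice();
  }
}

const index = new EntryIndex();
index.add({ name: "a.mp4", entryType: "resource", responseEnd: 1 });
index.add({ name: "a.mp4", entryType: "resource", responseEnd: 2 });
index.add({ name: "b.mp4", entryType: "resource", responseEnd: 3 });
console.log(index.getEntriesByName("a.mp4", "resource").length); // 2
```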
Comment 7•4 years ago
The entries should definitely use atoms for name and entryType. That alone would speed things up quite a bit.
Comment 8•4 years ago
But somehow the profile in comment 3 enters the very slowest paths of bug 1655683.
Comment 9 (Reporter)•4 years ago
(In reply to Benoit Girard (:BenWa) from comment #4)
I've seen a similar report internally, triggered by the same code. We're experimenting with reducing the cap on performance entries, which should help. How many performance entries do you have when this problem triggers?
performance.getEntries().length
Looking at the profiles and searching for getEntries reports all the getEntriesByName calls, and the stack chart view gives us some temporal hint of the number of calls:
Comment 10•4 years ago
I can't reproduce the responseEnd loop locally, including scrolling past/stopping on videos. It sounds like if they are looking at a bunch of timestamps of different times in sequence, then we will be thrashing the LRU cache of jitter seeds which are SHA-256 hashes. It will cause us to do a bunch of SHA-256 hashes (and memory allocations).
The comment here is the same set of steps I'd recommend to pinpoint this if one can reproduce it.
To resolve it, depending on the cause, we could remove the lock, replace SHA-256 with something faster but less secure, or look at disabling jitter entirely.
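The seed-cache thrashing described above can be illustrated with a toy LRU (the real implementation computes SHA-256 hashes over clamped timestamps; this sketch just counts how often that expensive step would run, with made-up capacity and timestamp values): once the working set of distinct timestamps exceeds the capacity, every access misses.

```javascript
// Toy LRU cache of "jitter seeds". A miss stands in for an expensive
// SHA-256 computation (plus allocations).
class SeedCache {
  constructor(capacity) { this.capacity = capacity; this.map = new Map(); this.misses = 0; }
  seedFor(clampedTime) {
    if (this.map.has(clampedTime)) {
      const v = this.map.get(clampedTime);
      this.map.delete(clampedTime); this.map.set(clampedTime, v); // refresh LRU order
      return v;
    }
    this.misses++; // would compute SHA-256 here
    if (this.map.size >= this.capacity) {
      this.map.delete(this.map.keys().next().value); // evict least recently used
    }
    const v = Math.random();
    this.map.set(clampedTime, v);
    return v;
  }
}

const cache = new SeedCache(4);
// Reading responseEnd/requestStart of many entries yields many distinct
// clamped timestamps, cycling through more values than the cache holds.
for (let pass = 0; pass < 3; pass++) {
  for (let t = 0; t < 10; t++) cache.seedFor(t);
}
console.log(cache.misses); // 30: every single access misses
```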
Comment 11 (Reporter)•4 years ago
(In reply to Tom Ritter [:tjr] (ni? for response to sec-[advisories/bounties/ratings/cves]) from comment #10)
The comment here is the same set of steps I'd recommend to pinpoint this if one can reproduce it.
I will test in an hour.
Comment 12 (Reporter)•4 years ago
(In reply to Tom Ritter [:tjr] (ni? for response to sec-[advisories/bounties/ratings/cves]) from comment #10)
I can't reproduce the responseEnd loop locally, including scrolling past/stopping on videos. It sounds like if they are looking at a bunch of timestamps of different times in sequence, then we will be thrashing the LRU cache of jitter seeds which are SHA-256 hashes. It will cause us to do a bunch of SHA-256 hashes (and memory allocations).
The comment here is the same set of steps I'd recommend to pinpoint this if one can reproduce it.
Testing with:
- privacy.resistFingerprinting.reduceTimerPrecision.jitter set to false (it was set to true before)
- privacy.reduceTimerPrecision set to true
does not reproduce the excessive time under the get PerformanceResourceTiming
functions; it in fact disappears from the profiles:
https://share.firefox.dev/3sIrrB7
The loop remains and leaks even faster now (Bug 1674295).
Comment 13•4 years ago
(In reply to Benoit Girard (:BenWa) from comment #5)
The code in question does
performance.getEntriesByName(url, 'resource').reduce(...)
and it looks for the max entry by responseEnd that matches a url.
Is it possible that the reduce callback function is accessing responseEnd even for entries which don't match the URL?
Comment 14•4 years ago
Looking at the network chart in the profiles, it seems that every stream chunk has a different URL. And yet, responseEnd seems to be called many times per chunk.
Comment 15•4 years ago
(In reply to Markus Stange [:mstange] from comment #13)
Is it possible that the reduce callback function is accessing responseEnd even for entries which don't match the URL?
Oh, nevermind, that would only be possible if getEntriesByName returned entries that don't match the URL.
I think we need to find out how many calls we get per network request, by adding logging or profiler markers in getEntriesByName (log URL + number of results) and in GetResponseEnd.
Comment 16•4 years ago
(In reply to Nicolas B. Pierron [:nbp] from comment #9)
Looking at the profiles and searching for getEntries reports all the getEntriesByName calls, and the stack chart view gives us some temporal hint of the number of calls:
I'm not exactly sure what you're describing here, but it seems like you're trying to get the number of calls, which is impossible to deduce from sampling-based data.
Comment 17•4 years ago
(In reply to Olli Pettay [:smaug] from comment #8)
But somehow profile in comment 3 enters the very slowest paths of bug 1655683.
Would it make sense to reduce the timer precision only once per entry? At the moment we do it on every field access:
https://searchfox.org/mozilla-central/rev/d5e98892f9553f9113e69c585308008e681e19c9/dom/performance/PerformanceTiming.cpp#555
Maybe we should do it during construction, and store the DOMHighResTimeStamp instead of the TimeStamp on PerformanceTimingData.
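The eager-versus-lazy suggestion can be sketched as follows (the function and object names are illustrative, not the actual Gecko classes; the precision value is made up): compute the reduced-precision value once when the entry is created, rather than on each getter call.

```javascript
// Stand-in for the expensive clamp step done on each field access today; the
// real code reduces timer precision (and may add jitter) per the privacy prefs.
const precisionMs = 1; // illustrative precision
let clampCalls = 0;
function reduceTimerPrecision(ms) {
  clampCalls++;
  return Math.floor(ms / precisionMs) * precisionMs;
}

// Lazy variant: clamps on every access (the current behavior, per comment 17).
function makeLazyEntry(rawResponseEnd) {
  return { get responseEnd() { return reduceTimerPrecision(rawResponseEnd); } };
}

// Eager variant: clamps once at construction and stores the clamped value.
function makeEagerEntry(rawResponseEnd) {
  const responseEnd = reduceTimerPrecision(rawResponseEnd);
  return { get responseEnd() { return responseEnd; } };
}

const lazy = makeLazyEntry(123.456);
const eager = makeEagerEntry(123.456);
clampCalls = 0;
for (let i = 0; i < 1000; i++) { lazy.responseEnd; }
console.log(clampCalls); // 1000 clamps for the lazy entry
clampCalls = 0;
for (let i = 0; i < 1000; i++) { eager.responseEnd; }
console.log(clampCalls); // 0: the eager entry paid once, at construction
```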
Comment 18•4 years ago
That sounds very reasonable. And if we need the higher-accuracy values for something, also store them, but don't expose them to the web.
But isn't some of that already happening, like https://searchfox.org/mozilla-central/rev/d5e98892f9553f9113e69c585308008e681e19c9/dom/performance/PerformanceResourceTiming.cpp#43? I guess we need it in more places. Cache, or store only DOMHighResTimeStamp.
Comment 19 (Reporter)•4 years ago
(In reply to Markus Stange [:mstange] from comment #15)
(In reply to Markus Stange [:mstange] from comment #13)
Is it possible that the reduce callback function is accessing responseEnd even for entries which don't match the URL?
Oh, nevermind, that would only be possible if getEntriesByName returned entries that don't match the URL.
I think we need to find out how many calls we get per network request, by adding logging or profiler markers in getEntriesByName (log URL + number of results) and in GetResponseEnd.
The VM on which I am testing is not capable of building Firefox, and I suspect that installing mozilla-central build dependencies would make the bug disappear (by installing some missing libraries).
I can try it if you make a try-push that I can download, or if you manage to give me a BPF file which has the uprobe instrumentation (if someone can tell me how to unstrip firefox binaries from the crash symbols…).
(In reply to Markus Stange [:mstange] from comment #16)
(In reply to Nicolas B. Pierron [:nbp] from comment #9)
Looking at the profiles and searching for
getEntries
report all thegetEntriesByName
calls, and the stack chart view gives us some temporal hint of the number of calls:I'm not exactly sure what you're describing here, but it seems like you're trying to get the number of calls, which is impossible to deduce from sampling-based data.
Yes, this way of counting only gives us a lower-bound estimate of the number of calls, knowing that we have no coroutines in Firefox, as long as these are not generators/async functions (are they?).
Comment 20•4 years ago
(In reply to Nicolas B. Pierron [:nbp] from comment #19)
I can try it if you make a try-push that I can download
I've started a try build here: https://treeherder.mozilla.org/jobs?repo=try&revision=744e02d1bd61dd24b283d9a1b3ddba383dfef7aa
Yes, this way of counting only gives us a lower bound estimate of the number of calls
Oh, I see. Yes, it does let you get a lower bound.
Comment 21•4 years ago
Oh, just for reference, here's the "prettified" code from the facebook bundle:
function getLatestResourceTimingEntry(name) {
if (!performance.getEntriesByName) return null;
var entries = performance.getEntriesByName(name, "resource");
return entries.reduce(function (a, b) {
if (a == null) return b;
return typeof b.responseEnd === "number" &&
typeof b.responseStart === "number" &&
typeof b.startTime === "number" &&
typeof a.responseEnd === "number" &&
typeof a.responseStart === "number" &&
typeof a.requestStart === "number" &&
typeof a.startTime === "number"
? b.responseEnd < a.responseEnd
? a
: b
: null;
}, null);
}
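For reference, feeding that logic a mocked entry list shows it simply keeps the entry with the largest responseEnd. This is a condensed, parameterized variant of the bundle code (the mock performance object and URLs are made up, since the transcript has no live page):

```javascript
// Condensed variant of the bundle's reduce, taking the performance object as
// a parameter so it can be driven with a mock.
function getLatestResourceTimingEntry(perf, name) {
  if (!perf.getEntriesByName) return null;
  const entries = perf.getEntriesByName(name, "resource");
  return entries.reduce((a, b) => {
    if (a == null) return b;
    return typeof b.responseEnd === "number" && typeof a.responseEnd === "number"
      ? (b.responseEnd < a.responseEnd ? a : b)
      : null;
  }, null);
}

// Mocked entries for one URL, as the video code would see after several loads.
const mockPerformance = {
  getEntriesByName: (name, type) => [
    { name, entryType: type, responseEnd: 100.5 },
    { name, entryType: type, responseEnd: 350.2 },
    { name, entryType: type, responseEnd: 220.9 },
  ],
};
const latestEntry = getLatestResourceTimingEntry(mockPerformance, "https://cdn.example/a.mp4");
console.log(latestEntry.responseEnd); // 350.2
```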
Comment 22•4 years ago
(In reply to Markus Stange [:mstange] from comment #20)
I've started a try build here: https://treeherder.mozilla.org/jobs?repo=try&revision=744e02d1bd61dd24b283d9a1b3ddba383dfef7aa
Nicolas, can you reproduce the problem again in this try build and grab a profile?
Comment 23•4 years ago
I've written an experimental patch to do the timestamp resolution work eagerly, for all fields, even if none of the performance APIs are called. I'm not sure if that's a good idea but I'll put the patches up for reference.
Comment 24•4 years ago
Comment 25•4 years ago
Depends on D103006
Comment 26•4 years ago
Depends on D103007
Comment 27•4 years ago
Comment 28 (Reporter)•4 years ago
(In reply to Markus Stange [:mstange] from comment #22)
(In reply to Markus Stange [:mstange] from comment #20)
I've started a try build here: https://treeherder.mozilla.org/jobs?repo=try&revision=744e02d1bd61dd24b283d9a1b3ddba383dfef7aa
Nicolas, can you reproduce the problem again in this try build and grab a profile?
The profile made with this try build is available at https://share.firefox.dev/3iUcQOn
Comment 29•4 years ago
(In reply to Nicolas B. Pierron [:nbp] from comment #28)
The profile made with this try build is available at https://share.firefox.dev/3iUcQOn
Thanks! In this profile we're spending much less time in Facebook's getLatestResourceTimingEntry function, probably because Facebook has reduced the size of the resource timing buffer in the meantime.
You can see lots of resourcetimingbufferfull events being fired: https://share.firefox.dev/2KVVWm4
Moreover, we now have between 25 and 30 entries for each URL in the buffer. That's still a lot! But likely a lot less than in comment 3.
It indeed looks like Facebook requests the same URLs over and over again. Here's the network chart: https://share.firefox.dev/2NDSfCt
On the unfiltered marker chart (https://share.firefox.dev/2NDU0zz), you can execute the following code in the web console to get a breakdown by URL:
let loadUrlList = filteredMarkers.filter(m => m.name.startsWith("Load ")).map(m => m.data.URI);
let getEntryUrlList = filteredMarkers.filter(m => m.name === "GetEntriesByName").map(m => m.data.name);
let urlSet = new Set(loadUrlList.concat(getEntryUrlList));
[...urlSet.values()].map(url => `Loaded ${loadUrlList.filter(u => u == url).length} times, getEntriesByName ${getEntryUrlList.filter(u => u == url).length} times: ${url}`).join("\n")
This gives:
Loaded 341 times, getEntriesByName 682 times: https://video-cdt1-1.xx.fbcdn.net/v/t42.1790-2/138786069_210069640784604_7754146941036574614_n.mp4?_nc_cat=110&ccb=2&_nc_sid=5aebc0&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfYXVkaW9fYWFjcF82NF9mcmFnXzJfYXVkaW8ifQ%3D%3D&_nc_ohc=sUc_JHGqWScAX_oLvac&_nc_ht=video-cdt1-1.xx&oh=e184e3e87c1cec7665ad84ec0452853a&oe=60101148&bytestart=82483&byteend=83654
Loaded 282 times, getEntriesByName 846 times: https://video-cdg2-1.xx.fbcdn.net/v/t39.25447-2/140846455_245865773926867_8619276549445379178_n.mp4?_nc_cat=108&ccb=2&_nc_sid=5aebc0&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfdjRfNXNlY2dvcF9ocTVfZnJhZ18yX3ZpZGVvIn0%3D&_nc_ohc=HOBUcEQlJwwAX-X_zWb&_nc_ht=video-cdg2-1.xx&oh=05ccc735d463fdef51962f97f5a3262d&oe=60347A0A&bytestart=477058&byteend=675982
Loaded 341 times, getEntriesByName 680 times: https://video-cdg2-1.xx.fbcdn.net/v/t42.1790-2/140721819_1063062194157692_801491828212844150_n.mp4?_nc_cat=1&ccb=2&_nc_sid=5aebc0&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfYXVkaW9fYWFjcF82NF9mcmFnXzJfYXVkaW8ifQ%3D%3D&_nc_ohc=Yk5oKPMOKwMAX94i15y&_nc_ht=video-cdg2-1.xx&oh=32b7b424a728186b3ef412c068fbbb55&oe=60100FAA&bytestart=50108&byteend=51263
Loaded 282 times, getEntriesByName 846 times: https://video-cdt1-1.xx.fbcdn.net/v/t39.25447-2/143164686_1945404802274609_6497816041632441805_n.mp4?_nc_cat=101&ccb=2&_nc_sid=5aebc0&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfdjRfNXNlY2dvcF9ocTFfZnJhZ18yX3ZpZGVvIn0%3D&_nc_ohc=WvzbpuK4vX4AX_TLJ9Y&_nc_ht=video-cdt1-1.xx&oh=d673838b57bb1826b3e59b92d71c8716&oe=6036A61A&bytestart=451118&byteend=555248
Loaded 340 times, getEntriesByName 680 times: https://video-cdg2-1.xx.fbcdn.net/v/t42.1790-2/132891872_3396079503847270_6185489711801712447_n.mp4?_nc_cat=1&ccb=2&_nc_sid=5aebc0&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfYXVkaW9fYWFjcF82NF9mcmFnXzJfYXVkaW8ifQ%3D%3D&_nc_ohc=gN7uBOoJykUAX8T0Odn&_nc_ht=video-cdg2-1.xx&oh=d6ac64db567c09da5a0ba39dafc9a770&oe=6010180F&bytestart=66485&byteend=75767
Loaded 282 times, getEntriesByName 846 times: https://video-cdg2-1.xx.fbcdn.net/v/t39.25447-2/135241769_402607857487670_748316234532690871_n.mp4?_nc_cat=104&ccb=2&_nc_sid=5aebc0&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfdjRfNXNlY2dvcF9ocTFfZnJhZ18yX3ZpZGVvIn0%3D&_nc_ohc=4m83SaqjXCMAX_-k8e1&_nc_ht=video-cdg2-1.xx&oh=321cc688df865419acf0957846280296&oe=6033D33F&bytestart=251851&byteend=320829
Loaded 340 times, getEntriesByName 680 times: https://video-cdg2-1.xx.fbcdn.net/v/t42.1790-2/137594403_736319993929240_874913058885026076_n.mp4?_nc_cat=1&ccb=2&_nc_sid=5aebc0&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfYXVkaW9fYWFjcF82NF9mcmFnXzJfYXVkaW8ifQ%3D%3D&_nc_ohc=C-iTSoxbhqMAX8BYnq-&_nc_ht=video-cdg2-1.xx&oh=16331f01ee37393b9af46db746131a93&oe=601018E1&bytestart=115821&byteend=124431
Loaded 281 times, getEntriesByName 843 times: https://video-cdg2-1.xx.fbcdn.net/v/t39.25447-2/138627024_151850170041264_2957618579105685603_n.mp4?_nc_cat=102&ccb=2&_nc_sid=5aebc0&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfdjRfNXNlY2dvcF9ocTVfZnJhZ18yX3ZpZGVvIn0%3D&_nc_ohc=fvzSwSea1aUAX_DxUhc&_nc_ht=video-cdg2-1.xx&oh=1180695c22721bb0a617b2249d6e5fc7&oe=603664A3&bytestart=2150726&byteend=2983886
Loaded 340 times, getEntriesByName 682 times: https://video-cdg2-1.xx.fbcdn.net/v/t42.1790-2/138432457_772652486998580_7882847971516699134_n.mp4?_nc_cat=1&ccb=2&_nc_sid=5aebc0&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfYXVkaW9fYWFjcF82NF9mcmFnXzJfYXVkaW8ifQ%3D%3D&_nc_ohc=ArBoDLSyBiIAX8gfCAu&_nc_ht=video-cdg2-1.xx&oh=c9216545b42c4f17a508117185ea699e&oe=60100E16&bytestart=82483&byteend=83654
Loaded 281 times, getEntriesByName 843 times: https://video-cdt1-1.xx.fbcdn.net/v/t39.25447-2/140610631_226298955711491_3545714115869898952_n.mp4?_nc_cat=106&ccb=2&_nc_sid=5aebc0&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfdjRfNXNlY2dvcF9ocTVfZnJhZ18yX3ZpZGVvIn0%3D&_nc_ohc=cbu8TMrePBUAX96lGlD&_nc_ht=video-cdt1-1.xx&oh=a1fc0206d88ded792f8fd941f1451519&oe=6034244F&bytestart=153246&byteend=324140
Loaded 281 times, getEntriesByName 845 times: https://video-cdt1-1.xx.fbcdn.net/v/t39.25447-2/140251749_146227323977327_631208250021167542_n.mp4?_nc_cat=101&ccb=2&_nc_sid=5aebc0&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfdjRfNXNlY2dvcF9ocTVfZnJhZ18yX3ZpZGVvIn0%3D&_nc_ohc=KKgfgYZYVaIAX8LTAg2&_nc_ht=video-cdt1-1.xx&oh=b2bbad97ff449e09b66ef1577be3c580&oe=6034BFFC&bytestart=164301&byteend=337532
Loaded 1 times, getEntriesByName 0 times: https://www.facebook.com/ajax/bnzai?__a=1&__beoa=0&__ccg=EXCELLENT&__comet_req=1&__hsi=6922016835067559057-0&__pc=EXP1%3Acomet_pkg&__req=4h&__rev=1003221632&__s=tb5u9c%3As9srfk%3A7uum4d&__spin_b=trunk&__spin_r=1003221632&__spin_t=1611657635&__user=723363057&dpr=1&fb_dtsg=AQG0HfUkvT5o%3AAQFk28bL_ICd&jazoest=21993&ph=C3
Loaded 2 times, getEntriesByName 0 times: https://www.facebook.com/ajax/webstorage/process_keys/?state=0
Benoit, are the repeated loads expected?
Comment 30•4 years ago
Comment on attachment 9199178 [details]
Bug 1686930 - Enforce and document that mTimingData is always non-null.
Revision D103007 was moved to bug 1688941. Setting attachment 9199178 [details] to obsolete.
Comment 31•4 years ago
Comment on attachment 9199179 [details]
Bug 1686930 - Convert PerformanceResourceTiming timestamps eagerly.
Revision D103008 was moved to bug 1688943. Setting attachment 9199179 [details] to obsolete.
Comment 32•4 years ago
Canceling Benoit's needinfo again - I just saw that bug 1674295 comment 40 has basically the same question on the same topic, and Benoit is already needinfo'd over there.
Comment 33•4 years ago
Removing [qf] since this is now more like a meta bug, and other bugs which this depends on are being fixed.
Comment 34•2 years ago
This doesn't look like an S2 bug, moving to S3. Feel free to revert if you disagree.
Florian, is this interesting from a power usage perspective?
Comment 35•1 year ago
(In reply to Marco Castelluccio [:marco] from comment #34)
This doesn't look like an S2 bug, moving to S3. Feel free to revert if you disagree.
Florian, is this interesting from a power usage perspective?
It has power-use implications (as any bug causing excessive CPU use does), but I think it makes more sense to look at it in terms of responsiveness, as the user experience seems poor. Given this bug was potentially a duplicate of bug 1655683, which has been closed, maybe we should close this one too. Nicolas, feel free to reopen if this can still be reproduced.