Open Bug 1618129 Opened 4 years ago Updated 1 year ago

Very slow session restore (profile included)

Categories

(WebExtensions :: General, defect)

Unspecified
Linux
defect

Tracking

(Performance Impact:medium, firefox75 affected)

Performance Impact medium
Tracking Status
firefox75 --- affected

People

(Reporter: yoasif, Unassigned)

Details

(Keywords: perf:frontend, perf:startup)

Attachments

(4 files, 1 obsolete file)

Basic information

I have a Firefox profile that has a few extensions in Linux that is markedly slower to start up in Linux than in Windows with a similar setup.

I don't know why that is, so I recorded a profile.


More information

Profile URL: https://drive.google.com/open?id=1P6avNt-oy-LAOk4H19DOmlJUz9pdBiai

Basic systems configuration:

OS version: Ubuntu 20.04

GPU model: Intel(R) HD Graphics 630 (KBL GT2)

Number of cores: 4

Amount of memory (RAM): 24GB

Thanks so much for your help.

Attached file about:support

WebExtensions process is taking lots of time

Component: Performance → General
Product: Core → WebExtensions

There are a ton of extensions installed here:
Reddit Enhancement Suite
Tab Counter Plus
Duplicate Tabs Closer
Ebay Container
ClearURLs
Firefox Pioneer
Bitwarden - Free Password Manager
Google Container
First Party Isolation
Extension source viewer
Page Translator Revised
Twitter Container
Display #Anchors
uBlock Origin
LinkedIn Container
Reddit (Au) Comment Highlights
Amazon Container
Always Right
enhanced-h264ify
Forget Me Not - Forget cookies & other data

Forget Me Not and Bitwarden in particular execute a lot of code in that profile. Just to help narrow this down, can you compare the time to do a session restore in that profile with those two extensions disabled?

Flags: needinfo?(yoasif)

Andrew, it is better with those two extensions disabled https://drive.google.com/open?id=17EpH3FwpiFAKwE29jxlErStsncOZvH4I

One thing I will note (and the reason I opened this bug) is because I run a very similar profile on my Windows dual boot (so same hardware), and Firefox opens up within seconds using the same session restore file.

There is something weird going on in Linux and I figured that was worth exploring.

Flags: needinfo?(yoasif)

Since you're claiming a significant difference in behavior between Windows and Linux, could you provide the profiles for both Windows and Linux? And please share it with perf-html.io link instead of Google Drive.

Flags: needinfo?(yoasif)

And please share it with perf-html.io link instead of Google Drive.

PSA: pref-html.io has moved to profiler.firefox.com

I can't upload these to profiler.firefox.com - it fails when I try.

Florian, what information is useful for diagnosing failures uploading profiles (see comment 7)?

Flags: needinfo?(florian)

I had some trouble grabbing this profile (had to reduce the startup entries because the profiler didn't pick up the profile when trying to grab it).

https://drive.google.com/open?id=1v-fJKCFO4aPmIJEstR_39FMLICgLgzrS

Once again, this failed uploading to profiler.firefox.com

Flags: needinfo?(yoasif)

Looks like the profiler.firefox.com upload failures are simply 413 (request too large)

The extension process is certainly doing a lot of work in the Linux profile, but I'm not sure that explains the extremely long delays all the outgoing web requests are experiencing (including those that are served from the cache). This session includes uBlock Origin, but from eyeballing, it doesn't look like the extension process is so bogged down that uBO can't act on web requests for the 60 seconds (!!) that many initial requests are delayed for.

Adding fxperf to get some more experienced profile-analyzing eyes on this.

Whiteboard: [fxperf]

(In reply to Andrew Swan [:aswan] from comment #10)

Looks like the profiler.firefox.com upload failures are simply 413 (request too large)

Yes, the upload failures are just the profiles being too large.

It's possible upload subsets of the profile though. Eg. https://perfht.ml/2IL3x1W shows the main thread I/O we do in the profile from comment 9 when killing unresponsive content processes.

Flags: needinfo?(florian)

There's many hundreds of setTimeout calls here in pid 3820, all seeming to come from BitWarden. Don't know if I can get the source in any way to make that make sense. It goes through browserActionSetBadgeText, then crypto_service_awaiter. I don't think that the setTimeouts themselves are responsible for the slowness, but I would certainly bet that they are indicative of something going terribly wrong inside BitWarden which is likely responsible for the slowness.

(In reply to Doug Thayer [:dthayer] from comment #12)

There's many hundreds of setTimeout calls here in pid 3820, all seeming to come from BitWarden.
...
I don't think that the setTimeouts themselves are responsible for the slowness, but I would certainly bet that they are indicative of something going terribly wrong inside BitWarden which is likely responsible for the slowness.

Comment 4 includes a profile without bitwarden. Content loads are still taking a very long time. There's still a fair amount of activity in the extension process, but that process isn't pegged at 100% even when content loads are not finishing. It does look like a lot of content processes are exiting (crashing?) in that profile.

Don't know if I can get the source in any way to make that make sense.

I don't think extensions are the issue here but for posterity, you can browse the actual contents that run in the browser here: https://robwu.nl/crxviewer/?crx=https%3A%2F%2Faddons.mozilla.org%2Ffirefox%2Fdownloads%2Ffile%2F3526255%2Fbitwarden_free_password_manager-1.43.3-an%2Bfx.xpi%3Fsrc%3Dsearch

But in this case, the background page alone is 5 MB of webpacked stuff. It is built from https://github.com/bitwarden/browser, that might be easier to read.

I was more just hopeful that the excessive setTimeout time would point to something weird going on. In the profile from comment 4, there's still almost a 60 second window in the WebExtensions process dominated by setTimeout. Is it just normal for extensions to rely on setTimeout for things such that it dominates their startup profile? Feels weird. The Windows profile is also less dominated by setTimeout calls, though they still represent an alarmingly large portion of things.

The priority flag is not set for this bug.
:mixedpuppy, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(mixedpuppy)

One thing I'll note is that launching each of the content processes seems to take over 200ms each on the Linux machine:

https://perfht.ml/2UeiOim

It looks like most of the time is spent scanning fonts?

Compare that to the launching time on Windows:

https://perfht.ml/2UJoekH

Does any of this look familiar, jfkthame?

Flags: needinfo?(jfkthame)

It seems to be spending an unexpectedly-huge amount of time in the parent under ReadSystemFontList, which is supposed to serialize the list of available fonts to pass to the new process. It's not surprising this is called, but it does surprise me to see it take ~200ms... I just profiled a launch here and saw more like 13ms there. I have no idea why it'd be taking so long on the reporter's system.

Flags: needinfo?(jfkthame)

Hi Asif,

How many fonts do you have installed on your system, out of curiosity? Also, do you have any AV software installed and enabled?

I'll also notice that in the profile, it looks like there are multiple windows here... Asif, how many windows do you tend to have open?

Flags: needinfo?(yoasif)
Attached file fc-list.txt

Mike, I have 16 windows open right now, one of which is a SSB which isn't restored on startup (kinda wish it did). This is fairly normal for me, and FWIW it is the same setup I had on Windows when I took the corresponding profile (reused my sessionstore file).

fc-list | wc -l comes back with 2870 which is kind of surprising to me, while locally installed fonts alone (fc-list|grep -i .local | wc -l) comes out to 332.

I have gone ahead and attached my fc-list output to the bug. I'm not sure how unusual this is - most of the fonts seem to be in the "Noto" family (2148!).

Also, no antivirus.

Flags: needinfo?(yoasif)

Does the fc-list data from comment #19 explain why the font initialization would be slow?

Also, is there a project tracking sharing the font list? Because from the profile, it looks like we pay this cost for the parent, the 8 child processes, then the webextension process, etc. Would be great if we could avoid that...

Flags: needinfo?(jfkthame)

Yes, there's work in progress to share the list, which ought to help here; see bug 1533462 and dependencies etc.

Asif's system does have a lot more fonts installed than my Linux box has, so it's not really comparable; I'll try installing a bunch of additional font packages and see how badly it slows down.

Flags: needinfo?(jfkthame)

Jonathan, if you are on an Ubuntu system, I think fonts-noto will get you most of the way there.

On ubuntu 18.04 (my main Linux machine), fonts-noto only installs a few hundred fonts, nothing like your big list.

I also have a VM with 20.04, but am currently having some trouble with its setup.... maybe time to reinstall.

On my end fonts-noto pulls in fonts-noto-core fonts-noto-extra fonts-noto-ui-core fonts-noto-ui-extra fonts-noto-unhinted as well.

sudo apt install fonts-noto
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  fonts-noto-core fonts-noto-extra fonts-noto-ui-core fonts-noto-ui-extra fonts-noto-unhinted
The following NEW packages will be installed:
  fonts-noto fonts-noto-core fonts-noto-extra fonts-noto-ui-core fonts-noto-ui-extra fonts-noto-unhinted
0 upgraded, 6 newly installed, 0 to remove and 18 not upgraded.
Need to get 93.1 MB of archives.
After this operation, 435 MB of additional disk space will be used.

OK, I reinstalled my 20.04 VM and was then able to install fonts-noto (and its dependencies), which gives me a fc-list | wc -l count of 2489; not quite as high, but ought to be enough to be broadly similar.

Two things puzzle me here. First, I'm still not seeing anything like as much time under gfxFcPlatformFontList::ReadSystemFontList in the parent process; your profile shows around 150-200ms there for each child process launch, whereas I'm only seeing tens of ms.

The other thing that bothers me is that the profile seems to show ReadSystemFontList calling the fontconfig function FcNameParse. That makes no sense, AFAICS; ReadSystemFontList uses FcNameUnparse, but not Parse. And my local profile indeed shows it calling the Unparse APIs, and not FcNameParse.

So there's something weird going on in your system, but I have no idea what. :(

Flags: needinfo?(mixedpuppy)

Based on feedback from Mike Conley, I went ahead and captured another profile without any apps open on my machine - I just logged out, logged in and captured a profile of my session restore. I was unable to upload it using the profiler, so you can download it here: https://drive.google.com/open?id=1TsD7AdllL-6kYn6jUM3BEk2NTEHC6-PY

Thanks!

(In reply to Asif Youssuff from comment #26)
Thanks!

I was unable to upload it using the profiler, so you can download it here: https://drive.google.com/open?id=1TsD7AdllL-6kYn6jUM3BEk2NTEHC6-PY

If we keep only the parent process and the WebExtension process, it becomes small enough to upload: https://perfht.ml/3aZR86E

Additional information from the reporter - their Linux box is encrypted using LUKS on a "f2fs" filesystem.

Hey stransky, are you familiar with LUKS? Have you heard anything about users experiencing sluggish Firefox startup with LUKS enabled?

Flags: needinfo?(stransky)

Given that you may suspect that my filesystem may be at issue here, I went ahead and copied my Firefox profile to a RAM disk to see whether it was any faster.

That profile is here: https://drive.google.com/open?id=1j5x0kMyJZU1mfcoptKsWHDiD_mam7X5n

I created a ram disk using mkdir -m0700 /dev/shm/firefox and symlinked that directory to the name of my original Firefox profile, then copied the files from my original profile to the symlinked directory.

Bizarre - the font reads, for example, are still excessive. Does this RAM disk use the LUKS encryption?

Flags: needinfo?(yoasif)

Mike, no -- it is just a straight up RAM disk that I created as I mentioned above - I don't think I did anything wrong in creating it.

There are a couple of more avenues for exploration if you think this is a more core problem to my system - I am using the "xanmod" kernel that is supposed to be optimized for end user speed (less server, more desktop) - I can try removing that.

Also, the Firefox binary was not located on the RAM disk, but I really don't see how that would make much of a difference.

Lastly, I am using "oibaf" bleeding edge graphics drivers on my machine from the latest upstream mesa. That may be causing issues. I can undo both the kernel and the mesa updates and profile again on the RAM disk.

Is it worth putting the Firefox binary on the RAM disk as well?

Flags: needinfo?(yoasif)

Okay, that seems like strong evidence that the file system is unrelated then. I guess we can cross that off the list!

Well, maybe I spoke too soon. Only the user profile went onto the RAM disk. There are likely plenty of other resources being read from disk that aren't from the user profile.

Flags: needinfo?(stransky)

Putting the needinfo back on stransky just in case he's aware of anything related to LUKS that might resemble what we're seeing here.

Flags: needinfo?(stransky)

Recorded a profile using perf. The file is a little too large for bugzilla, grab it from https://drive.google.com/open?id=1O0Bdm2hJ9C0G4QyswSRF6AlMRuYBCOmx

I also converted it to text, per https://github.com/firefox-devtools/profiler/blob/master/docs-user/guide-perf-profiling.md

https://drive.google.com/open?id=1QZEYMvEUMqIO6EY9FRm8x6SHs-wGOFAh

Made some changes since my last profile. No longer running the xanmod kernel, nor am I running the latest mesa code. I reverted both of those to the stock Ubuntu packages.

The perf profile in comment 36 works for me, thanks Asif.

This is really interesting. This looks like pretty different behaviour compared to when running with the Gecko Profiler. For example:

  1. Launching processes doesn't take nearly as long
  2. The WebExtension process seems to have very few samples during the first 10 or so seconds of its process lifetime. I'm not sure if that means its not getting serviced, the sampling thread isn't getting serviced, or if that thread is idle.

Hey jesup, you're more well-versed, I think, at reading perf profiles than I am. The WebExtension process (Process 12033), seems to do very little during the first 10 seconds of its life. Is that because it's not getting CPU time, or the sampler isn't getting CPU time, or does it mean that the thread is idle?

It's also unfortunate that I can't seem to Publish this profile, despite the fact that it seems to compress nicely. :(

ni?ing jesup for comment 38.

Flags: needinfo?(rjesup)

(In reply to Mike Conley (:mconley) (:⚙️) from comment #33)

Okay, that seems like strong evidence that the file system is unrelated then. I guess we can cross that off the list!

Well, maybe I spoke too soon. Only the user profile went onto the RAM disk. There are likely plenty of other resources being read from disk that aren't from the user profile.

In particular, presumably all the user's installed fonts are still on the encrypted disk.

But I still don't understand at all why the parent process, when launching a child, is spending all that time calling FcNameParse (comment 25). That doesn't make any sense -- unless the reporter's symbols are wrong for the fontconfig library?

I used this tool to measure the performance:

https://askubuntu.com/questions/87035/how-to-check-hard-disk-performance

and I tested luks/regular partition, both on SSD drive. luks partition gives me 1GB/s read speed and regular partition gives me 2GB/s read speed. So there's definitely a difference here.

Flags: needinfo?(stransky)

Asif, can you try the benchmark from https://askubuntu.com/questions/87035/how-to-check-hard-disk-performance ? The tool is named "Disks" on Fedora (and maybe on some later Ubuntu too).

Flags: needinfo?(yoasif)
Attached image Screenshot from 2020-04-07 04-54-44.png (obsolete) —

Benchmark attached.

Flags: needinfo?(yoasif)

(In reply to Asif Youssuff from comment #43)

Benchmark attached.

Is that really the luks file system? Looks like a regular one to me. You should run the test on a particular partition, not on the device itself.

Flags: needinfo?(yoasif)

Benchmark attached; didn't realize I could benchmark the LUKS block device, so redid the old benchmark.

Attachment #9138739 - Attachment is obsolete: true

Also attaching my root partition in case that was relevant (I have a separate home).

Flags: needinfo?(yoasif)

That looks fast enough to me....

(In reply to Mike Conley (:mconley) (:⚙️) from comment #38)

Hey jesup, you're more well-versed, I think, at reading perf profiles than I am. The WebExtension process (Process 12033), seems to do very little during the first 10 seconds of its life. Is that because it's not getting CPU time, or the sampler isn't getting CPU time, or does it mean that the thread is idle?

The webextension process was also mostly idle for the first 17s on the profile from the gecko profiler: https://perfht.ml/3aZR86E

It's also unfortunate that I can't seem to Publish this profile, despite the fact that it seems to compress nicely. :(

Here's the profile with only the parent and webextension processes. I think the url is too long to generate a tiny url, but the profile was published successfully: https://profiler.firefox.com/public/471c30fe1316585866df47c633b538dae41f5266/flame-graph/?globalTrackOrder=0-1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-18-19-20-21-22-23-24-25-26-27-28-29-30-31-32-33-34-35-36-37-38-39-40-41-42&hiddenGlobalTracks=1-2-3-4-5-6-7-8-10-11-12-13-14-15-16-17-18-19-20-21-22-23-24-25-26-28-27-29-30-31-32-33-34-35-36-37-38-39-40-41-42&localTrackOrderByPid=11808-0-1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-18-19-20-21-22-23-24-25-26-27-28-29-30-31-32-33-34-35-36-37-38-39-40-41-42-43-44-45-46-47-48-49-50-51-52-53-54-55-56-57-58-59-60-61-62-63-64-65-66-67-68-69-70-71-72-73-74-75-76-77-78-79-80-81-82-83-84-85-86-87-88-89-90-91-92-93-94-95-96-97-98-99-100-101-102-103-104-105-106-107-108-109-110-111-112-113-114-115-116-117-118-119-120-121-122-123-124-125-126-127-128-129-130-131-132-133-134-135-136-137-138-139-140-141-142-143-144-145-146-147-148-149-150-151-152-153-154-155-156-157-158-159-160-161-162-163-164-165-166-167-168-169-170-171-172-173-174-175-176-177-178-179-180-181-182-183-184-185-186-187-188-189-190-191-192-193-194-195-196-197-198-199-200-201-202-203-204-205-206-207-208-209-210-211-212-213-214-215-216-217-218-219-220-221-222-223-224-225-226-227-228-229-230-231-232-233-234-235-236-237-238-239-240-241-242-243-244-245-246-247-248-249-250-251-252-253-254-255-256-257-258-259-260-261-262-263-264-265-266-267-268-269-270-271-272-273-274-275-276-277-278-279-280-281-282-283-284-285-286-287-288-289-290-291-292-293-294-295-296-297-298-299-300-301-302-303-304-305-306-307-308-309-310-311-312-313-314-315-316-317-318-319-320-321-322-323-324-325-326-327-328-329-330-331-332-333-334-335-336-337-338-339-340-341-342-343-344-345-346-347-348-349-350-351-352-353-354-355-356-357-358-359-360-361-362-363-364-365-366-367-368-369-370-371-372~11811-0~11812-0-1-2-3-4-5-6-7-8~11884-0-1-2-3-4-5-6~11906-0~11921-0~11920-0-1-2-3-4-5-6~11976-0-1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-18-19-20-21-22-23-24-25-26-27-28-29-30-31-32-33-34-35-36-37-38-39-40-41-42-43-44~11977-0~12033-0-1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-18-19-20-21-22-23-24-25-26-27-28-29-30-31-32-33-34-35-36-37-38-39-40-41-42-43-44-45-46-47~12034-0~12092-0-1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-18-19-20-21-22-23-24-25-26-27-28-29-30-31-32-33-34-35-36-37-38-39-40-41-42-43-44-45-46-47-48-49-50-51-52-53-54-55-56-57~12105-0~12104-0-1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-18-19-20-21-22-23-24-25-26-27-28-29-30-31-32-33-34-35-36-37-38-39-40-41-42-43-44-45~12132-0-1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-18-19-20-21-22-23-24-25-26-27-28-29-30-31-32-33-34-35-36-37-38-39-40-41-42-43-44-45-46-47-48-49-50-51-52-53-54-55-56~12133-0~12093-0~12179-0-1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-18-19-20-21-22-23-24-25-26-27-28-29-30-31-32-33-34-35-36-37-38-39-40-41-42-43-44-45-46-47-48-49-50-51-52-53-54-55-56-57-58-59-60-61-62-63~12213-0-1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-18-19-20-21-22-23-24-25-26-27-28-29-30-31-32-33-34-35-36-37-38-39-40-41-42-43-44-45-46-47-48~12214-0~12250-0-1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-18-19-20-21-22-23-24-25-26-27-28-29-30-31-32-33-34-35-36-37-38-39-40-41-42-43-44-45-46-47-48-49-50-51-52-53-54-55-56-57~12251-0~12180-0~12268-0-1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-18-19-20-21-22-23-24-25-26-27-28-29-30-31-32-33-34-35-36-37-38-39-40-41-42-43-44-45-46~12269-0~12454-0-1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-18-19-20-21-22-23-24-25-26-27-28-29-30-31-32-33-34-35-36-37-38-39-40-41-42-43-44-45-46-47-48-49-50~12455-0~12724-0~12726-0~12728-0~12729-0~12730-0~12731-0~12733-0~12735-0~12737-0~12738-0~12739-0~12740-0-1~12743-0-1-2-3-4~12749-0-1~12752-0-1~13006-0~&thread=0&timelineType=stack&v=4

fxperf:p2 while we're still figuring this out, we can re-evaluate once we've figured out a/the/some culprit(s).

Whiteboard: [fxperf] → [fxperf:p2]

The thread is idle during that time; the master process mainthread is pretty busy (~70%?), but you have multiple cores I presume. Also if a thread is running-but-suspended, it will show as running in perf, and it's not being shown as running in the webextensions process

Flags: needinfo?(rjesup)
Performance Impact: --- → P2
Whiteboard: [fxperf:p2]
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.