Closed Bug 1748076 Opened 2 years ago Closed 1 year ago

Packaged Snap version starts 6 times slower than APT version on Ubuntu 21.10

Categories

(Release Engineering :: Release Automation: Snap, defect)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: klaus.purer, Unassigned)

Attachments

(2 files)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:95.0) Gecko/20100101 Firefox/95.0

Steps to reproduce:

Ubuntu 21.10 installs the Snap version of Firefox by default (as requested by Mozilla according to https://news.itsfoss.com/ubuntu-firefox-snap-default/ ). This makes a Firefox cold start really slow: 12 seconds on my computer. You can still install the APT version of Firefox on Ubuntu 21.10, where a cold start takes only 2 seconds.

Actual results:

The Snap version of Firefox is slow: a cold start takes 12 seconds.

Expected results:

The Snap version of Firefox should start much faster, comparable to the APT version. How can we improve the Snap platform so that applications are not 6 times slower than their native APT counterparts?

This doesn't necessarily seem like a defect but more of an enhancement.
I'll change this to an enhancement, and maybe someone from our dev team will pick it up when they have the time.

Status: UNCONFIRMED → NEW
Type: defect → enhancement
Component: Untriaged → Installer
Ever confirmed: true

I'm not sure that I agree. A serious performance regression sounds like a defect to me. And I would categorize this as a serious performance regression.

But I'm not sure that there is much we can do about this from the Install/Update side. Maybe the Startup people might have thoughts about what we can do about this?

Type: enhancement → defect
Component: Installer → Startup and Profile System
Product: Firefox → Toolkit

Yes I would agree that a 12 second startup on hardware that we know can achieve 2 seconds is a defect. I'm not sure we can do anything until we know what is actually slow though. Has anyone taken a startup profile to understand where the difference is?

The severity field is not set for this bug.
:mossop, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(dtownsend)

Olivier, FYI.

Component: Startup and Profile System → Release Automation: Snap
Flags: needinfo?(dtownsend) → needinfo?(olivier)
Product: Toolkit → Release Engineering
QA Contact: mtabara
Version: Firefox 95 → unspecified

I wonder if we're somehow seeing the langpack process that we worked around for MSIX builds: https://bugzilla.mozilla.org/show_bug.cgi?id=1726214?

We could really use a Gecko profile here. I don't see great instructions written down, but I think that setting MOZ_PROFILER_STARTUP=1 and setting MOZ_PROFILER_SHUTDOWN=/path/to/profile.json should be enough -- see https://searchfox.org/mozilla-central/source/intl/benchmarks/README.md.
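As a rough illustration of the environment-variable approach, here is a minimal Python sketch that launches a command with the startup profiler enabled. It assumes the shutdown variable is spelled MOZ_PROFILER_SHUTDOWN, as in the Gecko profiler documentation, and that the output path is writable from inside the snap's confinement. The helper names and the example path are hypothetical.

```python
import os
import subprocess


def profiler_env(profile_path, base_env=None):
    """Return a copy of the environment with the startup profiler enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Start the Gecko profiler as early as possible during startup.
    env["MOZ_PROFILER_STARTUP"] = "1"
    # Assumption: on shutdown the captured profile is written to this path.
    env["MOZ_PROFILER_SHUTDOWN"] = profile_path
    return env


def launch_with_startup_profile(command, profile_path):
    """Run a command such as ["snap", "run", "firefox"] with profiling on."""
    return subprocess.run(command, env=profiler_env(profile_path), check=False)


# Example (not executed here): quit Firefox once the window is shown,
# then load the resulting JSON file at https://profiler.firefox.com/.
# launch_with_startup_profile(
#     ["snap", "run", "firefox"],
#     os.path.expanduser("~/Downloads/startup-profile.json"),
# )
```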

Flags: needinfo?(klaus.purer)

> I wonder if we're somehow seeing the langpack process that we worked around for MSIX builds: https://bugzilla.mozilla.org/show_bug.cgi?id=1726214?

That's an interesting thought. Certainly might be. Once Olivier responds, we can look into that.

Olivier: Basically we should move the langpacks into their own subdirectories. Then we only install the ones that correspond to the language of the operating system. We'll need to update the locale service to detect that we're packaged on Linux.

Snap is notoriously slow to start up in general. There may be nothing we can do about it.

Klaus, can you confirm that it's every cold start that is slow, not just the very first time the snap is run?

(In reply to Mike Kaply [:mkaply] from comment #7)

> I wonder if we're somehow seeing the langpack process that we worked around for MSIX builds: https://bugzilla.mozilla.org/show_bug.cgi?id=1726214?
>
> That's an interesting thought. Certainly might be. Once Olivier responds, we can look into that.
>
> Olivier: Basically we should move the langpacks into their own subdirectories. Then we only install the ones that correspond to the language of the operating system. We'll need to update the locale service to detect that we're packaged on Linux.

I've just rebuilt the stable snap after moving the langpacks into their own subdirectories, but they weren't found by the application at startup. Can you point me to where the change needs to happen for this to be effective on Linux?

Flags: needinfo?(olivier) → needinfo?(mozilla)

> Can you point me to where the change needs to happen for this to be effective on Linux?

We're going to need to update the code in the locale service that detects if we're a packaged app. Right now it's Windows only.

https://searchfox.org/mozilla-central/source/intl/locale/LocaleService.cpp#61

Flags: needinfo?(mozilla)

Before we try and optimize the copying/loading of langpacks, it would be useful to know for sure whether this is the main startup bottleneck.

As Mike pointed out in comment #8, there is some startup overhead induced by snapd, which needs to be investigated separately.

Olivier, does the Firefox profiler work on snap? With a startup profile like the one described above, it should be pretty easy to figure out where the Firefox-specific startup time is going.

Flags: needinfo?(olivier)

(In reply to Olivier Tilloy from comment #13)

> This has been reported on the snapcraft forum too: https://discourse.ubuntu.com/t/feature-freeze-exception-seeding-the-official-firefox-snap-in-ubuntu-desktop/24210/188.

Thanks Olivier for pointing me here!

As additional information, the startup performance of Firefox as a Flatpak is really close to native on my system (5.4 | 3.6 s, compared to ~3.2 s as Deb). I thought I'd mention this in case the way the Flatpak version does it gives some hints on how to optimize startup of the Snap version.

Emilio, I'm happy to report that the profiler works nicely on the snap. I'll play with it to try and capture a useful profile and see if I can get clues as to what is taking time at startup.

Flags: needinfo?(olivier)

Klaus (or anyone else seeing a significant difference in startup time with the snap), in order to measure the overhead induced by snapd, could you run the following command (cold start):

snap run --trace-exec firefox

then close the app as soon as the window is shown, and share the output here?

Mine looks like this:

Slowest 10 exec calls during snap run:
  0.561s snap-update-ns
  0.592s /usr/lib/snapd/snap-confine
  1.090s /snap/firefox/1052/usr/lib/firefox/firefox
  1.073s /snap/firefox/1052/usr/lib/firefox/firefox
  1.114s /snap/firefox/1052/usr/lib/firefox/firefox
  1.336s /snap/firefox/1052/usr/lib/firefox/firefox
  1.052s /snap/firefox/1052/usr/lib/firefox/firefox
  1.233s /snap/firefox/1052/usr/lib/firefox/firefox
  1.225s /snap/firefox/1052/usr/lib/firefox/firefox
  3.476s /snap/firefox/1052/usr/lib/firefox/firefox
Total time: 8.071s

So snap-update-ns and snap-confine add a bit over one second of overhead (0.561 s + 0.592 s), which is not negligible, but not too terrible either. That's on a beefy desktop machine though, so I'd be interested in seeing what it looks like on other configurations, especially those where it is reportedly slow.

Here is my summary:

Slowest 10 exec calls during snap run:
1.113s snap-update-ns
1.188s /usr/lib/snapd/snap-confine
0.105s /snap/firefox/1025/snap/command-chain/snapcraft-runner
0.832s /snap/firefox/1025/snap/command-chain/desktop-launch
0.068s /usr/bin/cut
0.089s /snap/firefox/1025/firefox.launcher
0.114s /usr/bin/dbus-send
0.191s /usr/bin/xdg-settings
5.676s /snap/firefox/1025/usr/lib/firefox/firefox
1.592s /snap/firefox/1025/usr/lib/firefox/firefox
Total time: 28.519s

Since there were a number of error messages, I have also uploaded the complete output as an attachment.

Here is a profile for the snap's cold start on my beefy machine: https://share.firefox.dev/3pvvW1S.
I lack experience in reading and interpreting profiles, but it looks quite suspicious to me that in the interval [2.0s - 3.0s] nothing seems to be happening. Help appreciated to interpret this!

(In reply to Jan Rathmann from comment #18)

> Slowest 10 exec calls during snap run:
> 1.113s snap-update-ns
> 1.188s /usr/lib/snapd/snap-confine
> 0.105s /snap/firefox/1025/snap/command-chain/snapcraft-runner
> 0.832s /snap/firefox/1025/snap/command-chain/desktop-launch
> 0.068s /usr/bin/cut
> 0.089s /snap/firefox/1025/firefox.launcher
> 0.114s /usr/bin/dbus-send
> 0.191s /usr/bin/xdg-settings
> 5.676s /snap/firefox/1025/usr/lib/firefox/firefox
> 1.592s /snap/firefox/1025/usr/lib/firefox/firefox
> Total time: 28.519s

Thanks Jan. So the total overhead is 3.7 seconds. Hopefully there's room for performance improvements in snap-update-ns and snap-confine, I'll forward this data to the snapd folks. It would also be interesting to look into optimizing the desktop-launch script.
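For anyone who wants to reproduce that overhead figure from their own trace, here is a minimal Python sketch. It assumes the "Slowest 10 exec calls" listing has the `<seconds>s <command>` shape shown in this bug, and it treats every entry other than the Firefox binary as launch overhead. The function names are made up for illustration.

```python
import re

# Listing copied from comment #18.
TRACE_OUTPUT = """\
Slowest 10 exec calls during snap run:
1.113s snap-update-ns
1.188s /usr/lib/snapd/snap-confine
0.105s /snap/firefox/1025/snap/command-chain/snapcraft-runner
0.832s /snap/firefox/1025/snap/command-chain/desktop-launch
0.068s /usr/bin/cut
0.089s /snap/firefox/1025/firefox.launcher
0.114s /usr/bin/dbus-send
0.191s /usr/bin/xdg-settings
5.676s /snap/firefox/1025/usr/lib/firefox/firefox
1.592s /snap/firefox/1025/usr/lib/firefox/firefox
Total time: 28.519s
"""

# Matches lines such as "  1.113s snap-update-ns"; header and total lines do not match.
EXEC_LINE = re.compile(r"^\s*(\d+\.\d+)s\s+(\S+)\s*$")


def parse_exec_times(text):
    """Return (seconds, command) pairs from the exec-call listing."""
    pairs = []
    for line in text.splitlines():
        match = EXEC_LINE.match(line)
        if match:
            pairs.append((float(match.group(1)), match.group(2)))
    return pairs


def launch_overhead(pairs, app_suffix="/usr/lib/firefox/firefox"):
    """Sum the time spent in everything except the application binary."""
    return sum(secs for secs, cmd in pairs if not cmd.endswith(app_suffix))


pairs = parse_exec_times(TRACE_OUTPUT)
print(f"overhead before Firefox itself: {launch_overhead(pairs):.1f}s")  # prints 3.7s
```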

Hi folks, please note that using snap run --trace-exec will significantly slow down all processes that are executed in the snap, so it's not fair to compare the numbers from a snap run --trace-exec firefox run with those of a normal launch via e.g. snap run firefox.

I actually developed a tool to measure the startup times of graphical snaps, https://github.com/canonical/etrace (snap install etrace --classic --candidate), which takes these sorts of things into account. I see the overall time to launch the snap the very first time as around 12 seconds (this is without tracing):

$ etrace exec -n=5 --no-trace --silent --cold firefox
Total startup time: 11.153576956
Total startup time: 11.675196624
Total startup time: 11.669658027
Total startup time: 12.155115321
Total startup time: 12.187042997

And with full caching turned on/left enabled:

$ etrace exec -n=5 --no-trace --silent --hot firefox
Total startup time: 2.033697433
Total startup time: 1.5373122320000001
Total startup time: 1.53470489
Total startup time: 1.540143133
Total startup time: 1.536155384

If I also install the firefox deb on my Impish machine (same as the snap measurements above), I can see that the deb launches in very close to the same amount of time:

$ etrace exec -n=5 --no-trace --silent /usr/bin/firefox
Total startup time: 1.538723434
Total startup time: 2.077962618
Total startup time: 1.5463259919999999
Total startup time: 1.551169862
Total startup time: 1.547706853

We can also measure the firefox flatpak with etrace and get basically the same startup time in the "hot cache" case:

$ etrace --use-flatpak-run --class-name=firefox exec --hot -n=5 --no-trace --silent org.mozilla.firefox
Total startup time: 1.528740561
Total startup time: 1.5433418749999999
Total startup time: 1.548117566
Total startup time: 1.539153158
Total startup time: 1.530748847

So the bug here is really, very specifically, that the first launch of the firefox snap is slow, and indeed it is known that some snaps can be slower to launch the first time than with other packaging formats.
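To compare runs like these without eyeballing them, a small Python sketch along the following lines could summarize the etrace output. It assumes only the "Total startup time: <seconds>" line format shown above and uses the median so that a single outlier run does not skew the result. The function name is made up.

```python
import re
import statistics

# Output copied from the --cold and --hot snap runs above.
COLD_RUNS = """\
Total startup time: 11.153576956
Total startup time: 11.675196624
Total startup time: 11.669658027
Total startup time: 12.155115321
Total startup time: 12.187042997
"""

HOT_RUNS = """\
Total startup time: 2.033697433
Total startup time: 1.5373122320000001
Total startup time: 1.53470489
Total startup time: 1.540143133
Total startup time: 1.536155384
"""


def startup_times(output):
    """Extract the startup times (in seconds) from etrace output."""
    values = re.findall(r"^Total startup time: ([0-9.]+)$", output, flags=re.MULTILINE)
    return [float(value) for value in values]


cold = startup_times(COLD_RUNS)
hot = startup_times(HOT_RUNS)
print(f"cold median: {statistics.median(cold):.2f}s")
print(f"hot median:  {statistics.median(hot):.2f}s")
print(f"cold/hot:    {statistics.median(cold) / statistics.median(hot):.1f}x")
```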

(note that currently etrace only supports X11, so all the measurements were performed on X11, but I hope to support Wayland too soon)

Thanks,
Ian

(In reply to person.uwsome from comment #21)

> So the bug here is really, very specifically, that the first launch of the firefox snap is slow, and indeed it is known that some snaps can be slower to launch the first time than with other packaging formats.

The problem is not only that a "cold" start of Firefox as Snap is very slow, but also that a "warm" startup is much slower than Deb. Here are my measurements that I first posted on Ubuntu Discourse:

Snap cold start: 20 | 23 | 19
Snap warm start: 8.8 | 9.9 | 9.8 | 10.1
Deb cold start: 3 | 3.2 | 3.2
Deb warm start: 2.2 | 2.3 (All times in seconds)

That means a ~6-7 times slower cold start and a ~4 times slower warm start. All measurements are with a fresh profile and by plainly running 'snap run firefox' (without --trace-exec). And as mentioned, I get nearly the same startup time as native from Firefox as Flatpak, so it seems to me there is just something going wrong on the Snap side.
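The slowdown factors can be checked with a few lines of Python. This sketch takes the ratio of mean startup times, which is one reasonable way to arrive at those figures and not necessarily how they were originally computed.

```python
from statistics import mean

# Startup times in seconds, copied from the measurements above.
TIMES = {
    ("snap", "cold"): [20, 23, 19],
    ("snap", "warm"): [8.8, 9.9, 9.8, 10.1],
    ("deb", "cold"): [3, 3.2, 3.2],
    ("deb", "warm"): [2.2, 2.3],
}


def slowdown(kind):
    """Ratio of the mean snap startup time to the mean deb startup time."""
    return mean(TIMES[("snap", kind)]) / mean(TIMES[("deb", kind)])


for kind in ("cold", "warm"):
    print(f"{kind} start: snap is {slowdown(kind):.1f}x slower than deb")
```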

Really interesting that you can't reproduce the slow startup to that degree. Here is the setup I have used:

  • fresh Kubuntu Jammy installation
  • Plasma Wayland session
  • fresh Firefox profiles

I did another test by running snapped Firefox in a Gnome session (Wayland) on a different test installation of 22.04.

  • The "warm" startup time is much better on this setup, nearly the same as Flatpak or native.
  • But the "cold" startup time (I used "echo 3 > /proc/sys/vm/drop_caches") is roughly identical to my previous test, very slow (around 20s).
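For repeatable cold-start measurements, the cache drop can be scripted. The following Python sketch does the same thing as the echo command above, with a sync first so that dirty pages are written out and can actually be dropped. It needs root on a real system, and the function name and its parameter are made up for illustration.

```python
import os


def drop_page_cache(control_file="/proc/sys/vm/drop_caches"):
    """Flush dirty pages, then ask the kernel to drop page cache, dentries and inodes."""
    # Write pending data to disk first so that clean pages can be dropped.
    os.sync()
    with open(control_file, "w") as handle:
        # Same effect as: echo 3 > /proc/sys/vm/drop_caches
        handle.write("3\n")


# Example (requires root):
# drop_page_cache()
```

Calling this before every launch keeps each run a cold start in the sense used in this bug.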

I ran 'top' in a terminal to observe process activity during this slow startup, and there were two Firefox processes that together used ~25% of my CPU time (running both on 1 of 4 cores?), without anything else causing any remarkable CPU load.

Another test with a fresh installation of Kubuntu to verify my observations:

  • "warm" startup is now close to native also on Plasma
  • "cold" startup still very slow, nothing changed

Tested both on Plasma Wayland and X11.

I'm testing the firefox snap in the candidate channel (100.0-1) that's built with LTO and PGO, and I am not seeing the one second interval where nothing was happening that I mentioned in comment 19. This would need more advanced testing to confirm that indeed these build-time optimizations positively affect the startup time, but it looks promising.

Updated to Ubuntu 22.04 today and the Firefox Snap version startup time is similarly slow, if not slower. Between 15 and 20 seconds on my AMD Ryzen 5 3600 6-Core Processor.

For comparison I installed the APT packaged version from ppa:mozillateam/ppa according to https://ubuntuhandbook.org/index.php/2022/04/install-firefox-deb-ubuntu-22-04/ and the startup time for that is 2 seconds.

Firefox no longer exists as an APT package in the official Ubuntu 22.04 sources. It is really unfortunate that users now have to fall back to a PPA version to have a performant Firefox experience. I worry about the security risk that users get stuck on an outdated version of Firefox if that PPA is no longer maintained at some point in the future.

Would it be possible to maintain an official Firefox version in the Ubuntu sources again until the performance problems with Snap are solved?

An alternative would be to use the Flatpak version of Firefox. It seems to be officially supported by Mozilla, so that is a plus from the security perspective. The downside is that this is yet another package manager that I need to make sure is doing updates alongside the regular Ubuntu system security updates.

Flags: needinfo?(klaus.purer)

I can't compile to test, but I think this patch should work for the snap assuming that we get the proper locale from the operating system.

Besides this change, you should package the locales into their own sub-directories, like this:

https://searchfox.org/mozilla-central/source/python/mozbuild/mozbuild/repackaging/msix.py#522

Put each locale in its own sub-directory under distribution, called locale-{localename}.
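To make the intended layout concrete, here is a small Python sketch that maps locales to their per-locale directories. Only the distribution/locale-{localename} directory convention comes from this bug. The langpack file name template is a placeholder and the function names are hypothetical, so check msix.py for the real naming.

```python
from pathlib import PurePosixPath


def locale_dir(locale):
    """Directory for one locale, following the locale-{localename} convention."""
    return PurePosixPath("distribution") / f"locale-{locale}"


def plan_langpack_layout(locales, filename_template="langpack-{locale}.xpi"):
    """Map each locale to the path its langpack would be packaged at.

    filename_template is a placeholder; the real langpack file name may differ.
    """
    return {
        locale: locale_dir(locale) / filename_template.format(locale=locale)
        for locale in locales
    }


layout = plan_langpack_layout(["de", "fr", "pt-BR"])
for locale, path in layout.items():
    print(f"{locale}: {path}")
```

With one directory per locale, the application only has to look at the directory matching the operating system's locale.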

Assignee: nobody → mozilla
Flags: needinfo?(olivier)
Assignee: mozilla → nobody

We're going to track the locale stuff here:

https://bugzilla.mozilla.org/show_bug.cgi?id=1297520

Flags: needinfo?(olivier)

Where do we stand here? Is there anything actionable left to do on this bug?

Flags: needinfo?(olivier)
Flags: needinfo?(mozilla)

I think we can close this at this point and open specific issues going forward.

We have closed this gap immensely.

Status: NEW → RESOLVED
Closed: 1 year ago
Flags: needinfo?(olivier)
Flags: needinfo?(mozilla)
Resolution: --- → FIXED