Disable startup cache generation

RESOLVED FIXED in Firefox 55

Status

enhancement
RESOLVED FIXED
2 years ago
4 months ago

People

(Reporter: catlee, Assigned: catlee)

Tracking

(Blocks 1 bug)

unspecified
mozilla55
Dependency tree / graph

Firefox Tracking Flags

(firefox55 fixed)

Details

Attachments

(1 attachment)

(Assignee)

Description

2 years ago
Jeff and I were looking at why the 52.0 -> 52.0.1 partial updates where 11MB large (for win64 en-US) where very little code had changed between the builds.

One of the largest contributors to the partial updates here was the omni.ja diff. We know that diffing omni.ja can be inefficient (bug 772868) because you're diffing compressed contents.

The startup cache files in jsloader/ are particularly bad though. If you unpack omni.ja for 52.0 and 52.0.1, and run bsdiff on each pair of the raw files and compare that to bsdiff of deflated files, you can calculate a ratio of patch size of deflated content vs patch size of raw content.

The jsloader files have patch sizes of up to 200x larger than patches of the raw files.

Looking closely at differences in these files, we can see that in the case of 52.0 -> 52.0.1, the bulk of these files are unchanged. However, the buildid of the build is encoded in the first 18 bytes of the file. Because some initial bytes are different, all of the following compressed output will also be different.

By resetting the compression stream after processing the initial 18 bytes containing the buildid, we can ensure that the bulk of the compressed contents are the same.

Doing this reduces the size of the win64 partial update from 52.0 to 52.0.1 from 11MB to 8MB.
Comment hidden (mozreview-request)
(Assignee)

Updated

2 years ago
Attachment #8851767 - Flags: review?(mh+mozilla)
Can't we just kill the startupcache already? There's been talk about that for a long while, and all we keep doing is trying to work around it here and there instead of just doing it.

If we don't remove the startup cache, then ISTM we should instead try to remove the buildid from those files, instead of doing horrible contorsions at the packaging level.
(Assignee)

Comment 4

2 years ago
I'd love do.

Doing that results in huge startup IO regressions on Talos though. Who can make the call of whether we are willing to take the hit?
(In reply to Chris AtLee [:catlee] from comment #4)
> I'd love do.
> 
> Doing that results in huge startup IO regressions on Talos though.

That doesn't make much sense. Small startup IO regressions might be expected, not huge ones.
(Assignee)

Comment 6

2 years ago
Weird, the first time I tested I got huge tp5n nonmain_startup_fileio opt on win32. Looks like that could have been bug 1274018.

I did a new try push https://treeherder.mozilla.org/#/jobs?repo=try&revision=e2a4155360d36010c95b9bcc7fee731430ee4de7 to completely get rid of the startup cache.

I also see the nonmain_startup_fileio regression. https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=9577ddeaafd8&newProject=try&newRevision=e2a4155360d36010c95b9bcc7fee731430ee4de7&framework=1
(Assignee)

Comment 7

2 years ago
(In reply to Mike Hommey [:glandium] from comment #3)
> If we don't remove the startup cache, then ISTM we should instead try to
> remove the buildid from those files, instead of doing horrible contorsions
> at the packaging level.

The buildid is encoded as part of a version check operation, presumably to ensure that the bytecode is only used by the correct version of the JS VM.
(In reply to Chris AtLee [:catlee] from comment #7)
> (In reply to Mike Hommey [:glandium] from comment #3)
> > If we don't remove the startup cache, then ISTM we should instead try to
> > remove the buildid from those files, instead of doing horrible contorsions
> > at the packaging level.
> 
> The buildid is encoded as part of a version check operation, presumably to
> ensure that the bytecode is only used by the correct version of the JS VM.

... and that's only really necessary for the startup cache in the user profile, not in the files shipped in omni.ja.
(Assignee)

Comment 10

2 years ago
Can you get the JS VM to stop emitting these bytecodes?
Nothing is impossible. But how about we only start thinking about this if we can't disable startupcache?
(Assignee)

Comment 12

2 years ago
No significant regressions detected:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=e2a4155360d3&newProject=try&newRevision=9a67ebd76fcb267fb8a12fd5362d8f768e19976a&framework=1&showOnlyImportant=0

I think we should disable this on Nightly for all platforms and then look at Telemetry data.
Comment hidden (mozreview-request)
(Assignee)

Updated

2 years ago
Assignee: nobody → catlee
Summary: Specialize compression for startup cache files inside omni.ja → Disable startup cache generation
(Assignee)

Updated

2 years ago
See Also: → 1352595

Comment 15

2 years ago
mozreview-review
Comment on attachment 8851767 [details]
Bug 1351071: Get rid of pre-generated startup cache

https://reviewboard.mozilla.org/r/123994/#review128806

::: commit-message-9577d:1
(Diff revision 2)
> +Bug 1351071: Get rid of startup cache r=glandium

We're not actually removing the startup cache, so let's go with something like "Get rid of pre-generated startup cache"

::: python/mozbuild/mozpack/test/test_packager_formats.py:411
(Diff revision 2)
> -            self.assertTrue(
> +            self.assertFalse(
>                  is_resource(base, 'jsloader/resource/gre/modules/foo.jsm'))

You should just remove this assert completely.
Attachment #8851767 - Flags: review?(mh+mozilla) → review+
Comment hidden (mozreview-request)

Comment 17

2 years ago
Pushed by catlee@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0f320509eddb
Get rid of pre-generated startup cache r=glandium

Comment 18

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/0f320509eddb
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 55
:catlee Could you confirm this also reduces time for the build process? There seem to be improvements here also: https://treeherder.mozilla.org/perf.html#/alerts?id=5846
Flags: needinfo?(catlee)
(Assignee)

Comment 20

2 years ago
Yeah, I'm not surprised it slightly improves build times.
Flags: needinfo?(catlee)

Comment 21

2 years ago
We're trying to find the root cause of a 1.7ms in the 95th percentile input event responsiveness telemetry probe from the parent process in bug 1362094 and this patch is one of the three possible culprits that we currently suspect as the possible root causes.

Chris, do you think this could be the possible cause of bug 1362094?  Would you be OK with a temporary backout of this bug to see if the regression is reclaimed on telemetry?
Blocks: 1362094
Flags: needinfo?(catlee)
(Assignee)

Comment 22

2 years ago
AIUI not providing a startup cache means that we will pay a one-time cost per version of Firefox where we generate the precompiled versions of these files locally. I would have expected that to show up in startup times rather than input event responsiveness, but anything is possible.

Please feel free to backout to verify via telemetry.

A few notes:
* We were planning on disabling this for the cross-compiled OSX builds anyway (bug 1314154)
* If we do back this out permanently, I would like to revisit my idea for customizing the compression of these files to get more efficient partial updates (comment #0 on this bug).
Flags: needinfo?(catlee)
It seems unlikely that this could cause input event responsiveness delay. Startup cache is a one-time thing, that this bug moves from build-time to run-time. I don't see how this one-time thing could have an impact on recurring things.

Comment 24

2 years ago
It's true that generation of the startup cache is a one-time thing, but its presence affects things by quite a bit because of conditions like this: <https://searchfox.org/mozilla-central/rev/224cc663d54085994a4871ef464b7662e0721e83/js/xpconnect/loader/mozJSComponentLoader.cpp#686> which makes us to almost no work when the startup cache is present and you call Cu.import() which our chrome code does a lot.  So if hypothetically the generation of the startup cache fails on our users machines for whatever reason at startup, then they will always be taking the slow path here, and that may be a theory that could explain the slowness we have been measuring.

Chris, if this makes sense, I'd appreciate if you can back out at your convenience.  Thanks!
How would it fail on users machines?
Also note there's two level of cache, in-memory and on-disk. And the on-disk cache has two levels on its own, one of which was removed in this bug.

Comment 27

2 years ago
(In reply to Mike Hommey [:glandium] from comment #25)
> How would it fail on users machines?

I don't know.
Status: RESOLVED → REOPENED
Flags: needinfo?(catlee)
Resolution: FIXED → ---

Comment 29

2 years ago
(In reply to :Ehsan Akhgari (super long backlog, slow to respond) from comment #24)
> It's true that generation of the startup cache is a one-time thing, but its
> presence affects things by quite a bit because of conditions like this:
> <https://searchfox.org/mozilla-central/rev/
> 224cc663d54085994a4871ef464b7662e0721e83/js/xpconnect/loader/
> mozJSComponentLoader.cpp#686> which makes us to almost no work when the
> startup cache is present and you call Cu.import() which our chrome code does
> a lot.  So if hypothetically the generation of the startup cache fails on
> our users machines for whatever reason at startup, then they will always be
> taking the slow path here, and that may be a theory that could explain the
> slowness we have been measuring.

FTR, as bug 1363384 comment 1 shows, if this hypothetical happens, apparently we can end up doing main thread IO.  :-(
(Assignee)

Updated

2 years ago
Flags: needinfo?(catlee)

Updated

2 years ago
See Also: → 1364235

Comment 30

2 years ago
I don't think this backout has shown any change in the telemetry data, we should reland this patch...
Keywords: checkin-needed

Comment 31

2 years ago
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/68cdf62dd865
Get rid of pre-generated startup cache r=glandium
Keywords: checkin-needed

Comment 32

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/68cdf62dd865
Status: REOPENED → RESOLVED
Last Resolved: 2 years ago2 years ago
Resolution: --- → FIXED
Component: Build Config → General
Product: Firefox → Firefox Build System
Target Milestone: Firefox 55 → mozilla55
You need to log in before you can comment on or make changes to this bug.