Closed
Bug 1351071
Opened 8 years ago
Closed 8 years ago
Disable startup cache generation
Categories
(Firefox Build System :: General, enhancement)
Firefox Build System
General
Tracking
(firefox55 fixed)
RESOLVED
FIXED
mozilla55
Tracking | Status | |
---|---|---|
firefox55 | --- | fixed |
People
(Reporter: catlee, Assigned: catlee)
References
(Blocks 1 open bug)
Details
Attachments
(1 file)
Jeff and I were looking at why the 52.0 -> 52.0.1 partial updates where 11MB large (for win64 en-US) where very little code had changed between the builds.
One of the largest contributors to the partial updates here was the omni.ja diff. We know that diffing omni.ja can be inefficient (bug 772868) because you're diffing compressed contents.
The startup cache files in jsloader/ are particularly bad though. If you unpack omni.ja for 52.0 and 52.0.1, and run bsdiff on each pair of the raw files and compare that to bsdiff of deflated files, you can calculate a ratio of patch size of deflated content vs patch size of raw content.
The jsloader files have patch sizes of up to 200x larger than patches of the raw files.
Looking closely at differences in these files, we can see that in the case of 52.0 -> 52.0.1, the bulk of these files are unchanged. However, the buildid of the build is encoded in the first 18 bytes of the file. Because some initial bytes are different, all of the following compressed output will also be different.
By resetting the compression stream after processing the initial 18 bytes containing the buildid, we can ensure that the bulk of the compressed contents are the same.
Doing this reduces the size of the win64 partial update from 52.0 to 52.0.1 from 11MB to 8MB.
Comment hidden (mozreview-request) |
Assignee | ||
Updated•8 years ago
|
Attachment #8851767 -
Flags: review?(mh+mozilla)
Assignee | ||
Comment 2•8 years ago
|
||
I pushed this to try here:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6fc736f8e88fcdee756f331cbdb5e91e1ff7a41e
It doesn't seem to have Talos impact (although all the runs haven't finished yet):
https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=7d3e8986a676&newProject=try&newRevision=6fc736f8e88fcdee756f331cbdb5e91e1ff7a41e&framework=1
Comment 3•8 years ago
|
||
Can't we just kill the startupcache already? There's been talk about that for a long while, and all we keep doing is trying to work around it here and there instead of just doing it.
If we don't remove the startup cache, then ISTM we should instead try to remove the buildid from those files, instead of doing horrible contorsions at the packaging level.
Assignee | ||
Comment 4•8 years ago
|
||
I'd love do.
Doing that results in huge startup IO regressions on Talos though. Who can make the call of whether we are willing to take the hit?
Comment 5•8 years ago
|
||
(In reply to Chris AtLee [:catlee] from comment #4)
> I'd love do.
>
> Doing that results in huge startup IO regressions on Talos though.
That doesn't make much sense. Small startup IO regressions might be expected, not huge ones.
Assignee | ||
Comment 6•8 years ago
|
||
Weird, the first time I tested I got huge tp5n nonmain_startup_fileio opt on win32. Looks like that could have been bug 1274018.
I did a new try push https://treeherder.mozilla.org/#/jobs?repo=try&revision=e2a4155360d36010c95b9bcc7fee731430ee4de7 to completely get rid of the startup cache.
I also see the nonmain_startup_fileio regression. https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=9577ddeaafd8&newProject=try&newRevision=e2a4155360d36010c95b9bcc7fee731430ee4de7&framework=1
Assignee | ||
Comment 7•8 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #3)
> If we don't remove the startup cache, then ISTM we should instead try to
> remove the buildid from those files, instead of doing horrible contorsions
> at the packaging level.
The buildid is encoded as part of a version check operation, presumably to ensure that the bytecode is only used by the correct version of the JS VM.
Assignee | ||
Comment 8•8 years ago
|
||
Try push here to measure our baseline performance:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9a67ebd76fcb267fb8a12fd5362d8f768e19976a
Comment 9•8 years ago
|
||
(In reply to Chris AtLee [:catlee] from comment #7)
> (In reply to Mike Hommey [:glandium] from comment #3)
> > If we don't remove the startup cache, then ISTM we should instead try to
> > remove the buildid from those files, instead of doing horrible contorsions
> > at the packaging level.
>
> The buildid is encoded as part of a version check operation, presumably to
> ensure that the bytecode is only used by the correct version of the JS VM.
... and that's only really necessary for the startup cache in the user profile, not in the files shipped in omni.ja.
Assignee | ||
Comment 10•8 years ago
|
||
Can you get the JS VM to stop emitting these bytecodes?
Comment 11•8 years ago
|
||
Nothing is impossible. But how about we only start thinking about this if we can't disable startupcache?
Assignee | ||
Comment 12•8 years ago
|
||
No significant regressions detected:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=e2a4155360d3&newProject=try&newRevision=9a67ebd76fcb267fb8a12fd5362d8f768e19976a&framework=1&showOnlyImportant=0
I think we should disable this on Nightly for all platforms and then look at Telemetry data.
Assignee | ||
Comment 13•8 years ago
|
||
Comment hidden (mozreview-request) |
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → catlee
Summary: Specialize compression for startup cache files inside omni.ja → Disable startup cache generation
Comment 15•8 years ago
|
||
mozreview-review |
Comment on attachment 8851767 [details]
Bug 1351071: Get rid of pre-generated startup cache
https://reviewboard.mozilla.org/r/123994/#review128806
::: commit-message-9577d:1
(Diff revision 2)
> +Bug 1351071: Get rid of startup cache r=glandium
We're not actually removing the startup cache, so let's go with something like "Get rid of pre-generated startup cache"
::: python/mozbuild/mozpack/test/test_packager_formats.py:411
(Diff revision 2)
> - self.assertTrue(
> + self.assertFalse(
> is_resource(base, 'jsloader/resource/gre/modules/foo.jsm'))
You should just remove this assert completely.
Attachment #8851767 -
Flags: review?(mh+mozilla) → review+
Comment hidden (mozreview-request) |
Comment 17•8 years ago
|
||
Pushed by catlee@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0f320509eddb
Get rid of pre-generated startup cache r=glandium
Comment 18•8 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox55:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 55
Comment 19•8 years ago
|
||
:catlee Could you confirm this also reduces time for the build process? There seem to be improvements here also: https://treeherder.mozilla.org/perf.html#/alerts?id=5846
Flags: needinfo?(catlee)
Assignee | ||
Comment 20•8 years ago
|
||
Yeah, I'm not surprised it slightly improves build times.
Flags: needinfo?(catlee)
Comment 21•8 years ago
|
||
We're trying to find the root cause of a 1.7ms in the 95th percentile input event responsiveness telemetry probe from the parent process in bug 1362094 and this patch is one of the three possible culprits that we currently suspect as the possible root causes.
Chris, do you think this could be the possible cause of bug 1362094? Would you be OK with a temporary backout of this bug to see if the regression is reclaimed on telemetry?
Blocks: 1362094
Flags: needinfo?(catlee)
Assignee | ||
Comment 22•8 years ago
|
||
AIUI not providing a startup cache means that we will pay a one-time cost per version of Firefox where we generate the precompiled versions of these files locally. I would have expected that to show up in startup times rather than input event responsiveness, but anything is possible.
Please feel free to backout to verify via telemetry.
A few notes:
* We were planning on disabling this for the cross-compiled OSX builds anyway (bug 1314154)
* If we do back this out permanently, I would like to revisit my idea for customizing the compression of these files to get more efficient partial updates (comment #0 on this bug).
Flags: needinfo?(catlee)
Comment 23•8 years ago
|
||
It seems unlikely that this could cause input event responsiveness delay. Startup cache is a one-time thing, that this bug moves from build-time to run-time. I don't see how this one-time thing could have an impact on recurring things.
Comment 24•8 years ago
|
||
It's true that generation of the startup cache is a one-time thing, but its presence affects things by quite a bit because of conditions like this: <https://searchfox.org/mozilla-central/rev/224cc663d54085994a4871ef464b7662e0721e83/js/xpconnect/loader/mozJSComponentLoader.cpp#686> which makes us to almost no work when the startup cache is present and you call Cu.import() which our chrome code does a lot. So if hypothetically the generation of the startup cache fails on our users machines for whatever reason at startup, then they will always be taking the slow path here, and that may be a theory that could explain the slowness we have been measuring.
Chris, if this makes sense, I'd appreciate if you can back out at your convenience. Thanks!
Comment 25•8 years ago
|
||
How would it fail on users machines?
Comment 26•8 years ago
|
||
Also note there's two level of cache, in-memory and on-disk. And the on-disk cache has two levels on its own, one of which was removed in this bug.
Comment 27•8 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #25)
> How would it fail on users machines?
I don't know.
Comment 28•8 years ago
|
||
Updated•8 years ago
|
Status: RESOLVED → REOPENED
Flags: needinfo?(catlee)
Resolution: FIXED → ---
Comment 29•8 years ago
|
||
(In reply to :Ehsan Akhgari (super long backlog, slow to respond) from comment #24)
> It's true that generation of the startup cache is a one-time thing, but its
> presence affects things by quite a bit because of conditions like this:
> <https://searchfox.org/mozilla-central/rev/
> 224cc663d54085994a4871ef464b7662e0721e83/js/xpconnect/loader/
> mozJSComponentLoader.cpp#686> which makes us to almost no work when the
> startup cache is present and you call Cu.import() which our chrome code does
> a lot. So if hypothetically the generation of the startup cache fails on
> our users machines for whatever reason at startup, then they will always be
> taking the slow path here, and that may be a theory that could explain the
> slowness we have been measuring.
FTR, as bug 1363384 comment 1 shows, if this hypothetical happens, apparently we can end up doing main thread IO. :-(
Assignee | ||
Updated•8 years ago
|
Flags: needinfo?(catlee)
Comment 30•8 years ago
|
||
I don't think this backout has shown any change in the telemetry data, we should reland this patch...
Keywords: checkin-needed
Comment 31•8 years ago
|
||
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/68cdf62dd865
Get rid of pre-generated startup cache r=glandium
Keywords: checkin-needed
Comment 32•8 years ago
|
||
bugherder |
Status: REOPENED → RESOLVED
Closed: 8 years ago → 8 years ago
Resolution: --- → FIXED
Updated•6 years ago
|
Component: Build Config → General
Product: Firefox → Firefox Build System
Updated•6 years ago
|
Target Milestone: Firefox 55 → mozilla55
You need to log in
before you can comment on or make changes to this bug.
Description
•