Closed Bug 1551639 Opened 5 months ago Closed 4 months ago

Need a BuildID that updates even for local builds

Categories

(Firefox Build System :: General, enhancement)

enhancement
Not set

Tracking

(firefox69 fixed)

RESOLVED FIXED
mozilla69
Tracking Status
firefox69 --- fixed

People

(Reporter: tcampbell, Assigned: glandium)

References

(Blocks 2 open bugs)

Details

Attachments

(3 files)

Various SpiderMonkey caches must be invalidated on numerous changes to SpiderMonkey code (or build options like DEBUG). These systems currently use mozilla::GetBuildId which works in official builds but is not updated on local incremental compiles.

The SpiderMonkey cases don't particularly care about the format of the ID, simply that it changes on every build affecting XUL. Something like the hash of xul.dll would be an interesting ID if it were available.

This is currently wasting a lot of (non-SpiderMonkey) developers time when they hg pull and do an incremental build. Their caches become corrupt and a lot of time is spent debugging.

It seems like we need some string that changes even on incremental builds. I understand concerns that we don't want buildid to change on literally every incremental build b/c that means re-linking xul, but what if we had some new string (that our caches could use in place of the buildid string) that changed only when XUL changes (and is thus being relinked anyway)?

It sounds like this has already wasted a significant amount of developer time and we can expect this to continue as we do more aggressive/fancy caching throughout the browser. This problem is outside the scope of SpiderMonkey; Anthony, perhaps you can find someone who knows the build system to help us here?

Flags: needinfo?(ajones)
Flags: needinfo?(ajones) → needinfo?(erahm)
Blocks: 1506263

ajones thinks glandium might have thoughts on the best way to solve this.

Flags: needinfo?(mh+mozilla)
Flags: needinfo?(erahm)

Presumably, your spidermonkey caches need to be invalidated on some specific changes to spidermonkey. Why not "version" those at the code level?

Flags: needinfo?(tcampbell)
Flags: needinfo?(mh+mozilla)
Flags: needinfo?(luke)

In general, any change in SM can result in a change to what gets cached; it's not just modifications to a small set of serialization/deserialization routines. Historically there was an explicit "XDR version" that had to be bumped manually, but it had even more problems and so SM moved away from that to automatically invalidating on each new build. Also note that, while SM is the only current user of Necko's alt-data for caching, more components can/should use this feature in the future, so it would be valuable to have a solution that isn't custom to SM.

Flags: needinfo?(luke)

That's an important question. The issue spidermonkey is facing is that these caches contain memcpy'd compile artifacts and the "specific changes" have historically been hard to quantify. The rough approximation is that any changes in the front end (parser/bytecode emitter) and any changes in code that parser links to in the VM potentially require version bumping. The data format is internal, has no spec, and (indirectly) changes weekly. Historically it has been difficult to keep manual version bumping correct for this case, and we have since increased our surface area to include content JS (and thus preimage concerns).

Currently, the buildid changes in official builds are keeping things sane for end-users, but local developer builds are running into trouble when then they 'hg pull' and then continue an incremental build.

Some problems I was running into which lead me to asking for a more robust system:

  • Using a GENERATED_FILES dependency on the source files to SHA1 or otherwise generate a build ID required manually maintaining a second list of sources.
  • Having that GENERATED_FILES target use SOURCES/UNIFIED_SOURCES directly was leaving out header files.
  • Using the generated .a as an input dependency (which we successfully use for some adhoc static analysis) had trouble with the effective circular dependency when generating a version number to re-link into the library.
  • Having the generated .a as in input to the buildid.h worked for incremental builds, but the build system couldn't build spidermonkey on demand to create buildid.h in the first place.
  • Leveraging the .purgecaches trick requires tweaking Necko to toss all its caches on spidermonkey changes and they generally don't invalidate that cache because they rarely mess with the HTTP cache format. For reference, the way that Spidermonkey currently communicates this with Necko is to pass the BuildID (from toolkit) as a MIME type to necko alt-data stream requests.

Some imaginary (but probably not practical ideas) that would help:

  • Have the xul.dll link step recreate buildid.o as a pre-link step each time
  • Having the hash of xul.dll available to toolkit
  • Generate buildid on non-artifact builds since linking is faster than a few years ago
  • Having a .buildid file similar to .purgecaches for unofficial builds that is updated by each build
  • .. ?
Flags: needinfo?(tcampbell)

Tangentially, what happens if you run win32 Firefox, then win64 Firefox with the same buildid? (which happens if you switch from one to the other on the release channel)

Currently, the XDR data is independent of word size and always little-endian. I'm not a fan of maintaining this compatibility (and WASM for example has extra CPU feature tests for this stuff). There may be more justification for the examples like the one you gave of 32 vs 64 on the same host machine, but sharing profile cache data across different machine architectures seems better of just not allowing.

One area that I'm not sure about currently if browser pref changes might lead to trouble. Unrelated work to make parsing more of a pure function should help with this concern long term as parser context will better be involved in cache keying functions.

Speaking hypotheticals here, would the buildid updating on every build work for you? (Even when there are no changes, and even when there are no C++ changes)

(In reply to Mike Hommey [:glandium] from comment #8)

Speaking hypotheticals here, would the buildid updating on every build work for you? (Even when there are no changes, and even when there are no C++ changes)

That would be fine for our purposes (answering for Ted/Luke because time zones).

Keeping this on my radar.

Flags: needinfo?(mh+mozilla)
Assignee: nobody → mh+mozilla
Depends on: 1556662
Flags: needinfo?(mh+mozilla)

There is a big difference between generated source files that are built
directly, and those that are only included.

In the latter case, the build system won't know the files that does the
including depends on the generated source. So those sources do need to
be built during the export tier.

But in the former case, the build system has all the dependency
information it needs, and, while these generated sources are currently
built as part of the export tier, they don't actually need to. We're
going to change that, and in preparation, we rename included files
so as to be more clearly identified.

As established previously, generated source files that are not included
don't need to be built during the export tier.

(In reply to Mike Hommey [:glandium] from comment #8)

Speaking hypotheticals here, would the buildid updating on every build work for you? (Even when there are no changes, and even when there are no C++ changes)

There are dependencies from Android-land on the build ID; bumping it on every build might require some accommodations. For example, https://searchfox.org/mozilla-central/source/mobile/android/base/generate_build_config.py plumbs the build ID into Fennec's build configuration, and it's also in GeckoView's BuildConfig. Touching that will force the entire Java piece to recompile, and that happens very early, because the Java compilation produces the generated JNI wrappers and SDK bindings. Hopefully the existing code recognizes there are no chagnes to those wrappers and bindings and doesn't produce further native code compilation.

Whatever happens here we need to test the Android case thoroughly enough to be sure that we're not regressing the GeckoView workflow dramatically.

I'm going to sidestep the Android case by making this gated on MOZ_BUILD_APP not being mobile/android, and will file a separate bug to figure things out for Android.

Blocks: 1557215
Pushed by mh@glandium.org:
https://hg.mozilla.org/integration/autoland/rev/c529582f1250
Use .inc suffix for generated source files that are only included. r=nalexander
https://hg.mozilla.org/integration/autoland/rev/77a5c54083fa
Don't build generated source files during the export tier. r=nalexander
https://hg.mozilla.org/integration/autoland/rev/5b02001291e1
Always update buildid but avoid rebuilding libxul. r=nalexander

\o/ Thanks!

Regressions: 1564897
You need to log in before you can comment on or make changes to this bug.