FOG Artifact Build Support appears to not be working
Categories
(Toolkit :: Telemetry, defect, P1)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox113 | --- | fixed |
People
(Reporter: chutten, Assigned: chutten)
References
Details
Attachments
(2 files)
Simple things like adding a new event and then trying to use that new event in a module and in a test aren't working.
Investigate how far this goes (is it just xpcshell-test? Is it everything?) and get it fixed.
| Assignee | ||
Comment 1•2 years ago
|
||
Happens on ./mach run too, so it's not test-only.
| Assignee | ||
Comment 2•2 years ago
|
||
Refresher:
- The system responsible for this is called JOG
- It runs on the main thread on the first access via JS to any named property under the
GleanorGleanPingsglobal - It will only perform work if
- the build was built with
MOZILLA_OFFICIALnot set - the build
IsDevelopmentBuild - the file of metrics definitions for artifact build consumption is at
GreD/jogfile.json- and it is validly formatted
- and it contains the metric or ping you expect
- the build was built with
I have checked:
- The build isn't official (
AppConstants.MOZILLA_OFFICIALisfalse) - The build should be
IsDevelopmentBuild(There's noomni.jathat I can find) - The jogfile is in objdir/dist/bin (GreD (the appropriate position))
- It appears to be validly formatted (We have error-level logging for failure to parse the jogfile, and none of them log)
- It indeed contains the metric I expect.
The next thing I did was fake an artifact build by copying jogfile.json to objdir/dist/bin/. Alas, that actually works just fine (anything I change in jogfile.json appears just fine when I check about:glean).
Now to try out debug artifact builds so I can step through and see what's what.
| Assignee | ||
Comment 3•2 years ago
|
||
Despite having debug artifacts and symbols (thank you docs), without file and line, I can't step. I set a breakpoint on JOG::EnsureRuntimeMetricsRegistered and jog_load_jogfile and confirm that the former is definitely getting hit frequently (as expected, since it's on the NamedGetter for both the Glean and GleanPings globals).
Alas, jog_load_jogfile is never hit.
With the debug artifact build I can at least confirm none of the NS_WARN_IF cases are being hit inside EnsureRuntimeMetricsRegistered. This either means
- despite
AppConstants.MOZILLA_OFFICIALbeing false when I./mach runand check the runtime value,MOZILLA_OFFICIALmust've been defined during build meaning this ifdef gets hit and we early return. - despite there being no
omni.jato be found in the objdir,IsDevelopmentBuildis returningfalse.
I'm not sure how to narrow it down further, but it's the end of the week so I guess that's a question for Monday.
| Assignee | ||
Comment 4•2 years ago
|
||
FOG's various modules (including jog) haven't taken full advantage of MOZ_LOG.
Time to use a weird "artifact builds aren't working in FOG" problem as an
excuse to get it done.
Also, while I'm here, was getting weird debug-only crashes because of emplace
on a None. Not at all the behaviour I expected from mozilla::Maybe, but the
fixed syntax is even more expressive, so I'm all for it.
| Assignee | ||
Comment 5•2 years ago
|
||
All my efforts to debug and step through artifact builds have failed. My Linux machine has been having problems with finding sources and symbols for some time, but even following the Debugging on Windows docs gets me either a windbg or a Visual Studio who claims that firefox is running without a xul.dll, meaning that there's absolutely no hope that it'll be able to find my code to break on it.
So I'm taking a different tack. This patch adds logging that should, when enabled on an artifact build, tell me exactly where things are going awry. And that'll be the next step on the path to fixing this.
Comment 7•2 years ago
|
||
| bugherder | ||
| Assignee | ||
Comment 8•2 years ago
|
||
2023-03-08 16:56:54.538264 UTC - [Parent 522785: Main Thread]: V/jog MOZILLA_OFFICIAL build. No JOG for you.
From this line, this means these artifacts were build with MOZILLA_OFFICIAL.
ni?nalexander - I thought we could use MOZILLA_OFFICIAL as a compile-time check to ensure our (expensive main thread I/O using) artifact build support only ran definitely on artifact builds. Is it no longer the case that these artifacts are unofficially-built? (was it always the case and I somehow failed to notice?)
Comment 9•2 years ago
|
||
(In reply to Chris H-C :chutten from comment #8)
2023-03-08 16:56:54.538264 UTC - [Parent 522785: Main Thread]: V/jog MOZILLA_OFFICIAL build. No JOG for you.From this line, this means these artifacts were build with
MOZILLA_OFFICIAL.ni?nalexander - I thought we could use
MOZILLA_OFFICIALas a compile-time check to ensure our (expensive main thread I/O using) artifact build support only ran definitely on artifact builds. Is it no longer the case that these artifacts are unofficially-built? (was it always the case and I somehow failed to notice?)
Oh dear: I think this has always been broken. This specific conditional is in a compiled code artifact, i.e., in libxul.so. Artifact builds fetch compiled code artifacts from upstream official builds. So this conditional has always been in place; we just missed it the first time around.
I'm not aware of an existing way to "reach up" from the compiled code and ask the intepreted code if it's an artifact build, i.e., to query AppConstants.sys.mjs, which will have the correct value. Any of the mechanisms to do so (JS XPCOM components and services come to mind) would suffice. It would also work to have some early JS expose an environment variable or set a native-code flag with this value.
:glandium: am I missing some existing mechanism or pattern?
| Assignee | ||
Comment 10•2 years ago
|
||
If we're moving from compile-time-knowable to only run-time-knowable, I'm already asking IsDevelopmentBuild() (which checks for an omnijar). That looks cheap enough we could move it up to MOZILLA_OFFICIAL's place, though honestly the Maybe<bool> sFoundAndLoadedJogfile; is probably pretty quick, too.
So what I'm really interested in is if there's existing compile-time-knowable mechanisms or patterns.
...and if IsDevelopmentBuild() is fit for purpose.
...and if this might cause any problems with artifact builds on try.
Comment 11•2 years ago
|
||
(In reply to Chris H-C :chutten from comment #10)
If we're moving from compile-time-knowable to only run-time-knowable, I'm already asking
IsDevelopmentBuild()(which checks for an omnijar). That looks cheap enough we could move it up toMOZILLA_OFFICIAL's place, though honestly theMaybe<bool> sFoundAndLoadedJogfile;is probably pretty quick, too.
It's possible to have packaged artifact builds, so this isn't quite what you want. More below.
So what I'm really interested in is if there's existing compile-time-knowable mechanisms or patterns.
No, by definition there is not: the same compiled code artifacts are used in two different contexts -- a full build and an artifact build. Same libxul.so under the hood.
...and if
IsDevelopmentBuild()is fit for purpose.
I think not.
...and if this might cause any problems with artifact builds on
try.
Yes, it will: Ba artifact builds in automation are packaged like any other builds.
Packaged (== has an omnijar) is not the indicator you are looking for. We've resisted this type of indicator at various times because it's tricky to get right and other proxy indicators have sufficed. Let's wait for :glandium to opine.
Comment 12•2 years ago
|
||
Oh boy, IsDevelopmentBuild is such a misleading name...
Can someone give more context as to what the goal is here, because comment 2 doesn't help me much, unfortunately.
| Assignee | ||
Comment 13•2 years ago
•
|
||
Certainly. FOG and Glean, like Firefox Desktop Telemetry before them, depend on codegen to work: we use it to generate APIs and structures and constants and stuff to support instrumentation.
However, not all our developers compile as part of their desired workflow, preferring artifact builds. So FOG, wanting to still support these developers' instrumentation needs like Firefox Desktop Telemetry before it, needs to do some (expensive) work to support new or changed JS instrumentation that is registered not at compile-time, but at run-time. The subsystem that does this work for FOG is called Runtime Metric Definition Subsystem JOG.
We want to only do this expensive work in cases where it's necessary. So what we're in the market for is some means to know, preferably at compile time and in C++, that a given build is or isn't A Developer's Build in Artifact Mode (or CI's artifact builds, which have similar needs). We never want to run the expensive work on a released build.
Is that sufficient context?
Comment 14•2 years ago
|
||
I'm afraid there is no solution in sight for this. The artifacts that are downloaded by developer builds in artifact mode are shippable builds. Whatever you'd add on the C++ side to do what you want would be shipped as well.
Comment 15•2 years ago
|
||
If you expose something like MOZ_ARTIFACT_BUILDS to js, then the js side could trigger whatever it needs if the necessary (opt-in) entry point is made available in all builds.
| Assignee | ||
Comment 16•2 years ago
•
|
||
Alas, there is no JS side to FOG (at least not yet). Due to (IIRC) needing to have a wildcard named getter the Glean and GleanPings globals are implemented in C++. I'm not sure if pulling JS-known information like AppConstants can be done and is advisable to do from C++?
We do have some XPCOM in nsIFOG... I could try putting the MOZ_ARTIFACT_BUILDS knowledge in a JS-implemented method on the FOG service?
Speaking of MOZ_ARTIFACT_BUILDS, I couldn't find docs on it but I gather it's the build-time define we should be using instead of COMPILE_ENVIRONMENT to identify we're in the "build" step of an artifact build? If so, I think I should file a follow-up to move away from COMPILE_ENVIRONMENT.
Comment 17•2 years ago
|
||
MOZ_ARTIFACT_BUILDS is a superset of COMPILE_ENVIRONMENT, and they are not equivalent. But for identifying artifact builds, yes, you should use the former rather than the latter.
| Assignee | ||
Comment 18•2 years ago
|
||
FOG needs to trigger JOG (artifact build support for runtime-defined metrics.
Allows changes and additions to metrics.yamls to be usable from JS without a
compile step) when running an Artifact Build.
So we take MOZ_ARTIFACT_BUILDS and pop it into a pref for FOG's use in C++.
Comment 19•2 years ago
|
||
Comment 20•2 years ago
|
||
| bugherder | ||
| Assignee | ||
Comment 21•2 years ago
|
||
(Whoops, sorry about leaving this left open. In penance I just downloaded the latest tree and confirmed it's all working.)
Updated•2 years ago
|
Updated•2 years ago
|
Description
•