Closed Bug 1277553 Opened 4 years ago Closed 4 years ago

Android builds are going to permafail when Gecko 49 merges to Beta (Must specify --with-adjust-sdk-keyfile when MOZ_INSTALL_TRACKING is defined!)

Categories

(Firefox Build System :: Android Studio and Gradle Integration, defect, critical)

Unspecified
Android
defect
Not set
critical

Tracking

(firefox48 wontfix, firefox49 fixed, firefox50 fixed)

RESOLVED FIXED
mozilla50
Tracking Status
firefox48 --- wontfix
firefox49 --- fixed
firefox50 --- fixed

People

(Reporter: RyanVM, Assigned: mcomella)

References

Details

Attachments

(1 file)

Nick, this looks like fallout from bug 1248066, but I'm wondering if it's a real issue or just an issue with the simulation patches. I can certainly adjust them as needed, just let me know :)

https://treeherder.mozilla.org/logviewer.html#?job_id=21773159&repo=try#L1918

20:08:37     INFO -  configure: error: Must specify --with-adjust-sdk-keyfile when MOZ_INSTALL_TRACKING is defined!
20:08:37     INFO -  DEBUG: <truncated - see config.log for full output>
20:08:37     INFO -  DEBUG: configure:16571: checking for play-services-base AAR
20:08:37     INFO -  DEBUG: configure:16586: checking for ANDROID_PLAY_SERVICES_BASE_AAR_LIB
20:08:37     INFO -  DEBUG: configure:16598: checking for ANDROID_PLAY_SERVICES_BASE_AAR_RES
20:08:37     INFO -  DEBUG: configure:16627: checking for play-services-basement AAR
20:08:37     INFO -  DEBUG: configure:16642: checking for ANDROID_PLAY_SERVICES_BASEMENT_AAR_LIB
20:08:37     INFO -  DEBUG: configure:16654: checking for ANDROID_PLAY_SERVICES_BASEMENT_AAR_RES
20:08:37     INFO -  DEBUG: configure:16683: checking for play-services-cast AAR
20:08:37     INFO -  DEBUG: configure:16698: checking for ANDROID_PLAY_SERVICES_CAST_AAR_LIB
20:08:37     INFO -  DEBUG: configure:16710: checking for ANDROID_PLAY_SERVICES_CAST_AAR_RES
20:08:37     INFO -  DEBUG: configure:16739: checking for mediarouter-v7 AAR
20:08:37     INFO -  DEBUG: configure:16754: checking for ANDROID_MEDIAROUTER_V7_AAR_LIB
20:08:37     INFO -  DEBUG: configure:16766: checking for ANDROID_MEDIAROUTER_V7_AAR_RES
20:08:37     INFO -  DEBUG: configure:16778: checking for ANDROID_MEDIAROUTER_V7_AAR_INTERNAL_LIB
20:08:37     INFO -  DEBUG: configure:17037: checking for play-services-ads AAR
20:08:37     INFO -  DEBUG: configure:17052: checking for ANDROID_PLAY_SERVICES_ADS_AAR_LIB
20:08:37     INFO -  DEBUG: configure:17064: checking for ANDROID_PLAY_SERVICES_ADS_AAR_RES
20:08:37     INFO -  DEBUG: configure:17093: checking for play-services-basement AAR
20:08:37     INFO -  DEBUG: configure:17108: checking for ANDROID_PLAY_SERVICES_BASEMENT_AAR_LIB
20:08:37     INFO -  DEBUG: configure:17120: checking for ANDROID_PLAY_SERVICES_BASEMENT_AAR_RES
20:08:37     INFO -  DEBUG: configure: error: Must specify --with-adjust-sdk-keyfile when MOZ_INSTALL_TRACKING is defined!
20:08:37     INFO -  ERROR: old-configure failed
20:08:37     INFO -  *** Fix above errors and then restart with\
20:08:37     INFO -                 "/usr/bin/gmake -f client.mk build"
20:08:37     INFO -  gmake[2]: *** [configure] Error 1
20:08:37     INFO -  gmake[2]: Leaving directory `/builds/slave/try-and-api-15-000000000000000/build/src'
20:08:37     INFO -  gmake[1]: *** [/builds/slave/try-and-api-15-000000000000000/build/src/obj-firefox/Makefile] Error 2
20:08:37     INFO -  gmake[1]: Leaving directory `/builds/slave/try-and-api-15-000000000000000/build/src'
20:08:37     INFO -  gmake: *** [build] Error 2
20:08:37     INFO -  0 compiler warnings present.
20:08:37     INFO -  WARNING: Command failed. See logs for output.
20:08:37     INFO -   # umount -n /builds/mock_mozilla/mozilla-centos6-x86_64-android/root/dev/shm
20:08:37     INFO -  WARNING: Command failed. See logs for output.
20:08:37     INFO -   # umount -n /builds/mock_mozilla/mozilla-centos6-x86_64-android/root/builds/slave
20:08:37     INFO -  WARNING: Forcibly unmounting '/builds/mock_mozilla/mozilla-centos6-x86_64-android/root/dev/shm' from chroot.
20:08:37     INFO -  WARNING: Forcibly unmounting '/builds/mock_mozilla/mozilla-centos6-x86_64-android/root/builds/slave' from chroot.
Flags: needinfo?(nalexander)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #0)
> Nick, this looks like fallout from bug 1248066, but I'm wondering if it's a
> real issue or just an issue with the simulation patches. I can certainly
> adjust them as needed, just let me know :)

I think this is an issue with the simulation patches.  We choose add an SDK keyfile at https://dxr.mozilla.org/mozilla-central/source/mobile/android/config/mozconfigs/common#55, which interrogates MOZ_UPDATE_CHANNEL.  In the logs, I see

20:08:10     INFO -    --enable-update-channel=

which suggests we don't have an update channel.  So we're setting MOZ_INSTALL_TRACKING in https://dxr.mozilla.org/mozilla-central/source/mobile/android/confvars.sh#56, but that test is different than the update channel with the simulation patches.

I'm going to redirect to mcomella to try to unify these.  Mike, perhaps we should check MOZ_INSTALL_TRACKING in the mozconfig, default to beta, and only use release if the MOZ_UPDATE_CHANNEL is "release"?  That would fallback to beta in this situation.

Suggestions welcome.
Flags: needinfo?(nalexander) → needinfo?(michael.l.comella)
Assignee: nobody → michael.l.comella
(In reply to Nick Alexander :nalexander from comment #1)
> I'm going to redirect to mcomella to try to unify these.  Mike, perhaps we
> should check MOZ_INSTALL_TRACKING in the mozconfig, default to beta, and
> only use release if the MOZ_UPDATE_CHANNEL is "release"?  That would
> fallback to beta in this situation.

I initially thought of removing the assertion but I like that the assertion gives us some compile-time safety that the keyfile exists on the builders.

I agree with Nick's approach, except I'd actually suggest adding a branch so it is:

* Release
* Beta
* (default) the sandbox key

I never found the sandbox to be broken (though it was delayed and thus less useful) so in bug 1277407, I was thinking of adding the sandbox key to the tree to make local development easier. We could reference that key in the patch for this bug (assuming we can properly access files from the source in the mozconfig – --with-adjust-sdk-keyfile appears to only use absolute paths in my local attempts).

If we can't easily access this sandbox key from the tree, we can either:
  a) upload it to the builders, like the release & beta tokens
  b) Do as Nick initially suggested
Flags: needinfo?(michael.l.comella)
Code relies on patch in bug 1277407.
Depends on: 1277407
I didn't include nalexander's MOZ_INSTALL_TRACKING suggestion because my make
skills are too shaky to make this worthwhile. Specifying a keyfile when
MOZ_INSTALL_TRACKING is disabled isn't harmful afaik (though it's a little
spammy).

Also, added code comment duplicated for emphasis:

# (bug 1277553) In Aurora -> Beta simulation builds, no update channel is
# specified, causing an assertion to throw that MOZ_INSTALL_TRACKING is
# specified but the keyfile is not. In this case, we add a default keyfile.
# This has the disadvantage that if our beta/release checks above ever
# fail, we'll come to this default case and the compile-time check to
# specify a valid keyfile will be broken. I don't have any better
# alternatives.

Review commit: https://reviewboard.mozilla.org/r/57730/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/57730/
Attachment #8759901 - Flags: review?(s.kaspari)
Ryan, if you apply the changesets from bug 1277407 and then apply the ones in this bug, I believe your simulation should pass.

I'll push to try for sanity.
Flags: needinfo?(ryanvm)
(In reply to Michael Comella (:mcomella) from comment #5)
> Ryan, if you apply the changesets from bug 1277407 and then apply the ones
> in this bug, I believe your simulation should pass.

If you pull the changesets in this bug via reviewboard, they should actually already be rebased on the patches from bug 1277407.
Attachment #8759901 - Flags: review?(s.kaspari) → review+
Comment on attachment 8759901 [details]
Bug 1277553 - Specify adjust sandbox token if not beta or release build.

https://reviewboard.mozilla.org/r/57730/#review54698
Ryan, I assume we'd want to uplift these changes to 48 so our release simulation doesn't fail too?
If the risk is low, I don't see why not.
Flags: needinfo?(ryanvm)
https://hg.mozilla.org/mozilla-central/rev/557f6bc6b70b
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 50
Comment on attachment 8759901 [details]
Bug 1277553 - Specify adjust sandbox token if not beta or release build.

This should be uplifted with bug 1277407.

Approval Request Comment
[Feature/regressing bug #]: bug 1274956.
[User impact if declined]: RyanVM's 48 beta -> 47 release simulation will permafail, as well as his 49 aurora -> 48 beta
[Describe test coverage new/current, TreeHerder]: Tested locally.
[Risks and why]: Low. In the default case, when building in automation, we specify the default keyfile from the last bug 1277407 so that an assertion is not thrown for a missing keyfile. This keeps Ryan's builds happy (which don't have a keyfile specified) and should not affect the actual release builds (which have an explicit keyfile specified).
[String/UUID change made/needed]: None
Attachment #8759901 - Flags: approval-mozilla-beta?
Attachment #8759901 - Flags: approval-mozilla-aurora?
Comment on attachment 8759901 [details]
Bug 1277553 - Specify adjust sandbox token if not beta or release build.

Fix for build issues, please uplift to beta and aurora.
Attachment #8759901 - Flags: approval-mozilla-beta?
Attachment #8759901 - Flags: approval-mozilla-beta+
Attachment #8759901 - Flags: approval-mozilla-aurora?
Attachment #8759901 - Flags: approval-mozilla-aurora+
Remember bug 1269981 / bug 1269403, which just magically went away after a month of crashing after being caused by a patch which couldn't have caused it unless it was actually the result of something like reading uninitialized memory that that patch happened to put something into?

Yeah, that's now your problem.

Backed out this and bug 1277407 in https://hg.mozilla.org/releases/mozilla-beta/rev/3c1c9ff0e307
(In reply to Phil Ringnalda (:philor) from comment #17)
> Remember bug 1269981 / bug 1269403, which just magically went away after a
> month of crashing after being caused by a patch which couldn't have caused
> it unless it was actually the result of something like reading uninitialized
> memory that that patch happened to put something into?
> 
> Yeah, that's now your problem.
> 
> Backed out this and bug 1277407 in
> https://hg.mozilla.org/releases/mozilla-beta/rev/3c1c9ff0e307

michael, see above could you take a look, thanks
Flags: needinfo?(michael.l.comella)
The patches in bug 1277407 should have no affect – we change documentation and add a text file unused by production builds.

In this bug, we specify build flags to add an sdk keyfile if it's not a release or beta build [1] and it's a production build. While it's unexpected, perhaps there are some side effects to adding the adjust keyfile, e.g. the flags are not set up properly and we're accidentally compiling adjust in on all production builds (rather than just release/beta).

[1]: More accurately, if it's not a release or beta build AND it's not one of RyanVM's beta/release simulation builds and it's a production build
(In reply to Michael Comella (:mcomella) from comment #19)
> In this bug, we specify build flags to add an sdk keyfile if it's not a
> release or beta build [1] and it's a production build. While it's
> unexpected, perhaps there are some side effects to adding the adjust
> keyfile, e.g. the flags are not set up properly and we're accidentally
> compiling adjust in on all production builds (rather than just release/beta).

But actually, I'd expect the try jobs to have the release channel set so this patch should not affect beta. Let's make some try pushes to see if we can reproduce this.
comment 17 doesn't mention which test suites are failing and the listed bugs seem to be unrelated, but after looking through the job output [1], it looks like R16 consistently fails after my patch and no longer does after my patch. Example failure output: [2]

One thought: we should verify the beta token is being used in both try & the landed changeset builds, rather than the sandbox token.

[1]: https://treeherder.mozilla.org/#/jobs?repo=mozilla-beta&fromchange=da4a9eed33bdbc5f5bd1002cc07f998a1eb037fa&filter-tier=1&filter-tier=2&filter-tier=3&exclusion_profile=false
[2]: https://treeherder.mozilla.org/logviewer.html#?job_id=1195820&repo=mozilla-beta
Side note: it appears we are making adjust pings from automation since bug 1275669 hasn't landed on beta (48) yet.
Sample R16 failure: https://treeherder.mozilla.org/#/jobs?repo=mozilla-beta&revision=2c5721ac42d71b2359320646afbdc7a919a60fd3

Relevant logcat output:
06-23 09:41:07.356  1058  1078 I Gecko   : {"action":"test_start","time":1466700067355,"thread":null,"pid":null,"source":"reftest","test":["http://10.0.2.2:8854/tests/editor/reftests/824080-6.html","==","http://10.0.2.2:8854/tests/editor/reftests/824080-6-ref.html"]}
06-23 09:41:07.366  1058  1078 I Gecko   : REFTEST TEST-LOAD | http://10.0.2.2:8854/tests/editor/reftests/824080-6.html | 785 / 912 (86%)
06-23 09:41:07.377  1058  1078 I Gecko   : 
06-23 09:41:07.377  1058  1078 I Gecko   : {"action":"log","time":1466700067376,"thread":null,"pid":null,"source":"reftest","level":"DEBUG","message":"START http://10.0.2.2:8854/tests/editor/reftests/824080-6.html"}
06-23 09:41:07.796  1058  1058 D Telemetry: SendUIEvent: event = show.1 method = content timestamp = 3913154
06-23 09:41:08.086  1058  1058 W google-breakpad: ExceptionHandler::GenerateDump cloned child 
06-23 09:41:08.086  1058  1058 W google-breakpad: 2949½y‡@˜M[¸º¾8º¾ÿÿÿÿtM[èZœA€â¤R
06-23 09:41:08.086  1058  1058 W google-breakpad: 
06-23 09:41:08.086  1058  1058 W google-breakpad: ExceptionHandler::SendContinueSignalToChild sent continue signal to child
06-23 09:41:08.126  2949  2949 W google-breakpad: ExceptionHandler::WaitForContinueSignal waiting for continue signal...
06-23 09:41:10.446    38    38 D Zygote  : Process 1058 terminated by signal (11)
---

There appears to be a SIGSEGV in there, looking at the rest of the log. There doesn't appear to be any useful info so I'll have to look at the test directly.
Looking at the logs, the failures do not consistently happen for reftest/824080-6 – they're crashes in random tests in the R16 suite. I don't know if there's anything special about R16 vs. R1-R15, but that seems suspicious to me.
try in comment 21 with all patches repro'd the R16 crash.

try here removes the patch in this bug & only includes the dependent bug's patches: https://treeherder.mozilla.org/#/jobs?repo=try&revision=51f691b06f59
try in comment 27 didn't fail – the patch in this bug does, in fact, cause R16 (and R6) to crash, seemingly inconsistently.

Perhaps we don't actually have a release channel on these builds & we're adding a token, and thus compiling in adjust, with the addition of this patch. I'm confused why this wouldn't cause issues on aurora & nightly, however.
Okay, I found this in the configure output on one of the failures:
  --with-adjust-sdk-keyfile=/builds/slave/m-beta-and-api-15-000000000000/build/mobile/android/base/adjust-sdk-sandbox.token

It seems we are unexpectedly using the sandbox token despite being on the beta tree.

Concern: are we using the beta token in released beta builds (my previous dashboard verification seemed to indicate so)?

Theory: bug 1275669 prevents us from uploading Adjust pings in automation. It's only on 49, which doesn't have these intermittents, but here we are on 48. Since we're in the "else" case for these builds, before this patch an adjust key was never included and Adjust would not be compiled in at all, making no pings from automation. However, now we have an adjust key & compile it in so perhaps the system kills us (for whatever reason) when Adjust tries to access the network (which explains why it fails in arbitrary tests).

That being said, the uplift was only to help out RyanVM's release simulation, but if we're having mysterious errors, it's probably not worth investigating this further, assuming my concern is invalid. Ideally we'd run the same builds in automation as we do for the release – I filed bug 1283318.
Flags: needinfo?(michael.l.comella)
(In reply to Michael Comella (:mcomella) from comment #29)
> Concern: are we using the beta token in released beta builds (my previous
> dashboard verification seemed to indicate so)?

I double-checked the dashboard and it does appear that we're still getting beta installs. It's possible that people are installing old builds but I doubt that's the case in such large numbers.

> Ideally we'd run the same builds in automation as we do for the release – I filed bug 1283318.

Apparently, desktop does "release promotion", which means the release builds are taken from CI so if mobile gets moved to this system, we're going to break – bug 1283318 just became a lot more important.
After investigation in bug 1283318, I think the problem is actually that MOZ_UPDATE_CHANNEL is not specified for *all* CI builds (as opposed to release builds) and that RyanVM's patches are not special. I added a comment to the build files to this effect in that bug.
See Also: → 1292580
Product: Firefox for Android → Firefox Build System
Target Milestone: Firefox 50 → mozilla50
You need to log in before you can comment on or make changes to this bug.