Closed Bug 1337861 Opened 7 years ago Closed 7 years ago

[Fennec-Relpro] Enforce the presence of $MOZ_BUILD_DATE

Categories

(Release Engineering :: Release Automation: Other, defect, P2)

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: Sylvestre, Assigned: jlorenzo)

References

Details

Attachments

(1 file)

This caused bug 1337290
Severity: normal → major
Priority: -- → P2
Blocks: 1338477
I discussed with :catlee and :aki. With TC, we should have a simple way to force a buildId to be the same after a re-trigger.

I even have some good news! This mechanism is already in place. For instance:
1. I made a copy of this aurora nightly job [1] to [2]. 
2. In both tasks, I downloaded both en-US and multi APKs.
3. I extracted the version code with ApkTool[3]. Each pair of APK had the same version code.

I suspected version codes were generated via the env var called MOZ_BUILD_DATE. I made some other copies of [1] where:
* MOZ_BUILD_DATE pointed to an earlier date (a few hours difference)[4] => The version code got smaller than the one in [1]
* no MOZ_BUILD_DATE was provided[5] => The version code got defined on the fly with a higher version code than [1]

Hence, like :catlee suggested, we should enforce the presence of MOZ_BUILD_DATE in the payload. Once we have that, we won't be able to respin an arm APK, where the version code is gonna get increased, and be higher than the x86 one.

[1] https://tools.taskcluster.net/task-inspector/#FrcYk1__RJq4qypl0nAMng
[2] https://tools.taskcluster.net/task-inspector/#KhkcgCXmTn2VL6nhJYeyQw
[3] https://ibotpeaches.github.io/Apktool/
[4] https://tools.taskcluster.net/task-inspector/#GXp3rBiGQHqfHCu7cTPuIg
[5] https://tools.taskcluster.net/task-inspector/#DVWxJNyEQYW8HeQ0qhzDsQ
Summary: Make sure that fennec x86 versioncode > arm version code → [Fennec-Relpro] Enforce the presence of $MOZ_BUILD_DATE
I tested the patch against the date branch. [1] is a task with MOZ_BUILD_DATE specified and [2] is the same task, without.

[1] https://tools.taskcluster.net/task-group-inspector/#/Eo3n7GYSQxymcrBVd9gMLw/KCAW2S8fSKWqosIU7nidHw?_k=73k4u4
[2] https://tools.taskcluster.net/task-inspector/#Zunf03oOTtuP33mKYE7a0A/0
Assignee: nobody → jlorenzo
Comment on attachment 8840523 [details]
Bug 1337861 - [Fennec-Relpro] Enforce the presence of $MOZ_BUILD_DATE

https://reviewboard.mozilla.org/r/115010/#review117220

::: testing/mozharness/mozharness/mozilla/building/buildbase.py:808
(Diff revision 1)
>                  # MOZ_BUILD_DATE into mozharness as an environment variable, only
>                  # to have it pass the same value out with the same name.
> -                buildid = os.environ.get('MOZ_BUILD_DATE')
> +                try:
> +                    buildid = os.environ['MOZ_BUILD_DATE']
> +                except KeyError:
> +                    return self.fatal(

iirc, self.fatal doesn't return anything. It calls sys.exit
Attachment #8840523 - Flags: review?(jlund) → review+
Pushed by jlorenzo@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3816b7963bc8
[Fennec-Relpro] Enforce the presence of $MOZ_BUILD_DATE r=jlund a=release
This patch needs to make up to beta, in order to protect next Fennec beta releases.
Keywords: leave-open
Try broke because of this patch[1]. I asked Tomcat to revert patch in comment 8.

[1] https://treeherder.mozilla.org/#/jobs?repo=try&selectedJob=83004706
Backout by cbook@mozilla.com:
https://hg.mozilla.org/mozilla-central/rev/92c5b7bcd598
Backed out changeset 3816b7963bc8 on request from jlorenzo
We could potentially check for this env var in the candidates_fennec target_task_method [1], which won't be hit by try unless someone explicitly creates a decision task that calls it.

We could limit it even further to only die if it's not set on mozilla-beta or jamun, allowing try candidates_fennec to generate an automatic moz_build_date.

[1] https://hg.mozilla.org/integration/mozilla-inbound/file/tip/taskcluster/taskgraph/target_tasks.py#l245
(In reply to Aki Sasaki [:aki] from comment #12)
> We could potentially check for this env var in the candidates_fennec
> target_task_method [1], which won't be hit by try unless someone explicitly
> creates a decision task that calls it.
> 
> We could limit it even further to only die if it's not set on mozilla-beta
> or jamun, allowing try candidates_fennec to generate an automatic
> moz_build_date.
> 
> [1]
> https://hg.mozilla.org/integration/mozilla-inbound/file/tip/taskcluster/
> taskgraph/target_tasks.py#l245

Hm, if we want the check for nightlies, then we need it in the nightly_linux and nightly_fennec target_task_methods as well, which means we may want to put it in its own helper function.
Comment on attachment 8840523 [details]
Bug 1337861 - [Fennec-Relpro] Enforce the presence of $MOZ_BUILD_DATE

Link in comment 9 is broken, here's a better link [1]. Basically only the builbot runners were impacted (not the TC ones). I updated the patch to make $MOZ_BUILD_DATE mandatory only on Taskcluster. I updated the comments as well.

[2] shows try not broken with the new condition. [3] is a final check on the latest version of the patch. 

:jlund, would you mind taking another look, just to make sure my reasoning is correct.

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=e40dac7adf5739b795b14ce517ba2a2f8fdd4aff&selectedJob=89193814
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=f5c8462f14e7944158c16fe15c78fdadc905ee97
[3] https://treeherder.mozilla.org/#/jobs?repo=try&revision=2c280c6cb88c205b10964b3203732cdb4afa9a0f
Attachment #8840523 - Flags: review+ → review?(jlund)
Comment on attachment 8840523 [details]
Bug 1337861 - [Fennec-Relpro] Enforce the presence of $MOZ_BUILD_DATE

https://reviewboard.mozilla.org/r/115010/#review134948

hm, I'm shocked I missed this r? for so long :)

Anyway, I put some thoughts in line below. I don't think we need to workt too much on this since it will be continuously changed as we kill buildbot.

::: testing/mozharness/mozharness/mozilla/building/buildbase.py:798
(Diff revision 4)
>          if self.buildid:
>              return self.buildid
>  
>          buildid = None
>          if c.get("is_automation"):
> +            if self.buildbot_config['properties']:

iiuc we should protect this key lookup either with a try block or .get()

however, prior to this patch we assumed self.buildbot_config['properties'] existed even if this was a taskcluster job. Which surprises me.

I guess we call read_buildbot_config() even if we are running this job from Taskcluster. Which is wrong but out of scope of this bug.

circling back, I still think for code correctness, we should use a .get() if we are using the key lookup as a condition

::: testing/mozharness/mozharness/mozilla/building/buildbase.py:825
(Diff revision 4)
>          if not buildid:
>              self.info("Creating buildid through current time")
>              buildid = generate_build_ID()
>  
>          if c.get('is_automation'):
>              self.set_buildbot_property('buildid',

if this is taskcluster, I suppose we don't want to do this.

This logic, because of the way I wrote it, is pretty complex. We may want to have diffeent isolated blocks for these three scenarios:

1) if buildbot
2) if taskcluster
3) if outside automation

but that is probably not worth our time. I'm okay with leaving this as is for now since it doesn't break anything apparently?
Attachment #8840523 - Flags: review?(jlund) → review+
Comment on attachment 8840523 [details]
Bug 1337861 - [Fennec-Relpro] Enforce the presence of $MOZ_BUILD_DATE

https://reviewboard.mozilla.org/r/115010/#review134948

> if this is taskcluster, I suppose we don't want to do this.
> 
> This logic, because of the way I wrote it, is pretty complex. We may want to have diffeent isolated blocks for these three scenarios:
> 
> 1) if buildbot
> 2) if taskcluster
> 3) if outside automation
> 
> but that is probably not worth our time. I'm okay with leaving this as is for now since it doesn't break anything apparently?

I'm good with handling this in a follow up!
Comment on attachment 8840523 [details]
Bug 1337861 - [Fennec-Relpro] Enforce the presence of $MOZ_BUILD_DATE

https://reviewboard.mozilla.org/r/115010/#review134948

> iiuc we should protect this key lookup either with a try block or .get()
> 
> however, prior to this patch we assumed self.buildbot_config['properties'] existed even if this was a taskcluster job. Which surprises me.
> 
> I guess we call read_buildbot_config() even if we are running this job from Taskcluster. Which is wrong but out of scope of this bug.
> 
> circling back, I still think for code correctness, we should use a .get() if we are using the key lookup as a condition

`.get()` added. Yeah, I'm not sure under which circumstances `"properties"` doesn't exist.
Pushed by jlorenzo@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2f45a2547cb4
[Fennec-Relpro] Enforce the presence of $MOZ_BUILD_DATE r=jlund
Latest patch was tested on try against:
* taskcluster: https://treeherder.mozilla.org/#/jobs?repo=try&revision=9591c13a12773fe06c0436a82aec54e1d421ebb5
* buildbot: https://treeherder.mozilla.org/#/jobs?repo=try&revision=93a39b6def2ae1b96105f68a968e5dd64072b7df (the windows failure is unrelated to the patch)

Landed on autoland.
Depends on: 1360550
Backed out from release beta and inbound:

pushing to ssh://hg.mozilla.org/releases/mozilla-release
searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 1 changes to 1 files
remote: recorded push in pushlog
remote: 
remote: View your change here:
remote:   https://hg.mozilla.org/releases/mozilla-release/rev/e0d7c5a69f6a860dc6e8116a94880f1ec1037e16
remote: 
remote: Follow the progress of your build on Treeherder:
remote:   https://treeherder.mozilla.org/#/jobs?repo=mozilla-release&revision=e0d7c5a69f6a860dc6e8116a94880f1ec1037e16
remote: recorded changegroup in replication log in 0.009s
43837 files updated, 0 files merged, 5652 files removed, 0 files unresolved                                                   
pushing to ssh://hg.mozilla.org/releases/mozilla-beta
searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 1 changes to 1 files
remote: recorded push in pushlog
remote: 
remote: View your change here:
remote:   https://hg.mozilla.org/releases/mozilla-beta/rev/d92d578f631e730dd9c1f1725d2ee81ccb3ed8de
remote: 
remote: Follow the progress of your build on Treeherder:
remote:   https://treeherder.mozilla.org/#/jobs?repo=mozilla-beta&revision=d92d578f631e730dd9c1f1725d2ee81ccb3ed8de
remote: recorded changegroup in replication log in 0.006s

pushing to ssh://hg.mozilla.org/integration/mozilla-inbound
searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 1 changes to 1 files
remote: recorded push in pushlog
remote: 
remote: View your change here:
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/973bdace0cc9626ad7336d1af28beead2650fb65
remote: 
remote: Follow the progress of your build on Treeherder:
remote:   https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=973bdace0cc9626ad7336d1af28beead2650fb65
remote: recorded changegroup in replication log in 0.008s
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
First of all, I'm sorry for the second bug my patch caused. After thinking about it, I don't think it's worth enforcing the presence of that VARIABLE anymore, for a few reasons:

1. This variable is a part of the generated taskgraph[1], so that issue comes up, that's a bug in this component.
2. Even if that bug came back up, it would be caught later in the graph, during the push-apk task (which is also in the taskgraph). The push-apk task uses mozapkpublisher, which now checks the order of the version codes[2].
3. Even if we regenerated a new APK, with a wrong version code, and if the person wanted to upload it manually, mozapkpublisher will bail out.
4. My patch caused more issues than it solved.

I discussed with :catlee and we agreed to WONTFIX this patch. The original title was "Make sure that fennec x86 versioncode > arm version code" and that's implemented at [2]

[1] https://dxr.mozilla.org/mozilla-central/rev/5bc1c758ab57c1885dceab4e7837e58af27b998c/taskcluster/taskgraph/transforms/android_stuff.py#26
[2] https://github.com/mozilla-releng/mozapkpublisher/pull/17
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → WONTFIX
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: