Closed Bug 899969 Opened 7 years ago Closed 6 years ago

mozilla-central should contain a pointer to the revision of all external B2G repos, not just gaia

Categories

(Release Engineering :: General, defect, P2, major)

Tracking

(Not tracked)

RESOLVED FIXED
B2G C4 (2jan on)

People

(Reporter: emorley, Assigned: catlee)

Attachments

(14 files, 1 obsolete file)

23.82 KB, patch (aki: review+, catlee: checked-in+)
14.58 KB, patch (catlee: review+, catlee: checked-in+)
2.24 KB, patch (emorley: review+, catlee: checked-in+)
895 bytes, patch (aki: review+, catlee: checked-in+)
4.51 KB, patch (rail: review+, catlee: checked-in+)
2.97 KB, patch (bhearsum: review+, catlee: checked-in+)
1.71 KB, patch (aki: review+, catlee: checked-in+)
5.38 KB, patch (aki: review+, catlee: checked-in+)
16.96 KB, patch (aki: review+, catlee: checked-in+)
2.45 KB, patch (aki: review+, catlee: checked-in+)
2.26 KB, patch (rail: review+, catlee: checked-in+)
4.75 KB, patch (bhearsum: review+, catlee: checked-in+)
1.94 KB, patch (bhearsum: review+, catlee: checked-in+)
5.29 KB, patch (RyanVM: review+, catlee: checked-in+)

Today, bug 895299 caused B2G device image builds to fail on multiple trees, which required sheriffs to manually pick through the github mozilla-b2g repos looking for recent commits, in order to find the culprits (in the end we needed backouts out of three repos). This isn't the first time this kind of thing has happened - see also bug 897611.

This is pretty frustrating for sheriffs, particularly since we don't know much of the B2G build process well, nor are we as familiar with what's recently landed there as we are with, say, gecko changes. In addition, the non-deterministic nature of these failures means that the B2G device builds should really not be shown by default on TBPL:
https://wiki.mozilla.org/Sheriffing/Job_Visibility_Policy#8.29_Must_avoid_patterns_known_to_cause_non_deterministic_failures

As such, we should peg the version of the various b2g device repos in a manifest and have a bot automatically update them (i.e. bug 868598, except for more than just gaia).

Thanks! :-)
Although this isn't directly a fix to this problem, rel-eng has recently started including links to the sources.xml produced by each build in TBPL.  I've been working on a little Python utility at https://github.com/jonallengriffin/sourcesdiff that can be used to produce a list of commits that were made across all repos, given two arbitrary sources.xml files.

Using this tool will at least mean you won't have to pick through large numbers of B2G repos to figure out what has changed between two builds.

I'll document this a bit better and make a post to dev.b2g about it.
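
For illustration, the idea of comparing two sources.xml files can be sketched as below. This is not the sourcesdiff implementation; it is a minimal sketch that assumes each manifest lists `<project>` elements with `name` and `revision` attributes, and the function names and sample manifests are made up. It only reports which projects changed revision, not the commits in between.

```python
import xml.etree.ElementTree as ET


def project_revisions(sources_xml):
    """Map each <project> name to its 'revision' attribute (None if absent)."""
    root = ET.fromstring(sources_xml)
    return {p.get("name"): p.get("revision") for p in root.iter("project")}


def changed_projects(old_xml, new_xml):
    """Return {name: (old_rev, new_rev)} for projects whose revision differs."""
    old = project_revisions(old_xml)
    new = project_revisions(new_xml)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }


# Made-up manifests, for illustration only.
OLD = """<manifest>
  <project name="gaia" path="gaia" revision="aaa111"/>
  <project name="device-leo" path="device/qcom/leo" revision="bbb222"/>
</manifest>"""

NEW = """<manifest>
  <project name="gaia" path="gaia" revision="aaa111"/>
  <project name="device-leo" path="device/qcom/leo" revision="ccc333"/>
  <project name="platform_build" path="build" revision="ddd444"/>
</manifest>"""

for name, (before, after) in changed_projects(OLD, NEW).items():
    print("%s: %s -> %s" % (name, before, after))
```

A project that appears in only one manifest shows up with `None` on the other side, which is usually worth a look on its own.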
Ah indeed, I'd forgotten about that bug. Thank you :-)
Depends on: 872628
Bug 910662 is an example of why we really need this - I'm currently having to search through the log and manually copy/paste/diff the sources manifest, and even then it's not clear where the repos in question are, so I still can't do anything to fix the breakage :-(
Plus it's affecting multiple trees (I really should have closed all of them, or else just hidden the Leo device image builds; but either way I'll get moaned at) - whereas if the revision were pinned, we'd only be breaking one tree.
Rough proposal:
* Add a manifest to the tree (JSON or XML, I don't mind; we're using both at the moment in various places) at https://hg.mozilla.org/mozilla-central/file/tip/b2g/config/manifest.json
* This manifest replaces gaia.json and lists not only the gaia repo + revision, but *all* other non-mozilla-central repositories that make up the B2G build. eg: device-{inari,leo,...}, repos from codeaurora.org... basically anything currently listed in sources.xml.
* The build would then only pull the specified revisions of those repos, and not tip.
* A bot would periodically update manifest.json on b2g-inbound with updated revisions (just as we already do for gaia.json). These commits would be visible to sheriffs (massive win) and also mean that if something broke, the bustage is constrained to just one tree (b2g-inbound) and not all trees as we have at present.
* In times of breakage, the sheriffs would use these manifest.json commits to easily figure out which repo needs a backout, perform that backout upstream, and then the bot would auto-commit the updated rev into b2g-inbound = profit.
To make this solution complete, the sheriffs would also need the ability to disable the bot from updating specific repos in the manifest, since there are many repos that the sheriffs cannot back commits out from.
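
A rough sketch of the bumping step in the proposal above, including the ability to hold specific repos back. Everything here is hypothetical: the flat repo-to-revision mapping stands in for whatever format manifest.json would end up using, and `bump_revisions` and the `frozen` set are illustrative names, not part of any existing bot.

```python
def bump_revisions(pinned, latest, frozen=()):
    """Return (bumped, changed) where bumped is a copy of 'pinned' with
    revisions moved to 'latest', skipping any repo listed in 'frozen'."""
    bumped = dict(pinned)
    changed = []
    for repo in sorted(pinned):
        if repo in frozen:
            # Sheriffs have asked the bot to leave this repo alone.
            continue
        new_rev = latest.get(repo)
        if new_rev and new_rev != pinned[repo]:
            bumped[repo] = new_rev
            changed.append(repo)
    return bumped, changed


# Made-up revisions, for illustration only.
pinned = {"gaia": "a1", "device-leo": "b1", "codeaurora-kernel": "c1"}
latest = {"gaia": "a2", "device-leo": "b1", "codeaurora-kernel": "c2"}

bumped, changed = bump_revisions(pinned, latest, frozen={"codeaurora-kernel"})
print(changed)
```

The `changed` list is what would make the resulting commit message useful to sheriffs: it names exactly which repos moved in that push.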
Rather than inventing a new format, I would put the XML manifest used by repo in the tree. Perhaps already transformed from scary internet urls to friendly git.m.o urls?
Blocks: 910685
Blocks: 910745
More multi-tree breakage today (eg https://tbpl.mozilla.org/php/getParsedLog.php?id=27223986&tree=Mozilla-Inbound) from the staggered landing of bug 907745 / bug 910928.
I think we can do this in b2g_build.py and some kind of pushbot to keep in-tree manifests up to date.
Assignee: nobody → catlee
Component: Builds → General Automation
Product: Boot2Gecko → Release Engineering
QA Contact: catlee
Duplicate of this bug: 910201
A few things going on here:
- b2g_make_manifest.py is what we can use to generate the sources.xml that would go in-tree from the manifests in b2g-manifests.git. I think it will end up running together with the gaia.json updater so we only end up with one push to b2g-inbound per manifests update.

- b2g_build.py supports reading in-tree manifests. It looks for 'b2g_intree_manifest' in gecko's config.json for that device. If set, it assumes that sources.xml is present, and uses that manifest instead. We end up checking gecko out first so we can grab the manifest.

- b2g_build.py stops using repo's mirror functionality and instead symlinks the local .repo/ directory to /builds/git-shared/repo. This looks like it speeds up 'repo sync' quite a bit since we don't have to fetch tags into two sets of git repos.
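
The in-tree manifest lookup described in the second point can be sketched roughly as follows. This is not the b2g_build.py code. It assumes that `b2g_intree_manifest` is a truthy flag in the device's config.json and that sources.xml sits in the same directory as that config; `choose_manifest` and its parameters are hypothetical names.

```python
import os


def choose_manifest(device_config, config_dir, default_manifest):
    """Pick the manifest a build should use.

    device_config: parsed contents of the device's config.json (a dict).
    config_dir: directory in the gecko checkout holding that config.json.
    default_manifest: manifest to fall back to when no in-tree one is enabled.
    """
    if device_config.get("b2g_intree_manifest"):
        # Assumption: the in-tree sources.xml lives next to config.json.
        return os.path.join(config_dir, "sources.xml")
    return default_manifest


print(choose_manifest({"b2g_intree_manifest": True}, "cfg", "default.xml"))
print(choose_manifest({}, "cfg", "default.xml"))
```

Because the flag lives in the gecko tree, gecko has to be checked out before the manifest can be chosen, which matches the ordering change mentioned above.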
Attachment #804423 - Flags: review?(aki)
Comment on attachment 804423 [details] [diff] [review]
Support in-tree sources.xml

We may want a way to run the script across multiple manifests (manifest_name/manifest_file as an extend array rather than a store string).  That would probably make rewrite_manifests as an action that calls the various manifest-specific methods per-manifest.  Until then it looks like you're planning on using a wrapper, which is a fine approach.
Attachment #804423 - Flags: review?(aki) → review+
\o/ :-)
Attachment #804423 - Flags: checked-in+
merged to production...again...
not working on this currently

to finish up, we need to run b2g_make_manifest.py for each device supported on various branches. that's probably b2g-inbound and b2g26 right now. This should run at the same time as bump_gaia_json.py so we end up making only a single commit or push to each repo.
Assignee: catlee → nobody
Priority: -- → P2
This is very similar to b2g_make_manifest.py, but I wanted a clean start, hence a new script. We can remove b2g_make_manifest any time, it's currently unused.

One question is whether a single invocation and config file limit itself to one branch, or if we should try and handle multiple branches at once? The 1.2 b2g branch needs to get stuff landed on the b2g26_v1.2 gecko branch. I lean towards having the script and configs handling one branch at a time so we can run them in parallel if required.
Attachment #8344836 - Flags: feedback?(aki)
Forgot to mention that I had this running against b2g-inbound in my user repo:
http://hg.mozilla.org/users/catlee_mozilla.com/b2g-inbound
Comment on attachment 8344836 [details] [diff] [review]
b2g bumper script

(In reply to Chris AtLee [:catlee] from comment #16)
> This is very similar to b2g_make_manifest.py, but I wanted a clean start,
> hence a new script. We can remove b2g_make_manifest any time, it's currently
> unused.

Do you want to? Want me to?

> One question is whether a single invocation and config file limit itself to
> one branch, or if we should try and handle multiple branches at once? The
> 1.2 b2g branch needs to get stuff landed on the b2g26_v1.2 gecko branch. I
> lean towards having the script and configs handling one branch at a time so
> we can run them in parallel if required.

I dealt with this in gaia bumper by having a repo_list:
http://hg.mozilla.org/build/mozharness/file/75801cf30d51/configs/gaia_bumper/gaia_json.py#l15

That lets us choose to run serially or parallel or both.
That may add significant complexity here, though, so if you really don't see us wanting to run this serially, we can skip that work.

>+class B2GBumper(VCSScript):
>+
>+    def __init__(self, require_config_file=True):
>+        super(B2GBumper, self).__init__(
>+            all_actions=[
>+                'clobber',
>+                'checkout-gecko',
>+                'checkout-manifests',
>+                'massage-manifests',
>+                'commit',
>+                'push',
>+                'push-loop',
>+            ],
>+            default_actions=[
>+                'push-loop',
>+            ],

Another use case for https://bugzilla.mozilla.org/show_bug.cgi?id=873273 !
Which will probably sit on the back burner for a while longer.

>+    def massage_manifests(self):
>+        """
>+        For each device in config['devices'], we'll strip projects mentioned in
>+        'ignore_projects', or that have group attribute mentioned in
>+        'filter_groups'.
>+        We'll also map remote urls
>+        Finally, we'll resolve absolute refs for projects that aren't fully
>+        specified.
>+        """
>+        for device, device_config in self.config['devices'].items():
>+            self.info("Massaging manifests for %s" % device)
>+            manifest = self.query_manifest(device)
>+            self.filter_projects(device_config, manifest)
>+            self.filter_groups(device_config, manifest)
>+            self.map_remotes(manifest)
>+            self.resolve_refs(manifest)
>+            repo_manifest.cleanup(manifest)
>+            self.device_manifests[device] = manifest
>+
>+            manifest_path = self.query_manifest_path(device)
>+            self.write_to_file(manifest_path, manifest.toxml())

Bonus points for looking for git.m.o repos that don't exist, and notifying!
What happens in that case here? We can't figure out the ref for the repo?
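
For readers unfamiliar with the filtering described in the docstring above, here is a minimal standalone sketch. It is not the patch's `filter_projects` / `filter_groups` code: it assumes the manifest is an `xml.dom.minidom` document, that a project's `groups` attribute is a comma- or whitespace-separated list, and the helper names and sample manifest are made up.

```python
from xml.dom import minidom


def strip_projects(manifest, ignore_projects=(), filter_groups=()):
    """Remove <project> nodes that are named in ignore_projects or that
    belong to any group in filter_groups. Modifies 'manifest' in place."""
    unwanted_groups = set(filter_groups)
    # Copy the node list first, since we remove nodes while iterating.
    for node in list(manifest.getElementsByTagName("project")):
        groups = set(node.getAttribute("groups").replace(",", " ").split())
        if node.getAttribute("name") in ignore_projects or groups & unwanted_groups:
            node.parentNode.removeChild(node)
    return manifest


def project_names(manifest):
    """List the names of the projects still present in the manifest."""
    return [n.getAttribute("name") for n in manifest.getElementsByTagName("project")]


# Made-up manifest, for illustration only.
SAMPLE = """<manifest>
  <project name="gaia" path="gaia"/>
  <project name="gcc-osx" path="prebuilt/darwin" groups="darwin,toolchain"/>
  <project name="gecko" path="gecko"/>
</manifest>"""

doc = strip_projects(
    minidom.parseString(SAMPLE),
    ignore_projects=["gecko"],
    filter_groups=["darwin"],
)
print(project_names(doc))
```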

>+    def push_loop(self):
>+        while True:
>+            self.checkout_gecko()
>+            self.checkout_manifests()
>+            self.massage_manifests()
>+            if not self.commit():
>+                # Nothing changed, we're all done
>+                self.info("No changes - all done")
>+                break
>+
>+            if self.push():
>+                # We did it! Hurray!
>+                self.info("Great success!")
>+                break

I suppose that works, rather than trying to rebase+retry the push itself over and over.
Do you want a max_retries?  Or a sleep between retries to avoid potentially hammering the servers?
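
One way the bounded retry suggested here could look, as a sketch rather than the change that landed. The callables stand in for the checkout/massage/commit steps and the push; `push_with_retries`, `max_retries`, `retry_delay` and the injectable `sleep` are all hypothetical names chosen for illustration.

```python
import time


def push_with_retries(prepare_commit, push, max_retries=5, retry_delay=30,
                      sleep=time.sleep):
    """Redo the whole prepare/commit/push cycle until it succeeds.

    prepare_commit: callable returning False when there is nothing to commit.
    push: callable returning True when the push succeeded.
    Returns True if there was nothing to do or a push succeeded, and
    False once max_retries attempts have all failed.
    """
    for attempt in range(1, max_retries + 1):
        if not prepare_commit():
            return True  # nothing changed, all done
        if push():
            return True
        if attempt < max_retries:
            # Back off so repeated failures don't hammer the servers.
            sleep(retry_delay)
    return False
```

Passing `sleep` in as a parameter keeps the loop easy to exercise in tests without actually waiting between attempts.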
Attachment #8344836 - Flags: feedback?(aki) → feedback+
Attached patch: hg hook change
The manifest bumper is going to be using a different username than the gaia pushbot.
Attachment #8359266 - Flags: review?(emorley)
Comment on attachment 8344836 [details] [diff] [review]
b2g bumper script

r+ IRL last week
Attachment #8344836 - Flags: review+
Attachment #8344836 - Flags: checked-in+
I assume we'll need this to land on non-trunk branches.
Attachment #8359275 - Flags: review?(aki)
Attachment #8359266 - Flags: review?(emorley) → review+
Attachment #8359275 - Flags: review?(aki) → review+
Attachment #8359266 - Flags: checked-in+
Attachment #8359275 - Flags: checked-in+
mostly copy/paste from gaia_bumper. Added gittool since that's required by the bumper.
Attachment #8359935 - Flags: review?(rail)
Attachment #8359935 - Flags: review?(rail) → review+
Depends on: 959753
Attachment #8361091 - Flags: review? → review+
Attachment #8361091 - Flags: checked-in+
Attachment #8359935 - Flags: checked-in+
From what I remember, gaia_bumper will eventually be removed?
Yup. I'll remove the manifests from puppet and clean up bm66 once everything is switched over.
Attachment #8361238 - Flags: review?(aki)
Attachment #8361239 - Flags: review?(aki)
Comment on attachment 8361239 [details] [diff] [review]
add wasabi to bumper

hgtool may still have issues finding hg unless PATH is set somewhere.
Attachment #8361239 - Flags: review?(aki) → review+
Attachment #8361238 - Attachment is obsolete: true
Attachment #8361238 - Flags: review?(aki)
Attachment #8361241 - Flags: review?(aki)
Attachment #8361241 - Flags: review?(aki) → review+
Attachment #8361239 - Flags: checked-in+
Attachment #8361241 - Flags: checked-in+
Depends on: 960696
haven't tested that this doesn't break b2g builds...but it ShouldJustWork™!
Attachment #8361317 - Flags: review?(aki)
Comment on attachment 8361317 [details] [diff] [review]
use mapper to ensure gaia revisions are consistent

We at least have emulator builds on cypress pointing at mozharness default, so we have a way to test before this rolls out to production.
Attachment #8361317 - Flags: review?(aki) → review+
Attachment #8361317 - Flags: checked-in+
https://hg.mozilla.org/mozilla-central/rev/170b15394142
Assignee: nobody → catlee
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → B2G C4 (2jan on)
not quite done yet!
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
This strips projects with "darwin" in their list of groups, which means we stop cloning gcc binaries for OSX on our linux build slaves.
Attachment #8361690 - Flags: review?(aki)
Attachment #8361690 - Flags: review?(aki) → review+
Attachment #8361690 - Flags: checked-in+
Attachment #8362664 - Flags: review?(rail)
Comment on attachment 8362664 [details] [diff] [review]
make sure manifests end with a newline

lgtm
Attachment #8362664 - Flags: review?(rail) → review+
Attachment #8362664 - Flags: checked-in+
in production
The Nexus 4 builds on b2g-inbound are currently broken (bug 965173).
The manifest checkins do not list anything that could be the cause, so I'm presuming at least one of the repos isn't accounted for by the manifests?
Blocks: 965173
Blocks: 965174
Eugh missed it in the skim read earlier, think I read the next hunk as being the next file, so stopped reading after the gaia line change, since I didn't know they'd be so far apart as to need to be in two hunks for the same file (if that makes sense lol).

Guess bug 872628 would make this easier to spot.
No longer blocks: 965173
No longer blocks: 965174
Still TODO - other b2g release branches
(In reply to Chris AtLee [:catlee] from comment #41)
> Still TODO - other b2g release branches

Please update https://wiki.mozilla.org/ReleaseEngineering/Merge_Duty/Steps when this happens.  The steps for odd-numbered mozilla-central gecko versions will be different from even-numbered (in odd#, m-c will ride the trains to m-a; in even#, m-a will ride the trains to m-b2gXX_vX_X)
https://docs.google.com/a/mozilla.com/spreadsheet/ccc?key=0AmStZDZgJbV7dDhtMDZlQmRtdDB4a1plZXRwNXIzYWc#gid=0
Attachment #8379030 - Flags: review?(bhearsum)
Attachment #8379031 - Flags: review?(bhearsum)
Comment on attachment 8379031 [details] [diff] [review]
puppet changes to bump master + v1.3

Review of attachment 8379031 [details] [diff] [review]:
-----------------------------------------------------------------

Seems reasonable
Attachment #8379031 - Flags: review?(bhearsum) → review+
Attachment #8379030 - Flags: review?(bhearsum) → review+
Attachment #8379030 - Flags: checked-in+
something is in production
Attachment #8379031 - Flags: checked-in+
(In reply to Chris AtLee [:catlee] from comment #47)
> Created attachment 8379794 [details] [diff] [review]
> enable using in-tree manifests for mozilla-b2g26

err, mozilla-b2g28
Comment on attachment 8379794 [details] [diff] [review]
enable using in-tree manifests for mozilla-b2g28

Review of attachment 8379794 [details] [diff] [review]:
-----------------------------------------------------------------

Not for this bug, but we should probably remove the Unagi configs to avoid confusion.
Attachment #8379794 - Flags: review?(ryanvm) → review+
Attachment #8379794 - Attachment description: enable using in-tree manifests for mozilla-b2g26 → enable using in-tree manifests for mozilla-b2g28
Comment on attachment 8379794 [details] [diff] [review]
enable using in-tree manifests for mozilla-b2g28

Fixed up some trailing comma issues and landed

https://hg.mozilla.org/releases/mozilla-b2g28_v1_3/rev/d57a7f36eb72
Attachment #8379794 - Flags: checked-in+
Status: REOPENED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Blocks: 1028111
Component: General Automation → General