Closed Bug 1194264 Opened 4 years ago Closed 4 years ago

Allow Mozilla CI tools to submit BBB TC graphs

Categories

(Testing :: General, defect)

defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: armenzg, Assigned: armenzg)

References

Details

Attachments

(3 files)

This allows scheduling a build and dependent jobs which will get run once the build is finished.

We're going to use the Buildbot bridge (BBB) to help us accomplish this.
I'm trying to build a TaskCluster graph based on the code that landed on alder (where the bridge was used).

The full task graph for alder is attached.

From this sample command (gecko_decision task on Treeherder):
./mach taskcluster-graph --pushlog-id=29221 '--message= ' --project=mozilla-central --owner=ryanvm@gmail.com --revision-hash=bd56f90c65aaf1a85a2afdd118f4e1d2b0cb4d11 --extend-graph

I generated this one:
./mach taskcluster-graph --pushlog-id=42 '--message= ' --project=mozilla-central --owner=bhearsum@mozilla.com --revision-hash=886c982a13dc1c48b3f1ac0a9915b06c20628ea2 --extend-graph

How to calculate revision-hash?
https://treeherder.mozilla.org/api/project/mozilla-central/resultset/?revision=0cddd6a6565a
https://treeherder.mozilla.org/api/project/alder/resultset/?revision=81276b0f70d0
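
A minimal sketch of the lookup above, assuming the resultset endpoint answers with JSON shaped like {"results": [{"revision_hash": ...}]}. The function names are hypothetical and the response shape is an assumption, so check it against a real response before relying on it.

```python
import json

TREEHERDER = "https://treeherder.mozilla.org"


def resultset_url(repo, revision):
    """Build the resultset query URL, matching the links above."""
    return "{}/api/project/{}/resultset/?revision={}".format(
        TREEHERDER, repo, revision
    )


def revision_hash_from_response(body):
    """Extract revision_hash from a resultset response body (JSON text).

    Assumption: the body looks like {"results": [{"revision_hash": ...}]}.
    """
    results = json.loads(body).get("results", [])
    if not results:
        raise ValueError("no resultset found for that revision")
    return results[0]["revision_hash"]
```

Fetching resultset_url("alder", "81276b0f70d0") with any HTTP client and passing the body to revision_hash_from_response() would give the value for --revision-hash.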

How to determine the push-id?
https://hg.mozilla.org/projects/alder/json-pushes?changeset=81276b0f70d0&tipsonly=1&version=2
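
A rough sketch of reading the push id out of that json-pushes response. It assumes the version=2 format is {"lastpushid": N, "pushes": {"<push id>": {...}}}; the helper names are hypothetical.

```python
import json


def json_pushes_url(repo_path, changeset):
    """Build the json-pushes query URL, matching the link above."""
    return (
        "https://hg.mozilla.org/{}/json-pushes"
        "?changeset={}&tipsonly=1&version=2"
    ).format(repo_path, changeset)


def push_id_from_response(body):
    """Return the push id as an int.

    Assumption: the push ids are the keys of the "pushes" object, and a
    query for a single changeset returns exactly one push.
    """
    pushes = json.loads(body).get("pushes", {})
    if len(pushes) != 1:
        raise ValueError("expected exactly one push, got {}".format(len(pushes)))
    return int(next(iter(pushes)))
```

The returned integer is what would be passed as --pushlog-id.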
bhearsum: how do we submit a task graph to TaskCluster from alder?

I don't see a gecko decision task in here:
https://treeherder.mozilla.org/#/jobs?repo=alder&revision=2c2bd69d40af&exclusion_profile=false

However, I can see the tasks submitted by the BBB:
https://secure.pub.build.mozilla.org/buildapi/self-serve/alder/build/78796134

>      "reason": "Created by BBB for task GtMVJjkaQSyTP15lbSHHow",

Is there a task graph somewhere? or are we submitting individual tasks?
I found the task-graph:
https://tools.taskcluster.net/task-graph-inspector/#74qxcTXsRBmFIlUdvX19jQ/

and the initial decision task:
https://tools.taskcluster.net/task-inspector/#XmFJ4oxJQHqHn75IpO8yNw/

It would be great if there was a gecko decision task on alder.
bhearsum: can I push the code on ash to try and expect the BBB to work in a similar manner?
There is no gecko_decision task because we had an older decision task.
I'm still pushing the m-c merge. I hope it will finish in the next hour.

It was only submitting to allizom:
https://treeherder.allizom.org/#/jobs?repo=alder&revision=2c2bd69d40af&filter-searchStr=gecko

Different routes:
https://hg.mozilla.org/projects/alder/file/2c2bd69d40af/testing/taskcluster/tasks/decision/branch.yml#l19
https://hg.mozilla.org/mozilla-central/file/default/testing/taskcluster/tasks/decision/branch.yml
(In reply to Armen Zambrano Gasparnian [:armenzg] from comment #4)
> bhearsum: can I push the code on ash to try an expect the BBB work in a
> similar manner?

Yes, the bridge is enabled for all branches. It should pick up any jobs with provisionerId and workerType set to "buildbot-bridge".
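
As an illustration of that rule, here is a small sketch that checks which tasks in a graph the bridge would pick up. The graph layout ({"tasks": [{"taskId": ..., "task": {...}}]}) and the function names are assumptions made for the example, not the bridge's own code.

```python
BRIDGE = "buildbot-bridge"


def is_bridge_task(task):
    """True when both provisionerId and workerType are "buildbot-bridge"."""
    return task.get("provisionerId") == BRIDGE and task.get("workerType") == BRIDGE


def bridge_task_ids(graph):
    """List the taskIds in a graph that the bridge should pick up.

    Assumption: each graph entry is {"taskId": ..., "task": {...}}.
    """
    return [
        entry["taskId"]
        for entry in graph.get("tasks", [])
        if is_bridge_task(entry["task"])
    ]
```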
Depends on: 1195973
I've submitted my first task graph directly through mozci:
http://docs.taskcluster.net/tools/task-graph-inspector/#m36eoTXkSpGNy9O3dg5Irg/_7cfTMnBQHuVl_6Jw586Qg

It is not working yet but I should have something by tomorrow.
bug 1195751 shows that we need 'product' to be set.
See Also: → 1195751
bhearsum: how can I investigate if the task graphs which I submit have any issues and don't work for the buildbot bridge?

I'm now working on adding the treeherder scopes and I could compare with the graphs generated on alder, however, I would like to know if there is a way to validate before submission.
I've got the treeherder support working:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b0af66e75fdd
It needs some polishing though.
I've cleaned up the code and pushed it.
https://github.com/armenzg/mozilla_ci_tools/compare/master...mozci_bbb

This does not work well at the moment for Treeherder (builders are misrepresented), and TH's revision hash is hardcoded to my try push.

However, here's what the simple way of calling this would look like:
> python mozci/scripts/misc/buildbot_to_taskcluster.py --repo-name try --revision b0af66e75fdd 
> '{"Linux x86-64 try build": ["Ubuntu VM 12.04 x64 try opt test mochitest-2"]}' | taskcluster run-graph --verbose
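
For context, a sketch of how a mapping like the one in the command above could turn into a graph where each test task requires its build. This is not the code in the branch: the payload keys (buildername, sourcestamp, properties), the "requires" field, the default product and the use of uuid for task ids are all assumptions for illustration.

```python
import uuid


def new_task_id():
    # Stand-in for a real Taskcluster task id (assumption for this sketch).
    return uuid.uuid4().hex


def bridge_task(buildername, repo_name, revision, product="firefox"):
    """One task aimed at the Buildbot bridge. Payload keys are assumed."""
    return {
        "provisionerId": "buildbot-bridge",
        "workerType": "buildbot-bridge",
        "payload": {
            "buildername": buildername,
            "sourcestamp": {"branch": repo_name, "revision": revision},
            # bug 1195751: 'product' needs to be set
            "properties": {"product": product},
        },
    }


def sketch_task_graph(repo_name, revision, builders_graph):
    """Turn {build builder: [test builders]} into a list of graph entries."""
    tasks = []
    for build, tests in builders_graph.items():
        build_id = new_task_id()
        tasks.append({
            "taskId": build_id,
            "requires": [],
            "task": bridge_task(build, repo_name, revision),
        })
        for test in tests:
            # Each test task waits for its build task.
            tasks.append({
                "taskId": new_task_id(),
                "requires": [build_id],
                "task": bridge_task(test, repo_name, revision),
            })
    return {"tasks": tasks}
```

Dumping the result with json.dumps() would give something that can be piped to another tool, as in the command above.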

Latest task-graph:
http://docs.taskcluster.net/tools/task-graph-inspector/#JlH0jnQaQC6WJTHjZzJC2w

I can see the buildbot job running here:
http://buildbot-master78.bb.releng.usw2.mozilla.com:8101/builders/Linux%20x86-64%20try%20build/builds/1559

To do (not all are blockers for a MVP):
#######################################
* Represent treeherder jobs appropriately (Windows builders under window grouping et al)
* Prevent TC tasks from rerunning if they fail
* Make sure that a build triggers a test job properly
* Make sure that the build does not trigger more tests than the one we indicated
* Determine the revision_hash required for TH from its API [1]
* Do not show test jobs on TH right away (bug 1196374)


[1] https://treeherder.mozilla.org/api/project/try/resultset/?revision=b0af66e75fdd
bhearsum: any ideas on how to make the test builder receive the installer_url and test_url?

https://treeherder.mozilla.org/#/jobs?repo=try&revision=82f79fee3429
http://docs.taskcluster.net/tools/task-graph-inspector/#1uORmauQTEme-RCXsRrV9g/0r9SYWpGSMmig-pmaZaIpA

http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/armenzg@mozilla.com-82f79fee3429/try-linux64/try_ubuntu64_vm_test-mochitest-2-bm115-tests1-linux64-build479.txt.gz
09:07:14     INFO - #####
09:07:14     INFO - ##### Running read-buildbot-config step.
09:07:14     INFO - #####
09:07:14     INFO - Running pre-action listener: _resource_record_pre_action
09:07:14     INFO - Running main action method: read_buildbot_config
09:07:14     INFO - Using buildbot properties:
09:07:14     INFO - {
09:07:14     INFO -     "properties": {
09:07:14     INFO -         "buildnumber": 479, 
09:07:14     INFO -         "product": "firefox", 
09:07:14     INFO -         "basedir": "/builds/slave/test", 
09:07:14     INFO -         "script_repo_revision": "production", 
09:07:14     INFO -         "branch": "try", 
09:07:14     INFO -         "repository": "", 
09:07:14     INFO -         "buildername": "Ubuntu VM 12.04 x64 try opt test mochitest-2", 
09:07:14     INFO -         "stage_platform": "linux64", 
09:07:14     INFO -         "who": "armenzg@mozilla.com", 
09:07:14     INFO -         "project": "", 
09:07:14     INFO -         "platform": "linux64", 
09:07:14     INFO -         "master": "http://buildbot-master115.bb.releng.usw2.mozilla.com:8201/", 
09:07:14     INFO -         "slavebuilddir": "test", 
09:07:14     INFO -         "taskId": "B1FekFUURUuyvdCpBcut8w", 
09:07:14     INFO -         "repo_path": "try", 
09:07:14     INFO -         "moz_repo_path": "", 
09:07:14     INFO -         "slavename": "tst-linux64-spot-1359", 
09:07:14     INFO -         "revision": "82f79fee3429"
09:07:14     INFO -     }, 
09:07:14     INFO -     "sourcestamp": {
09:07:14     INFO -         "repository": "", 
09:07:14     INFO -         "hasPatch": false, 
09:07:14     INFO -         "project": "", 
09:07:14     INFO -         "branch": "try", 
09:07:14     INFO -         "changes": [], 
09:07:14     INFO -         "revision": "82f79fee3429"
09:07:14     INFO -     }
09:07:14     INFO - }
09:07:14    ERROR - list index out of range
09:07:14    FATAL - Unable to set installer_url+test_url from the buildbot config!
09:07:14    FATAL - Running post_fatal callback...
09:07:14    FATAL - Exiting -1
09:07:14     INFO - Running post-action listener: _resource_record_post_action
09:07:14     INFO - Running post-run listener: _resource_record_post_run
09:07:14     INFO - Running post-run listener: _upload_blobber_files
09:07:14     INFO - Blob upload gear active.
09:07:14  WARNING - Blob upload directory does not exist!
bhearsum: IIRC when we talked there was a chance that dependent test jobs would not be able to download the installer and test urls since IIRC that path had not been tested on alder.

When the test jobs read the buildbprop.json information, it lacks the 'changes' field.
This might be the place where it should be created:
https://github.com/mozilla/buildbot-bridge/blob/master/bbb/servicebase.py#L232
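
To illustrate why an empty 'changes' list could end in "list index out of range", here is a sketch of a lookup that takes the files from the last change. I have not confirmed that mozharness does exactly this; the function name and the layout of a change entry ({"files": [{"name": ...}]}) are assumptions.

```python
def files_from_last_change(buildbot_config):
    """Return the file names attached to the last change in the sourcestamp.

    Assumption: a change looks like {"files": [{"name": "<url>"}, ...]}.
    """
    changes = buildbot_config.get("sourcestamp", {}).get("changes", [])
    if not changes:
        # Indexing changes[-1] on an empty list is what raises
        # "list index out of range"; fail with a clearer message instead.
        raise ValueError("sourcestamp has no changes; cannot find installer_url/test_url")
    return [f["name"] for f in changes[-1].get("files", [])]
```

With the config from the log above ("changes": []), this raises the ValueError instead of an IndexError.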

If this is not to work without some buildbot-bridge work, would you like us to consider an alternative approach?
I believe that if a dependent task knows which task it depends on, we could tell mozharness to query for any of the artifacts which the parent task has uploaded.

What do you think?

#######################

On another note, it seems that the buildbot bridge waits for a build to complete before triggering any of the test jobs (instead of after the upload of the artifacts).

Has a bug already been filed for this?
Flags: needinfo?(bhearsum)
(In reply to Armen Zambrano Gasparnian [:armenzg] from comment #14)
> bhearsum: IIRC when we talked there was a chance that dependent test jobs
> would not be able to download the installer and test urls since IIRC that
> path had not been tested on alder.
> 
> The test jobs when they read the buildbprop.json information, it lacks the
> 'changes' field.
> This might be the place where it should be created:
> https://github.com/mozilla/buildbot-bridge/blob/master/bbb/servicebase.
> py#L232
> 
> If this is not to work without some buildbot-bridge work, would you like us
> to consider an alternative approach?
> I believe that if a dependent task knows which is the task which it
> dependent on, we could tell mozharness to query about any of the artifacts
> which the parent task has uploaded.
> 
> What do you think?

I'm more than happy to give the bridge support for changes, if that's useful to you. If you'd rather do it in Mozharness that's fine too. Just let me know, and file a bug about support in the bridge if that's what you want.

> #######################3
> 
> On another note, it seems that the buildbot bridge waits for a build to
> complete before triggering any of the test jobs (instead of after the upload
> of the artifacts).

This is a limitation of how Taskcluster Task Graphs work, nothing to do with the bridge. There is no way to tell Taskcluster to start a downstream job prior to the upstream Task (or Tasks) being resolved. Jonas might have thoughts on whether or not that will change, but the bridge can't influence this. Let me know if you want a more detailed explanation of this.
Flags: needinfo?(bhearsum)
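
A toy model of the rule described above, for illustration only: a task becomes runnable only once every task it requires has been resolved. The data layout and function name are made up for the example and are not Taskcluster's scheduler code.

```python
def runnable(tasks, resolved):
    """Return ids of unresolved tasks whose requirements are all resolved.

    tasks: {task_id: [ids of required tasks]}
    resolved: set of task ids that have already been resolved
    """
    return sorted(
        task_id
        for task_id, requires in tasks.items()
        if task_id not in resolved and all(r in resolved for r in requires)
    )
```

With tasks = {"build": [], "test": ["build"]}, the test only shows up as runnable after "build" is in the resolved set, regardless of when the build uploaded its artifacts.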
(In reply to Armen Zambrano Gasparnian [:armenzg] from comment #16)
> Created attachment 8650711 [details] [review]
> PR to add module to schedule buildbot jobs through TC
> 
> bhearsum, garndt, I'm interested on your feedback for
> * _createTask() [1]
> * generateTaskGraph() [2]
> 
> [1]
> https://github.com/armenzg/mozilla_ci_tools/pull/337/files#diff-
> 3e62fd7d392e6dd33a14da0de87c4348R16
> [2]
> https://github.com/armenzg/mozilla_ci_tools/pull/337/files#diff-
> 3e62fd7d392e6dd33a14da0de87c4348R105

Can you set up some context, please? Eg: what/where are the entry points? what will be calling this? I don't have a good grasp on MozCI to begin with, so I can throw generic comments about the code, but I'm not sure how to review the big picture without a bit of assistance...
(In reply to Ben Hearsum (:bhearsum) from comment #17)
> (In reply to Armen Zambrano Gasparnian [:armenzg] from comment #16)
> > Created attachment 8650711 [details] [review]
> > PR to add module to schedule buildbot jobs through TC
> > 
> > bhearsum, garndt, I'm interested on your feedback for
> > * _createTask() [1]
> > * generateTaskGraph() [2]
> > 
> > [1]
> > https://github.com/armenzg/mozilla_ci_tools/pull/337/files#diff-
> > 3e62fd7d392e6dd33a14da0de87c4348R16
> > [2]
> > https://github.com/armenzg/mozilla_ci_tools/pull/337/files#diff-
> > 3e62fd7d392e6dd33a14da0de87c4348R105
> 
> Can you set up some context, please? Eg: what/where are the entry points?
> what will be calling this? I don't have a good grasp on MozCI to begin with,
> so I can throw generic comments about the code, but I'm not sure how to
> review the big picture without a bit of assistance...

generate_task_graph() is the entry point. We mainly care about scheduling in one shot a build + dependent test jobs.
The only part that will change (not in this PR) is that instead of calling self-serve/buildapi's trigger arbitrary job API, we will submit a whole graph to TaskCluster at once. The use case we're trying to improve: in some cases we want to trigger a test job but there is no build to run it against. In that case we currently submit a build request and monitor it until it finishes so that we can then schedule the test job. With this functionality, we can submit the build and the test job at once and know that the test job will be scheduled without further interaction.
Obviously there will be times when jobs might get canceled or fail but those are edge cases.
(In reply to Ben Hearsum (:bhearsum) from comment #15)
> (In reply to Armen Zambrano Gasparnian [:armenzg] from comment #14)
> > bhearsum: IIRC when we talked there was a chance that dependent test jobs
> > would not be able to download the installer and test urls since IIRC that
> > path had not been tested on alder.
> > 
> > The test jobs when they read the buildbprop.json information, it lacks the
> > 'changes' field.
> > This might be the place where it should be created:
> > https://github.com/mozilla/buildbot-bridge/blob/master/bbb/servicebase.
> > py#L232
> > 
> > If this is not to work without some buildbot-bridge work, would you like us
> > to consider an alternative approach?
> > I believe that if a dependent task knows which is the task which it
> > dependent on, we could tell mozharness to query about any of the artifacts
> > which the parent task has uploaded.
> > 
> > What do you think?
> 
> I'm more than happy to give the bridge support for changes, if that's useful
> to you. If you'd rather do it in Mozharness that's fine too. Just let me
> know, and file a bug about support in the bridge if that's what you want.
> 

If you can I would be more than happy! \o/

> > #######################3
> > 
> > On another note, it seems that the buildbot bridge waits for a build to
> > complete before triggering any of the test jobs (instead of after the upload
> > of the artifacts).
> 
> This is a limitation of how Taskcluster Task Graphs work, nothing to do with
> the bridge. There is no way to tell Taskcluster to start a downstream job
> prior to the upstream Task (or Tasks) being resolved. Jonas might have
> thoughts on whether or not that will change, but the bridge can't influence
> this. Let me know if you want a more detailed explanation of this.

That is perfectly fine. I will follow up with him.
Depends on: 1197204
Comment on attachment 8650711 [details] [review]
PR to add module to schedule buildbot jobs through TC

I have limited knowledge on how this would be used, but from the standpoint of creating a graph, it looks good to me.  Left a few comments in the PR
Attachment #8650711 - Flags: feedback?(garndt) → feedback+
Comment on attachment 8650711 [details] [review]
PR to add module to schedule buildbot jobs through TC

feedback+, as it seems like it will mostly work, but the structure might cause limitations at some point. I'll leave it up to you to determine whether or not it's important to address that up front.
Attachment #8650711 - Flags: feedback?(bhearsum) → feedback+
I've landed the PR after addressing some of the comments.
I'm moving the next set of actions to:
https://github.com/armenzg/mozilla_ci_tools/issues/339

The current blockers are bug 1197204 and being able to schedule graphs without piping to the taskcluster command.
Blocks: 1198341
Blocks: 1198430
Depends on: 1203085
No longer depends on: 1197204
Depends on: 1204077
Comment on attachment 8667436 [details] [review]
PR request to schedule graphs directly, dry_run support + single builder

adusca asked me to pass the review to someone with a bit more of TC background.
Attachment #8667436 - Flags: review?(alicescarpa) → review?(garndt)
Comment on attachment 8667436 [details] [review]
PR request to schedule graphs directly, dry_run support + single builder

Changing this to an f+. I looked over the schedule graph method and it seems OK. Talked with armen on IRC about catching exceptions from the TC schedulegraph method.
Attachment #8667436 - Flags: review?(garndt) → feedback+
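
A sketch of the two ideas mentioned for this PR, dry-run support and catching exceptions around the scheduling call. This is not the code in the PR: schedule_graph and the submit callable are hypothetical, and the real client call and its exception types are deliberately left out.

```python
import logging

LOG = logging.getLogger("mozci_sketch")


def schedule_graph(task_graph, submit, dry_run=False):
    """Submit a graph through `submit`, or only log it when dry_run is set.

    `submit` is any callable that sends the graph to the scheduler and
    returns its response (hypothetical; stands in for the real client).
    Returns the response, or None on dry run or failure.
    """
    if dry_run:
        LOG.info("Dry run: not submitting %d tasks", len(task_graph.get("tasks", [])))
        return None
    try:
        return submit(task_graph)
    except Exception as error:  # the client's own exception type is not assumed
        LOG.error("Failed to schedule graph: %s", error)
        return None
```

Passing the submit step in as a callable keeps the sketch independent of any particular client library and makes the failure path easy to exercise.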
Depends on: 1194830
Depends on: 1212446
There's been a regression in BBB where we no longer produce a properties.json.
We're either going to fix bug 1203085 or change the Buildbot builds to use the "target" filenames.
Blocks: 1194830
No longer depends on: 1194830
Summary: Allow Mozilla CI tools to schedule buildbot jobs through TaskCluster → Allow Mozilla CI tools to submit BBB TC graphs
I'm going to review the status of this.
No longer blocks: 1194830, 1198341
Blocks: 1220840
Duplicate of this bug: 1198360
Depends on: 1203552
We completed this a while ago.
We use TC/BBB scheduling when adding new jobs to pushes.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Comment on attachment 8650711 [details] [review]
PR to add module to schedule buildbot jobs through TC

>https://github.com/mozilla/mozilla_ci_tools/pull/337
Comment on attachment 8667436 [details] [review]
PR request to schedule graphs directly, dry_run support + single builder

>https://github.com/mozilla/mozilla_ci_tools/pull/349