Closed Bug 1765658 Opened 2 years ago Closed 2 years ago

[meta] allow for testing fxci tasks on the staging cluster

Categories

(Release Engineering :: General, defect)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mozilla, Assigned: mozilla)

Attachments

(1 file)

52 bytes, text/x-github-pull-request

We had this, but we've regressed.
Meta bug to track the fixes needed.

Blocks: 1765661
No longer blocks: 1765661
Depends on: 1765661
Blocks: 1765662
Severity: -- → S3
Assignee: nobody → aki
  1. Run remove_secrets_from_staging.py to clear out the previous secrets I populated (with fake or non-sensitive values).
  2. Graft this patch onto central, then push to try via `./mach try release --migration central-to-beta -v 102.0b1`.
  3. We don't need the production try tasks, so cancel the graph to save compute cycles, either via `taskcluster group cancel -- TASK_GROUP_ID` or via the cancel-all action. Use tc-relduty for the former; sign in to Treeherder or the TC UI for the latter.
  4. Copy the try revision; you'll need it for the next step.
  5. Use fxci to send that push to the staging cluster:
# Set the revision to your try push revision
REVISION=12345
# Point TASKCLUSTER_ROOT_URL at staging, using the aliases in
# https://docs.mozilla-releng.net/en/latest/taskcluster/taskcluster_cli.html#aliases
tc-staging
eval $(taskcluster signin --expires 15m)
fxci replay-hg-push try $REVISION
# This will output a url like `https://stage.taskcluster.nonprod.cloudops.mozgcp.net/tasks/J9WeztDYT4aQstuJUGOgIg`.
  6. That URL points to an on-push task, which triggers a decision task; look at the on-push task's log to find the decision task URL/taskId. This will look like:
    ],
    "schedulerId": "gecko-level-1",
    "scopes": [
        "assume:repo:hg.mozilla.org/try:branch:default",
        "queue:route:notify.email.asasaki@mozilla.com.*",
        "in-tree:hook-action:project-gecko/in-tree-action-1-*",
        "index:insert-task:gecko.v2.try.*"
    ],
    "tags": {
        "createdForUser": "asasaki@mozilla.com",
        "kind": "decision-task"
    },
    "taskGroupId": "a91geUF6Tlm7hT1aB41CXA",
    "workerType": "decision"
}
Task Id: a91geUF6Tlm7hT1aB41CXA
[taskcluster 2022-05-10 20:14:34.072Z] === Task Finished ===
[taskcluster 2022-05-10 20:14:34.073Z] Successful task run with exit code: 0 completed in 14.569 seconds

In the above case, the decision taskId is a91geUF6Tlm7hT1aB41CXA, so go to https://stage.taskcluster.nonprod.cloudops.mozgcp.net/tasks/a91geUF6Tlm7hT1aB41CXA.
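
The lookup above can be scripted. This is a minimal sketch, assuming the on-push task log has been saved to a local file and contains a `Task Id:` line like the one in the excerpt. The function names decision_task_id and staging_task_url are hypothetical helpers, not part of fxci or the taskcluster CLI; the root URL is the one that appears in the URLs above.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of fxci or the taskcluster CLI.

# Staging root URL, taken from the task URLs shown in this bug.
STAGING_ROOT_URL="https://stage.taskcluster.nonprod.cloudops.mozgcp.net"

# Print the ID from the first "Task Id: <id>" line of a saved log file.
# $1: path to the saved on-push task log (assumed to be a local file).
decision_task_id() {
    sed -n 's/^Task Id: *\([A-Za-z0-9_-]*\).*$/\1/p' "$1" | head -n 1
}

# Print the staging task URL for a given task ID.
# $1: task ID
staging_task_url() {
    printf '%s/tasks/%s\n' "$STAGING_ROOT_URL" "$1"
}
```

For example, `staging_task_url "$(decision_task_id on-push.log)"` prints the URL to open, where on-push.log is whatever file the log was saved to.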

This will schedule a level 1 staging taskGroup for a staging release. We expect:

  • ideally a green decision task
  • all scriptworker tasks will hang and hit the exception deadline-exceeded at some point, because there are no scriptworkers pointed at the staging cluster to claim those tasks
  • same with the hardware pools

We at least want to see a good number of docker-worker tasks going green.
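
One way to check these expectations is to tally task states per worker kind. This is a rough sketch only: it assumes a hypothetical input of one `<worker-kind> <state>` pair per line, collected by hand or by whatever tooling is available, and the function name tally_states is invented for illustration. The state names used in the example (completed, exception) are Taskcluster task states.

```shell
#!/bin/sh
# Hypothetical helper for illustration; the input format is an assumption.

# Read "<worker-kind> <state>" lines on stdin and print one
# "<count> <worker-kind> <state>" line per distinct pair,
# sorted by worker kind and then by state.
tally_states() {
    awk 'NF >= 2 { count[$1 " " $2]++ }
         END { for (k in count) print count[k], k }' |
        sort -k2,2 -k3,3
}
```

A healthy staging run would then show a large count on the docker-worker completed line, while scriptworker and hardware pool tasks show up under exception.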
This is happening now, so I'll go ahead and get the above into the docs.

Attached file docs PR

This should be done once we land the above PR.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED