Closed Bug 1822403 Opened 2 years ago Closed 2 years ago

Support Firefox Translations in Taskcluster

Categories

(Firefox Build System :: Task Configuration, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ahal, Assigned: bhearsum)

Attachments

(15 files, 2 obsolete files)

This will involve setting up Taskcluster with https://github.com/mozilla/firefox-translations-training, as well as creating a new GPU enabled pool for training the machine learning model on.

Pushed by ahalberstadt@mozilla.com: https://hg.mozilla.org/ci/ci-configuration/rev/59af005c9d32 Enable Taskcluster support for 'firefox-translations-training', r=releng-reviewers,bhearsum
Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/ci/ci-configuration/rev/a28863cf4fdd Enable Taskcluster support for 'firefox-translations-training', r=releng-reviewers,bhearsum
Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/ci/ci-configuration/rev/df2fc75570cf adjust trust domain for firefox translations project. r=releng-reviewers,gbrown
Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/ci/ci-configuration/rev/d1af5c067b0b add worker pools for translations trust domain. r=releng-reviewers,jcristau
Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/ci/ci-configuration/rev/e1af4baa291b add ci-config for firefox translations staging repo r=releng-reviewers,jcristau

Some of the toolchains we're building are already large enough to warrant larger workers, and we'll probably end up using these workers for some of the more CPU-intensive parts of the training pipeline.

Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/ci/ci-configuration/rev/705a9c5f84c0 add b-linux-large-gcp workers for translations trust domain r=releng-reviewers,ahal
Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/ci/ci-configuration/rev/a27dba81cf7f add treeherder reporting to firefox translations repos r=releng-reviewers,gbrown
Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/ci/ci-configuration/rev/b41c926fa820 generate action hooks for translations projects. r=releng-reviewers,ahal
Assignee: ahal → bhearsum

We're probably going to push my initial work to the main repo soon, and my hope is that we'll be able to further iterate in PRs. Given this, it will be important that PRs can't stomp on caches from on-push or action tasks.
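One way to picture the cache isolation concern above is a minimal sketch in which cache names embed the trust domain and the SCM level, so a level 1 task (such as one triggered by a PR) can never resolve to the same cache as a level 3 on-push or action task. The `cache_name` helper and the exact naming scheme are assumptions made for illustration, not the actual ci-configuration or Taskgraph implementation.

```python
def cache_name(trust_domain: str, level: int, name: str) -> str:
    """Build a level-scoped cache name.

    Hypothetical helper: the "<trust domain>-level-<level>-<name>" scheme
    is an assumption for illustration only.
    """
    if level not in (1, 2, 3):
        raise ValueError(f"unexpected SCM level: {level}")
    return f"{trust_domain}-level-{level}-{name}"


# A PR task (assumed to run at level 1) and an on-push task at level 3
# end up with different cache names, so neither can overwrite the other.
pr_cache = cache_name("translations", 1, "checkouts")
push_cache = cache_name("translations", 3, "checkouts")

print(pr_cache)
print(push_cache)
```

Under this assumed scheme, raising the main repository to level 3 is what separates its caches from those used by lower-level tasks.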

Attachment #9330407 - Attachment description: Bug 1822403: raise main translations repo to level 3 r?#releng-reviewers! → Bug 1822403: raise firefox-translations-training repo to level 3 r?#releng-reviewers!
Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/ci/ci-configuration/rev/b192402d7565 raise firefox-translations-training repo to level 3 r=releng-reviewers,jcristau

I'll add GPU workers after we stabilize them on level 1.

Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/ci/ci-configuration/rev/489125cae2b8 add worker pools for non-gpu translations level 3 workers r=releng-reviewers,gbrown,jcristau
Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/ci/ci-configuration/rev/7d5ed3aea4e2 revert translations GPUworker patch because of issues creating the new provider. r=releng-reviewers,gabriel

(In reply to Pulsebot from comment #22)

Pushed by bhearsum@mozilla.com:
https://hg.mozilla.org/ci/ci-configuration/rev/7d5ed3aea4e2
revert translations GPUworker patch because of issues creating the new
provider. r=releng-reviewers,gabriel

I backed this out due to this error when deploying:

Error: Identity and Access Management (IAM) API has not been used in project 559515877712 before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/iam.googleapis.com/overview?project=559515877712 then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.
    at Gaxios._request (/app/node_modules/gaxios/build/src/gaxios.js:129:23)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async JWT.requestAsync (/app/node_modules/google-auth-library/build/src/auth/oauth2client.js:368:18)
    at async GoogleProvider.setup (/app/services/worker-manager/src/providers/google.js:84:35)
    at async Providers.setupProvider (/app/services/worker-manager/src/providers/index.js:75:7)
Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/ci/ci-configuration/rev/93229a318e55 properly backout GPU workers. r=releng-reviewers,hneiva,jcristau
Attachment #9331009 - Attachment is obsolete: true
Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/ci/ci-configuration/rev/7a7d1880dd3d add level 3 hooks for firefox translations. r=releng-reviewers,gbrown

I see we usually only grant specific hooks, but in this case I think all are appropriate, as it would be good to allow him to cancel & rerun tasks as well.

Attachment #9335664 - Attachment description: WIP: Bug 1822403: grant anatal access to fire translations hooks → Bug 1822403: grant anatal access to fire translations hooks r?#releng-reviewers!
Pushed by bhearsum@mozilla.com: https://hg.mozilla.org/ci/ci-configuration/rev/04e53fbf19aa grant anatal access to fire translations hooks r=releng-reviewers,gbrown

Now that we have 2 people working on this, and multiple training tasks, we often bump up against the current maxCapacity of 2.
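The effect of a low capacity cap can be sketched as follows. This is a simplified model that assumes every task consumes exactly one unit of worker capacity; the `schedule` helper and the example numbers other than the cap of 2 are hypothetical.

```python
from typing import Tuple


def schedule(pending_tasks: int, max_capacity: int) -> Tuple[int, int]:
    """Return (running, waiting) for a pool capped at max_capacity.

    Simplified model: each task is assumed to use one unit of capacity.
    """
    if pending_tasks < 0 or max_capacity < 0:
        raise ValueError("counts must be non-negative")
    running = min(pending_tasks, max_capacity)
    return running, pending_tasks - running


# Five training tasks (a hypothetical number) against a cap of 2:
running, waiting = schedule(5, 2)
print(f"running={running} waiting={waiting}")
```

With a cap of 2, any additional training tasks queue until a worker frees up, which is the bottleneck that a higher cap relieves.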

Depends on D180375

Two reasons for doing this:

  1. To make sure multiple GPUs work with our Taskcluster pipeline
  2. To speed up development :)

Depends on D180376

Attachment #9338155 - Attachment is obsolete: true
Pushed by bhearsum@mozilla.com:
https://hg.mozilla.org/ci/ci-configuration/rev/e31a2f8fdaa9 Bump maxCapacity for translations GPU worker r=gabriel
https://hg.mozilla.org/ci/ci-configuration/rev/26656585b688 Add translations workers with more than 1 GPU r=gabriel
Blocks: 1844556

Our initial port of the pipeline is now complete and working in Taskcluster. There will definitely still be some follow-up fixes and improvements needed. I've filed bug 1844556 to track those in a central place.

Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED