Closed Bug 589251 Opened 14 years ago Closed 13 years ago

Parallelize update_verify

Categories

(Release Engineering :: General, defect, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: catlee, Assigned: bhearsum)

References

Details

(Whiteboard: [release][automation])

Attachments

(2 files, 5 obsolete files)

update_verify runs as a single job per platform.  We could speed this up by splitting it up into different jobs per locale.
Assignee: nobody → bhearsum
Blocks: 408453
I have a few ideas about how to do this, though only one of them seems realistic:
- Stick with 1 Builder per platform and create a Scheduler that creates one unmergeable BuildSet for each previous version, per platform. Doing this would make it impossible to detect doneness, and is generally dirty.
- Create 1 Builder per platform per previous version. We'd be able to detect doneness through a Dependent Scheduler. This would require that release.py is aware of the current version and previous versions, which is reconfig-dependent, and thus bad.

The only other idea I came up with is mochitest-style chunking. Doing this would give us a fixed number of update verify builders per platform, which means we can detect doneness, and it doesn't require release.py to know anything about the current release, so we're not reconfig dependent.

Within the last idea, there are a couple of variations. The simplest route is to just split the testing by version. For example, if we have 3 chunks and 4 previous versions, chunk #1 would test 2 of them, and chunks 2 and 3 would only test one. This could cause chunk #1 to take significantly longer than the others.

The slightly more complicated route is to split by version and locales. This would let us give all chunks more or less the same amount of work, and lower the overall time it takes to run update verify. I'm going to shoot for this unless it becomes more pain than it's worth.
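A minimal sketch of the two variations above, to make the difference in balance concrete. This is not the code from the patch: the function names (get_chunk, chunk_by_version_and_locale), the version numbers, and the locale lists are all made up for illustration.

```python
def get_chunk(items, total_chunks, this_chunk):
    """Return the slice of `items` belonging to 1-based chunk `this_chunk`.

    Earlier chunks absorb any remainder, so chunk sizes differ by at most one.
    """
    if not 1 <= this_chunk <= total_chunks:
        raise ValueError("this_chunk must be between 1 and total_chunks")
    base, extra = divmod(len(items), total_chunks)
    start = (this_chunk - 1) * base + min(this_chunk - 1, extra)
    end = start + base + (1 if this_chunk <= extra else 0)
    return items[start:end]


def chunk_by_version_and_locale(locales_by_version, total_chunks, this_chunk):
    """Flatten to (version, locale) pairs first, then chunk the pairs."""
    pairs = [
        (version, locale)
        for version in sorted(locales_by_version)
        for locale in locales_by_version[version]
    ]
    return get_chunk(pairs, total_chunks, this_chunk)


# Hypothetical input: 4 previous versions with 3 locales each.
previous = {
    "3.6.13": ["de", "en-US", "fr"],
    "3.6.14": ["de", "en-US", "fr"],
    "3.6.15": ["de", "en-US", "fr"],
    "3.6.16": ["de", "en-US", "fr"],
}

# Splitting by version only: one chunk gets two whole versions.
by_version = [get_chunk(sorted(previous), 3, n) for n in (1, 2, 3)]

# Splitting by version and locale: every chunk gets the same number of tests.
by_pair = [chunk_by_version_and_locale(previous, 3, n) for n in (1, 2, 3)]
```

With this input, splitting by version gives chunk sizes of 2, 1, and 1 versions (6, 3, and 3 individual tests), while splitting by version and locale gives 4 tests per chunk. Because the number of chunks is fixed, the set of builders stays the same no matter how many previous versions a release has.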
Working on this here: https://github.com/bhearsum/tools/commits/parallel-verify/
Priority: P5 → P2
This patch does a few things:
- Puts chunking logic in a generic function
- Adds an UpdateVerifyConfig class, capable of reading, writing, and generating chunks of a verify config.
- Adds a basic update verify wrapping script that does chunking

I spent a decent amount of time on UpdateVerifyConfig in hopes that it will be useful elsewhere. For example, it should make it easier to write additional update verify tests later on (e.g., to verify that PPC users don't get updates).

I've done some basic testing on this to verify that the correct config gets spit out -- at least on Linux.

Still needs more robust testing, and buildbotcustom integration.
Attachment #522665 - Flags: feedback?(catlee)
New in this version (bugfixes excluded):
- Use newly invented _RUNTIME tag to clone tools repo for update verify configs
- Write wrapper for python script

I explicitly decided not to run clobberer or purge builds because they aren't required AFAICT. Happy to add them back in if folks feel it's important.

This version has been through Buildbot on mac, windows, and 32-bit Linux, on the 1.9.2 branch.
Attachment #522665 - Attachment is obsolete: true
Attachment #523090 - Flags: feedback?(rail)
Attachment #523090 - Flags: feedback?(nrthomas)
Attachment #523090 - Flags: feedback?(catlee)
Attachment #522665 - Flags: feedback?(catlee)
Attached patch chunked verify, v3 (obsolete) — Splinter Review
This patch has been tested for normal update verify and major update cases across all 1.9.2 platforms. Changes since the last patch:
- Drop reclone of tools, because script_repo_revision is now set to _RUNTIME tag
- Grab release tag from different property, also because of the above
- Use mkstemp instead of NamedTemporaryFile because the latter doesn't support delete=False on versions we care about
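For context on the mkstemp point: the delete parameter of tempfile.NamedTemporaryFile was added in Python 2.6, so older interpreters cannot keep the file around after it is closed. A rough sketch of the mkstemp approach follows; the helper name write_temp_config and the ".cfg" suffix are assumptions for illustration, not taken from the patch.

```python
import os
import tempfile


def write_temp_config(contents):
    """Write `contents` to a temporary file that survives being closed.

    Returns the file's path. The caller is responsible for deleting it.
    """
    # mkstemp returns an already-open OS-level descriptor plus the path,
    # and never deletes the file on its own.
    fd, path = tempfile.mkstemp(suffix=".cfg")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(contents)
    except Exception:
        os.remove(path)
        raise
    return path
```

The returned path can then be handed to another process (for example, a shell script that reads the chunked config), with os.remove(path) called once that process is done.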
Attachment #523090 - Attachment is obsolete: true
Attachment #523644 - Flags: review?(nrthomas)
Attachment #523644 - Flags: review?(catlee)
Attachment #523090 - Flags: feedback?(rail)
Attachment #523090 - Flags: feedback?(nrthomas)
Attachment #523090 - Flags: feedback?(catlee)
- Tag tools repo with _RUNTIME tags during updates.
- Generalize chunking default to DEFAULT_PARALLELIZATION; l10n and update verify chunks are independently controllable through release config overrides
- Generalize function that creates ||ized builder names
- Use ScriptFactory to run update verify / major update
Attachment #523672 - Flags: review?(nrthomas)
Attachment #523672 - Flags: review?(catlee)
Comment on attachment 523644 [details] [diff] [review]
chunked verify, v3

Obsolete this because it needs a small bit of change to actually work with bug 408453.
Attachment #523644 - Attachment is obsolete: true
Attachment #523644 - Flags: review?(nrthomas)
Attachment #523644 - Flags: review?(catlee)
Attached patch v4 (obsolete) — Splinter Review
This version is the same as before, but uses build_id + from_path to identify a release, to make bug 408453 work. Adds a couple more tests, too.
Attachment #526088 - Flags: review?(nrthomas)
Attachment #526088 - Flags: review?(catlee)
Comment on attachment 523672 [details] [diff] [review]
buildbotcustom integration

This looked fine from 20000ft.
Attachment #523672 - Flags: review?(nrthomas) → review+
Comment on attachment 526088 [details] [diff] [review]
v4

Sorry Ben, I won't be able to look at this anytime soon.
Attachment #526088 - Flags: review?(nrthomas)
Attached patch v5, fix "None" vs. None bug (obsolete) — Splinter Review
I found a bug in my final tests of this patch that caused a Traceback in any case where "from" was set to None.
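The failing code isn't quoted in this bug, so the following only illustrates the general class of mistake suggested by the patch title: a Python None being written out as the text "None" and then read back as a real value. The key="value" line format and the helper names format_entry and parse_entry are assumptions made for this example.

```python
import shlex


def format_entry(entry):
    """Serialize a dict as space-separated key="value" pairs.

    Keys whose value is None are omitted, so the literal text "None"
    never ends up in the output.
    """
    parts = []
    for key in sorted(entry):
        value = entry[key]
        if value is None:
            continue
        parts.append('%s="%s"' % (key, value))
    return " ".join(parts)


def parse_entry(line):
    """Parse a line produced by format_entry back into a dict."""
    entry = {}
    # shlex.split honours the double quotes, so values may contain spaces.
    for token in shlex.split(line):
        key, _, value = token.partition("=")
        entry[key] = value
    return entry


# Naive formatting turns None into a truthy four-character string.
naive = 'from="%s"' % None

# Round-tripping through the helpers keeps "unset" as None.
line = format_entry({"release": "3.6.16", "from": None})
round_tripped = parse_entry(line).get("from")
```

Here naive is the string from="None", which a later truthiness check or path join would treat as a real path, whereas round_tripped is the None object.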
Attachment #526088 - Attachment is obsolete: true
Attachment #531699 - Flags: review?(catlee)
Attachment #526088 - Flags: review?(catlee)
Comment on attachment 531699 [details] [diff] [review]
v5, fix "None" vs. None bug

Damn, found another bug
Attachment #531699 - Attachment is obsolete: true
Attachment #531699 - Flags: review?(catlee)
The following bugs have been fixed since the last version:
- Make chunking work with filenames with spaces
- Greatly improve the chunking algorithm to make chunks more even in terms of work.

I did full update verify runs on the latest 1.9.1, 1.9.2, and 2.0. Most chunks were green, with the red ones fully explainable:
- 1.9.1 Mac, 3.5.18 - kn, ml, and te updates receiving updates to 3.5.19build1 (tracked here: https://bugzilla.mozilla.org/show_bug.cgi?id=629256#c47)
- 1.9.2 - All OK
- 2.0 - en-US "failed" because they point to 5.0b1 on betatest. That's OK, because that's where they're supposed to point.
Attachment #532955 - Flags: review?(catlee)
Attachment #523672 - Flags: review?(catlee) → review+
Attachment #532955 - Flags: review?(catlee) → review+
Comment on attachment 523672 [details] [diff] [review]
buildbotcustom integration

Landed on default of buildbotcustom: changeset:   1550:e00aed00ca16
Attachment #523672 - Flags: checked-in+
Comment on attachment 532955 [details] [diff] [review]
v6, more bugs fixed

Landed: changeset:   1447:edac506701d0
Attachment #532955 - Flags: checked-in+
This hit production.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering