Closed Bug 552723 (Opened 14 years ago, Closed 14 years ago)

Remove L10N_SLAVES cap and use pick most recent slave plus a MAX_NUMBER of slaves doing repacks

Categories

(Release Engineering :: General, defect, P4)

x86
macOS
defect

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 539588

People

(Reporter: armenzg, Unassigned)

Details

(Whiteboard: [l10n])

Windows repacks are taking too long.
Around 20 minutes on a fresh checkout and 11 minutes after a repack has already been done on that slave (this is totally related to the network issues we are currently having).

Nevertheless, it would be worth getting some numbers on how long a repack normally takes and thinking about which steps could be shortened or trimmed by a few seconds.
Another thought is to use a larger number of slaves than L10N_SLAVES, so that we get more slaves working than the usual nightly repacks do.
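
(A minimal sketch of that idea in buildbot-configs-style Python. L10N_SLAVES' real shape may differ, and RELEASE_L10N_CAP plus both values here are invented for illustration:)

# Illustrative sketch only: let release repacks draw on a larger slave
# pool than nightlies. L10N_SLAVES and RELEASE_L10N_CAP are hypothetical
# names, not the real buildbot-configs variables.
L10N_SLAVES = 4        # cap nightlies use today (value made up)
RELEASE_L10N_CAP = 12  # larger cap for release repacks (value made up)

def l10n_slavenames(all_platform_slaves, is_release=False):
    """Return the slaves allowed to take l10n repack jobs."""
    cap = RELEASE_L10N_CAP if is_release else L10N_SLAVES
    return all_platform_slaves[:cap]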
(In reply to comment #0)
> Windows repacks are taking too long.
> Around 20 minutes on fresh checkout and 11 minutes after having done a repack
> on that slave (This is totally related to the network issues we are currently
> having).

These slaves are not in Castro, so they should not be affected by the network issues we are having. This means they really are this slow.

Below are the times for a release repack (taken on a slave where a first repack had already happened). The total time was 8 mins 58 secs:
NOTE: I am just tracking times for reference.
* we always clobber build tools: ~7-10 secs
* get_enUS: ~5 secs
* get_locale_src: ~12 secs
* we always clobber buildbot-configs: ~7-10 secs
* configure: ~1 min 38 secs
* we always clobber compare-locales: ~2 secs

Steps that always run for each locale:
* compare-locales: <2 secs
* make installers: ~3 mins 23 secs
* make upload: ~3 mins

Even if we came up with a solution to avoid running the unneeded steps, we would only save somewhere between 2 and 4 minutes (the setup steps listed above add up to roughly 2 mins 15 secs on their own).

I believe the best solution is, for releases, to use a larger set of slaves than L10N_SLAVES, the set used by nightly repacks.

NOTE: The times are extracted from looking at:
http://production-master.build.mozilla.org:8010/builders/win32_repack/builds/2023
(In reply to comment #2)
> Even if we came up with a solution to avoid running the un-needed steps we
> would only be saving something between 2mins and 4mins.

Yeah, this is probably a pretty easy way to speed us up.

> I believe the best solution is that for releases we use a larger set of slaves
> than L10N_SLAVES which is used by nightly repacks.

I don't think we necessarily win by doing this. According to your numbers, we sink 9 minutes into the first build on a slave.

It seems that our main problem here is that we wait, sometimes a long time, for slaves to become free to do repacks. Maybe the time has come to get rid of L10N_SLAVES altogether and instead set a cap on how many slaves can do l10n on a given platform at a time. Combined with Catlee's "pick the most recent slave" optimization, this could let us eliminate the waiting-for-a-slave time with no downside.
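
(Roughly what that could look like in buildbot terms: a MasterLock with maxCount caps concurrent l10n jobs per platform, and a nextSlave callable prefers the slave that built most recently. The lock name, cap value, and the last_build_time bookkeeping are assumptions for illustration, not an actual patch:)

from buildbot import locks

# Cap how many slaves can do l10n for this platform at a time, instead
# of dedicating a fixed L10N_SLAVES list. Name and count are illustrative.
win32_l10n_lock = locks.MasterLock("win32_l10n", maxCount=8)

# Hypothetical bookkeeping: slave name -> timestamp of its last repack here.
last_build_time = {}

def pick_most_recent(builder, available_slave_builders):
    """nextSlave callable: prefer the slave with the warmest checkout.

    buildbot calls this with the SlaveBuilders that are free right now;
    returning one of them assigns the build, returning None waits.
    """
    if not available_slave_builders:
        return None
    return max(available_slave_builders,
               key=lambda sb: last_build_time.get(sb.slave.slavename, 0))

A builder would then pass nextSlave=pick_most_recent and locks=[win32_l10n_lock.access('counting')] in its BuilderConfig.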
(In reply to comment #3)
> It seems that our main problem here is that we wait, sometimes a long time, for
> slaves to become free to do repacks. Maybe the time has come to get rid of
> L10N_SLAVES altogether and instead, set a cap on how many slaves can do l10n on
> a given platform, at a time. Combined with Catlee's "pick the most recent
> slave" optimization this could allow us to get rid of waiting for slave times
> with no downside.

This sounds good. I would also love to see the ix machines picking these jobs up.
Summary: Speed up Windows repacks for releases → Remove L10N_SLAVES cap and use pick most recent slave plus a MAX_NUMBER of slaves doing repacks
Fixed by bug 539588 (attachment 450233), I believe. We still have a cap, but it's based on which slaves are currently connected.
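
(For illustration only, a connection-based cap could look like the sketch below; the 50% ratio and the function name are invented here, not what attachment 450233 actually does:)

# Illustration only: derive the repack cap from the slaves currently
# connected, rather than from a static list. The 50% ratio is made up.
def l10n_cap(connected_slaves, ratio=0.5):
    """Let at most `ratio` of the connected slaves do repacks at once."""
    return max(1, int(len(connected_slaves) * ratio))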
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → DUPLICATE
Product: mozilla.org → Release Engineering