Closed Bug 661585 Opened 11 years ago Closed 9 years ago

[tracking bug] Analyze and optimize setup time of unit test and talos jobs

Categories

(Release Engineering :: General, defect, P4)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: armenzg, Assigned: armenzg)

References

Details

(Whiteboard: [buildfaster:p2])

This takes care of reducing setup time.

See bug 659328 for reducing reboots and saving setup time.
OK, here is the list of builders that take more than 4 mins to set up (excluding tegras).
I will analyze them one by one and see where we get to.
As seen on the spreadsheet, setup time varies a lot depending on cleaning up previous runs, downloading builds and other steps (there is caching to help).

try_xp_test-tp4 
try_win7_test-tp4 
try_win7-debug_test-mochitest-other 
try_win7_test-scroll 
try_xp_test-chrome 
try_xp_test-mochitest-other 
try_win7_test-mochitest-other 
try_xp-debug_test-mochitest-other 
try_win7_test-jsreftest 
try_win7-debug_test-jsreftest 
try_win7_test-chrome 
try_leopard_test-dirty 
try_snowleopard_test-dirty 
try_win7_test-nochrome 
try_snowleopard_test-paint 
try_xp_test-jsreftest 
try_snowleopard_test-tp4 
try_win7-debug_test-reftest 
try_win7-debug_test-crashtest 
try_snowleopard_test-mochitests-2 
try_snowleopard_test-mochitest-other 
try_snowleopard_test-mochitests-3 
try_xp-debug_test-reftest 
try_xp_test-crashtest 
try_snowleopard_test-scroll 
try_leopard_test-jsreftest 
try_win7_test-dirty 
try_leopard_test-mochitest-other
We should not have to download/unpack pagesets.zip (~3mins on XP) since it doesn't really change.
> wget --progress=dot:mega -N http://url.com/to/pageset.zip

On snowleopard unpacking the build takes a long time (~1min - less than 5 secs on Fedora).
Perhaps we could upload a tar ball instead of a dmg?
Or try to optimize "bash ../tools/buildfarm/utils/installdmg.sh firefox-6.0a1.en-US.mac64.dmg"
Note that we use two different ways of unpacking dmg files:
Talos jobs use ~/talos-slave/talos-data/installdmg.sh while unit tests use ../tools/buildfarm/utils/installdmg.sh

For mochitest-other we run the following command several times:
> unzip -o firefox-7.0a1.en-US.linux-i686.tests.zip 'bin*' 'certs*' 'mochitest*'
Not that it seems to take a long time (the first run takes more than 10 secs and the following ones about 2 secs).
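
If the repeated unzip ever became worth avoiding, one option would be to unpack the needed directories once and skip the later calls. A rough Python sketch, assuming the destination directory is cleaned for every new build; `wanted_members`, `extract_once` and the `.tests-unpacked` marker file are hypothetical names, not something the current steps use:

```python
import os
import zipfile

# Same three patterns as the unzip command above ('bin*' 'certs*' 'mochitest*').
PREFIXES = ("bin", "certs", "mochitest")


def wanted_members(names, prefixes=PREFIXES):
    """Return the archive member names that start with one of the prefixes."""
    return [name for name in names if name.startswith(tuple(prefixes))]


def extract_once(zip_path, dest_dir, prefixes=PREFIXES):
    """Extract the wanted members unless a previous call already did so.

    Returns True if files were extracted, False if the marker was found.
    Assumption: dest_dir is wiped whenever a new tests zip is downloaded,
    so a stale marker never survives across builds.
    """
    os.makedirs(dest_dir, exist_ok=True)
    marker = os.path.join(dest_dir, ".tests-unpacked")  # hypothetical marker
    if os.path.exists(marker):
        return False
    with zipfile.ZipFile(zip_path) as archive:
        members = wanted_members(archive.namelist(), prefixes)
        archive.extractall(dest_dir, members=members)
    with open(marker, "w"):
        pass  # an empty file is enough to record that unpacking happened
    return True
```

Given the timings above, the gain would be a few seconds per job at most, so this is low priority compared to the other items.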

Cloning tools takes 20 to 40 secs.

chmod_files on windows 7 (not xp) takes a lot of time (bug 544727):
> chmod_files chmod files (see msys bug) ( 1 mins, 58 secs ) 
The same for "removing any old dir"
> remove any old working dirs remove old working dirs ( 1 mins, 53 secs )

We should also see if we can improve download time (from staging to scl1), as in some cases it can take a long time:
> Download build download ( 1 mins, 18 secs ) 
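
To compare builders it helps to turn these step summaries into seconds. A small Python sketch that parses the `( N mins, M secs )` suffix shown in the lines above; the function names are made up for illustration, and handling of a seconds-only suffix such as `( 45 secs )` is an assumption about the log format:

```python
import re

# Matches a trailing "( 1 mins, 58 secs )"; the minutes part is optional
# (assumption: short steps are reported with seconds only).
_DURATION = re.compile(r"\(\s*(?:(\d+)\s*mins?,\s*)?(\d+)\s*secs?\s*\)\s*$")


def step_seconds(line):
    """Return the duration of one step summary line in seconds, or None."""
    match = _DURATION.search(line)
    if match is None:
        return None
    minutes = int(match.group(1)) if match.group(1) else 0
    return minutes * 60 + int(match.group(2))


def total_setup_seconds(lines):
    """Sum the durations of all lines that carry a parsable duration."""
    return sum(s for s in map(step_seconds, lines) if s is not None)


steps = [
    "chmod_files chmod files (see msys bug) ( 1 mins, 58 secs ) ",
    "remove any old working dirs remove old working dirs ( 1 mins, 53 secs )",
    "Download build download ( 1 mins, 18 secs ) ",
]
print(total_setup_seconds(steps))  # 118 + 113 + 78 = 309
```

For the three steps quoted here that is just over 5 minutes spent before any test runs.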

We should also look into avoiding the symbols download when it is not necessary.

We should only download talos.zip if newer as it does not get refreshed so often:
> wget --progress=dot:mega -N http://build.mozilla.org/talos/zips/talos.zip
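
Note that `-N` only helps if the previous copy is still on disk when wget runs, so the zip would have to live somewhere that is not wiped with the working dirs. A minimal Python sketch of the same "only if newer" idea using an HTTP `If-Modified-Since` request; the function names are hypothetical, and where the cached file lives is an assumption left to the caller:

```python
import os
import urllib.error
import urllib.request
from email.utils import formatdate


def http_date(mtime):
    """Format a POSIX timestamp as an HTTP-date (always GMT)."""
    return formatdate(mtime, usegmt=True)


def if_modified_since_header(local_path):
    """Return the header value for the cached copy, or None if there is none."""
    if not os.path.exists(local_path):
        return None
    return http_date(os.path.getmtime(local_path))


def fetch_if_newer(url, local_path):
    """Download url to local_path unless the server answers 304 Not Modified.

    Returns True if a new copy was written, False if the cached one was kept.
    Assumption: local_path is outside the per-job working directory.
    """
    request = urllib.request.Request(url)
    stamp = if_modified_since_header(local_path)
    if stamp is not None:
        request.add_header("If-Modified-Since", stamp)
    try:
        with urllib.request.urlopen(request) as response:
            data = response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:  # cached copy is still current
            return False
        raise
    with open(local_path, "wb") as out:
        out.write(data)
    return True
```

The same approach would apply to the pageset zip mentioned earlier, as long as the server sends usable modification times.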

It seems that there are a bunch of places for things to be improved.

I wonder if we should put all the efforts in bug 650880 and bug 650887 rather than fixing each issue individually.
(In reply to comment #2)
> We should not have to download/unpack pagesets.zip (~3mins on XP) since it
> doesn't really change.
> > wget --progress=dot:mega -N http://url.com/to/pageset.zip

Downloading the pageset on every run was a decision made because of how difficult it was to update it on the machines. Now that we have Puppet on the POSIX machines and OPSI on XP, I think keeping it on the machine is the right call.

> On snowleopard unpacking the build takes a long time (~1min - less than 5
> secs on Fedora).
> Perhaps we could upload a tar ball instead of a dmg?

I'm not a fan of this. I think we should be testing the exact things that we ship to users.
Depends on: 561235
Depends on: 661649
Depends on: 661656
Priority: -- → P3
(In reply to comment #2)
> On snowleopard unpacking the build takes a long time (~1min - less than 5
> secs on Fedora).
> Perhaps we could upload a tar ball instead of a dmg?
> Or try to optimize "bash ../tools/buildfarm/utils/installdmg.sh
> firefox-6.0a1.en-US.mac64.dmg"
> Note that we use two different ways of unpacking dmg files:
> Talos jobs use ~/talos-slave/talos-data/installdmg.sh while unit tests use
> ../tools/buildfarm/utils/installdmg.sh
Where is the talos installdmg.sh found?  I can't seem to find it to see if there is a difference between them.  Do we see a discrepancy between the time it takes to unpack a dmg on a talos run versus a unittest run?

I'm also with bhearsum, I don't think we want to shift to a tar on mac.  We want to stay with dmg.

The buildfarm installdmg.sh has a 5 second sleep in it.  That could be eliminated because there really isn't any point in waiting for the disk-image helpers to go away.  They don't hurt anything, they just disappear.  And the rest of the code looks correct in that you trap and ensure you call detach even if there's an error (this is very good, because too many diskimage-helpers destroy a machine's resources).
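
To illustrate the pattern described above (always detach, no sleep afterwards), here is a rough Python equivalent. It is only a sketch and not the contents of installdmg.sh: it assumes the image is mounted with `hdiutil attach` and released with `hdiutil detach`, and `mounted_dmg` and its `run` parameter are hypothetical names:

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def mounted_dmg(dmg_path, mountpoint, run=subprocess.check_call):
    """Mount a dmg for the duration of a with-block and always detach it.

    `run` is injectable so the flow can be exercised without hdiutil.
    """
    run(["hdiutil", "attach", dmg_path, "-mountpoint", mountpoint, "-nobrowse"])
    try:
        yield mountpoint
    finally:
        # Runs on success and on error, like the trap in the shell script.
        # No sleep afterwards: nothing here waits for helper processes.
        run(["hdiutil", "detach", mountpoint])
```

Usage would look like `with mounted_dmg("firefox-6.0a1.en-US.mac64.dmg", "/tmp/mnt") as mnt:` followed by copying the app bundle out of `mnt`.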
Depends on: 590969
Depends on: 659328
Summary: Analyze and optimize setup time of unit test and talos jobs → [tracking bug] Analyze and optimize setup time of unit test and talos jobs
Priority: P3 → P4
Depends on: 586418
Depends on: 583129
Depends on: 595237
Status: NEW → ASSIGNED
OS: Mac OS X → All
Hardware: x86 → All
Whiteboard: [buildfaster:pN]
Whiteboard: [buildfaster:pN] → [buildfaster:p2]
After bug 661656 got resolved we have faster transfer between stage.m.o and the test slaves on the scl colo.
Priority: P4 → P2
Priority: P2 → P3
Priority: P3 → P4
Nothing left on this specific bug. Other bugs have been opened for similar issues.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering