Closed Bug 550383 Opened 14 years ago Closed 14 years ago

Unittest sendchange should use explicit tests archive

Categories

(Release Engineering :: General, defect)

x86
Windows 7
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jhford, Assigned: jhford)

References

Details

Attachments

(1 file, 9 obsolete files)

25.13 KB, patch
catlee
: review+
jhford
: checked-in+
Details | Diff | Splinter Review
While testing unit tests on the windows talos ref image, I had trouble getting tar to run on the talos machines.  We already use unzip for extracting the symbols archive on talos machines.  This bug is to track the automation changes required to support using zips for unit tests.

An alternative to this would be to find a working tar+bzip2 that we can deploy to the talos machines.  I tried gnuwin32 and the cygwin gnutar + bzip.  Gnuwin32 bzip2 worked but neither their BSD tar nor Gnu tar worked.  I tried to copy the cygwin binary + cygwin1.dll but there were a lot of missing dependent libraries.

This would likely require a downtime.

I am willing to wrangle the patches/downtime after i have Maemo5 builds + n900 talos/unit tests but that might be a little while out.
Blocks: 548768
Depends on: 549427
How much work is there to do here?  UnittestPackagedBuildFactory uses UnpackFile, which uses the appropriate command to unpack depending on the filename you give it.
we currently derive the packaged tests archive through file name manipulation of the main firefox package name.  A short downtime would be the quickest way to get this into production, but requires us to have the branch patches land at the same time as the downtime, or else we would have to monitor for red tests caused by the tarball download failing and requeue them.  Alternatively, we could start sending the test archive name in the sendchange.
Assignee: nobody → jhford
Attached patch automation-zips.diff (obsolete) — Splinter Review
This patch does a hard cut over from .tar.bz2.  A deployment plan that allows us to not restart (I am pretty sure) is if we:
-close all branches
-land build-sys patch (will require someone who can land on the branches)
-wait for all recent checkins to start their builds using the factories that have .tar.bz2
-land automation patch & reconfig masters
-open tree again

Depending on how careful we wanted to be, we could do test builds before re-opening the trees.

Alternatively, if we are going to have a downtime soon, we could just land all these patches during that downtime.
Attachment #431782 - Flags: review?(catlee)
Attachment #431782 - Flags: review?(bhearsum)
Attachment #431782 - Attachment is obsolete: true
Attachment #431782 - Flags: review?(catlee)
Attachment #431782 - Flags: review?(bhearsum)
Comment on attachment 431782 [details] [diff] [review]
automation-zips.diff

new patch coming soon
instead of only changing what the package unit test build factory looks for by default, we should actively check the output of make upload for the explicit package location.  The default is also updated to be .tests.zip.

This patch modifies the script for getting properties from the output of make upload.  It also sends the tests package as the second file in the sendchange.
Attachment #432667 - Flags: review?(catlee)
Attachment #432667 - Flags: review?(bhearsum)
Comment on attachment 432667 [details] [diff] [review]
automation changes for zip file tests package

>diff --git a/process/factory.py b/process/factory.py
>--- a/process/factory.py
>+++ b/process/factory.py
>@@ -1272,31 +1272,33 @@
>                           '-p %s' % self.productName,
>                           '--release-to-tinderbox-dated-builds']
>         if self.nightly:
>             # If this is a nightly build also place them in the latest and
>             # dated directories in nightly/
>             postUploadCmd += ['-b %s' % self.branchName,
>                               '--release-to-latest',
>                               '--release-to-dated']
> 
>         uploadEnv['POST_UPLOAD_CMD'] = WithProperties(' '.join(postUploadCmd))
> 
>         def get_url(rc, stdout, stderr):
>+            retval = {}
>             for m in re.findall("^(http://.*?\.(?:tar\.bz2|dmg|zip))", "\n".join([stdout, stderr]), re.M):
>                 if m.endswith("crashreporter-symbols.zip"):
>-                    continue
>-                if m.endswith("tests.tar.bz2"):
>-                    continue
>-                return {'packageUrl': m}
>-            return {}
>+                    retval['symbolsUrl'] = m
>+                elif m.endswith("tests.tar.bz2") || m.endswith("tests.zip"):

|| isn't valid Python.  Please test your patches before submitting for review.
Attachment #432667 - Flags: review?(catlee) → review-
Attachment #432667 - Flags: review?(bhearsum)
Comment on attachment 432667 [details] [diff] [review]
automation changes for zip file tests package

Yes, please test before I review this
If you just need a win32 "tar" to avoid all this work, 7zip and UnxUtils are both pretty lightweight and free for commercial use.

1. UnxUtils tar.exe uses only msvcrt.dll. This project is well-known, and stable/old since 2003. The only troublesome bugs I see at http://sourceforge.net/tracker/?group_id=9328&atid=109328 are: holding locks on failure, and lowercasing files. I tested lowercasing, and do not see that bug.

2. 7zip is of course actively developed. The tar command is
  7z a dir.tar dir
(In reply to comment #8)
> If you just need a win32 "tar" to avoid all this work, 7zip and UnxUtils are
> both pretty lightweight and free for commercial use.
> 
> 1. UnxUtils tar.exe uses only msvcrt.dll. This project is well-known, and
> stable/old since 2003. The only troublesome bugs I see at
> http://sourceforge.net/tracker/?group_id=9328&atid=109328 are: holding locks on
> failure, and lowercasing files. I tested lowercasing, and do not see that bug.
> 
> 2. 7zip is of course actively developed. The tar command is
>   7z a dir.tar dir

Did you try either of these, John?
I will test this patch today.  If it works and gets r? Armen has offered to land this during tomorrow's downtime (thanks!)
Attachment #432667 - Attachment is obsolete: true
Comment on attachment 433363 [details] [diff] [review]
automation changes for zip file tests package

The only update is chaging || into 'or'.  That was a transcription error on my behalf (was working over vnc).  This patch also makes it realistic to make the third file in the sendchange be the symbolsUrl as we now get that property from the make upload step output.


master: localhost:9010
branch: mozilla-central-linux-opt-unittest
revision: 783ef71b479ee93d17ae3abcb0b909f346d129a9
comments: 
user: sendchange-unittest
files: ['http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1268947441/firefox-3.7a4pre.en-US.linux-i686.tar.bz2', 'http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1268947441/firefox-3.7a4pre.en-US.linux-i686.tests.tar.bz2']
Attachment #433363 - Flags: review?(catlee)
Comment on attachment 433363 [details] [diff] [review]
automation changes for zip file tests package

Please set your .hgrc to show function names.

FTR, this is in doUpload for NightlyBuildFactory
>         def get_url(rc, stdout, stderr):
>+            retval = {}
...
>-             files=[WithProperties('%(packageUrl)s')],
>+             files=[WithProperties('%(packageUrl)s'),
>+                    WithProperties('%(testsUrl)s'),
>+             ],
>              user="sendchange-unittest")

This looks fine to me, as do the exclusions in the release factories.

This is in __init__ for UnittestPackagedBuildFactory
>         def get_testURL(build):
...
>                     if fileURL.endswith(suffix):
>-                        testsURL = fileURL[:-len(suffix)] + '.tests.tar.bz2'
>+                        testsURL = fileURL[:-len(suffix)] + '.tests.zip'

The transition is a bit complicated so here's what I think will happen, assuming that the package naming change lands on m-c, 1.9.2, and 1.9.1, and _after_ the masters get reconfig'd:
 * m-c, project branches, and 1.9.2 will work because we start sendchanging the testsUrl
 * 1.9.1 will work, despite using an unmodified UnittestBuildFactory, because the default change in UnittestPackagedBuildFactory. Similar for the geriatric boxes
 * we will lose tests for the 3.5.9 and 3.6.2 releases if we need to respin, because the default is now wrong for those relbranches and there is no change to the sendchange in ReleaseBuildFactory's doUpload

Does that sound right ? If you also modify the sendchange in ReleaseBuildFactory then the default change would be OK.

I also think we have an obligation to communicate the change clearly with Thunderbird and SeaMonkey, because the package name change is going to cause them bustage.
Attachment #433363 - Flags: review?(nrthomas) → review-
this bug is moving in a direction that makes it possible to land before bug 549427 lands.  Changing summary to match this.
No longer depends on: 549427
Summary: Change automation to use zip files for tests archive → Unittest sendchange should use explicit tests archive
*release build factory and unittest build factory now gather properties and do send change properly as well

*default of .tests.tar.bz2 is left in tact.  this will become invalid for windows unittests once we turn off unittests on build machines

*i found a different of behavior between the first patch and this patch.  In the existing code, a default packageUrl of '' is used.  In my first version patch, i would return a dict without that key.  In the new patch I have matched the old behavior by defaulting the packageUrl key to '' before trying to set properties.

This patch has not been tested further and I don't have time to do the actual test.  This is in support of a Q1 goal and would be great if someone could test and land it if it works.
Attachment #433363 - Attachment is obsolete: true
Attachment #434387 - Flags: review?(nrthomas)
Comment on attachment 434387 [details] [diff] [review]
automation changes for zip file tests package

I"m not going to have a chance to look at this for a couple of days. Feel free to pass the r? and testing request on if someone else has cycles.
Comment on attachment 434387 [details] [diff] [review]
automation changes for zip file tests package

>diff --git a/process/factory.py b/process/factory.py
>@@ -4083,28 +4090,34 @@ class UnittestBuildFactory(MozillaBuildF
>             def get_url(rc, stdout, stderr):
>+                for m in ("^(http://.*?\.(tar\.bz2|dmg|zip))", "\n".join([stdout, stderr]), re.M):               

Not gonna work, and then we don't sendchange the testsUrl onwards.

>@@ -4794,24 +4807,25 @@ class L10nVerifyFactory(ReleaseFactory):
>          name='download_current_release',
>+                  '--exclude=*.zip',
>          name='download_previous_release',
>+                  '--exclude=*.tests.zip',

Testing another patch now.
Attachment #434387 - Flags: review?(nrthomas) → review-
Attached patch The Next Iteration (made it so!) (obsolete) — Splinter Review
Known to work with this:
* mozilla-central linux build, sendchanges, unit tests
* release linux build, sendchanges, unit tests
* confirmed the packaged test factory is really paying attention by giving bogus test urls in a commandline sendchange. Get a 404 trying to d/l test package

But there's a problem with the unit test build on 1.9.1 when it sendchanges. I was using windows, and the sendchange looked like this on the build side:
    master: localhost:9010
    branch: mozilla-1.9.1-win32-unittest
    revision: 75b15d431177
    comments: 
    user: sendchange-unittest
    files: ['http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-1.9.1-win32-unittest/1269853315/firefox-3.5.10pre.en-US.win32.zip', 'http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-1.9.1-win32-unittest/1269853315/firefox-3.5.10pre.en-US.win32.tests.tar.bz2']
It looks as expected - binary first, tests second. But the sendchange as reported on a unit test suite job has the file order reversed (ie tests first), which screws up the factory no end. Is there some "fixup" code somewhere doing this ? Conceivable that win32 would do this on mozilla-central to, haven't tested that.
(In reply to comment #18)
> It looks as expected - binary first, tests second. But the sendchange as
> reported on a unit test suite job has the file order reversed (ie tests first),
> which screws up the factory no end. Is there some "fixup" code somewhere doing
> this ? Conceivable that win32 would do this on mozilla-central to, haven't
> tested that.

If this is the case, we could start prefixing the file URL location with what the file is.  If order is not preserved, then we can't rely on it.  My idea is to do a sendchange with something like: 

['BROWSER@http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-1.9.1-win32-unittest/1269853315/firefox-3.5.10pre.en-US.win32.zip',
'TESTS@http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-1.9.1-win32-unittest/1269853315/firefox-3.5.10pre.en-US.win32.tests.tar.bz2']

This would be extra-explicit in telling the unittestpackagedbuildfactory where each file is and what it is.  My idea comes in the form of a patch that only deals with NightlyBuildFactory.  I have not done any testing on this patch though.

This would also be interesting for mobile testing as I would like to work on using this factory for doing on-device testing runs at some point in the future. (fennec browser tarballs are very different in name to xulrunner tests).
the relevant test-side changes are:

for the browser archive:
         def get_fileURL(build):
             fileURL = build.source.changes[-1].files[0]
-            return fileURL
+            if fileURL.startswith('BROWSER@'):
+                return fileURL[len('BROWSER@'):]
+            else:
+                return fileURL #this case may be broken on windows


             # use that as the test harness to download
             if len(build.source.changes[-1].files) > 1:
+                if testURL.startswith('TESTS@'):
+                    return testURL[len('TESTS@'):]
+                else:
+                    return testURL #This case might be broken on windows
                 testURL = build.source.changes[-1].files[1]
                 return testURL
             else:
I didn't update the testsUrl section properly.  The updated patch is attached (and overriding) the other one.

             if len(build.source.changes[-1].files) > 1:
+                if testURL.startswith('TESTS@'):
+                    return testURL[len('TESTS@'):]
+                else:
+                    return testURL #This case might be broken on windows
                 testURL = build.source.changes[-1].files[1]
                 return testURL

should read

             if len(build.source.changes[-1].files) > 1:
                 testURL = build.source.changes[-1].files[1]
+                if testURL.startswith('TESTS@'):
+                    return testURL[len('TESTS@'):]
+                else:
+                    return testURL #This case might be broken on windows
                 return testURL
Attachment #435633 - Attachment is obsolete: true
Urgh, looks like buildbot sorts the filenames on change objects, so .tests.tar.bz2 sorts before .zip.

The BROWSER@ TESTS@ prefixes would work, but could we look for an appropriate filename in each of the get_*URL functions?  We have to examine every file on the changes object anyway if there is more than one.
Comment on attachment 435634 [details] [diff] [review]
idea for explicit type/location sendchanges based on nick's updated patch

actually, this patch is not valid as it still makes assumptions about ordering on the test running side.  We would have to use the getUrl functions to iterate through the sendchange files list and only return things that start with.
(In reply to comment #22)
> Urgh, looks like buildbot sorts the filenames on change objects, so
> .tests.tar.bz2 sorts before .zip.
> 
> The BROWSER@ TESTS@ prefixes would work, but could we look for an appropriate
> filename in each of the get_*URL functions?  We have to examine every file on
> the changes object anyway if there is more than one.

yes, we do have to look at all the files, but unless download+unpack file steps are able to deal with multiple files, we would still only look at a set number of files.  In that case, we could look at all the files, and decide which one is most browser-like and which is most test-like, and use those.  Adding the prefix changes our dependence on ordering to dependence on having a prefix if there are multiple files.  This idea is for fixing the problem of file ordering, not having multiple test archives or multiple browser archives.  If we want to handle multiple files of each type we should file another bug?
(In reply to comment #24)
> (In reply to comment #22)
> > Urgh, looks like buildbot sorts the filenames on change objects, so
> > .tests.tar.bz2 sorts before .zip.
> > 
> > The BROWSER@ TESTS@ prefixes would work, but could we look for an appropriate
> > filename in each of the get_*URL functions?  We have to examine every file on
> > the changes object anyway if there is more than one.
> 
> yes, we do have to look at all the files, but unless download+unpack file steps
> are able to deal with multiple files, we would still only look at a set number
> of files.  In that case, we could look at all the files, and decide which one
> is most browser-like and which is most test-like, and use those.  Adding the
> prefix changes our dependence on ordering to dependence on having a prefix if
> there are multiple files.  This idea is for fixing the problem of file
> ordering, not having multiple test archives or multiple browser archives.  If
> we want to handle multiple files of each type we should file another bug?

Right.  All I'm saying is that if len(change.files) > 1, then you need to look at _all_ of them to decide which one interests you, since the ordering isn't preserved.

And since we're looking at the whole list of files, I'd rather keep the files as simple urls and pull out the appropriate one based on its name than add a prefix.
(In reply to comment #25)
> Right.  All I'm saying is that if len(change.files) > 1, then you need to look
> at _all_ of them to decide which one interests you, since the ordering isn't
> preserved.

Sure,  i guess i put in comment 23 before i saw comment 22.  If that is the case, we are saying the same thing I think and I apologize for misunderstanding your comments :)

> And since we're looking at the whole list of files, I'd rather keep the files
> as simple urls and pull out the appropriate one based on its name than add a
> prefix.

I think there is value in having the sendchange source decide which file is the browser, the tests and the symbols as the build factory understand's its artifacts more than the test running factory should.  This allows us to deal with situations like Fennec where the brower is something like fennec-1.1a1pre.....tar.bz2 and the tests are xulrunner-1.9.2.2pre-...tests.tar.bz2.  If we have the test running factory decide.  Even with the xulrunner/mobile case above, I guess we could just search for *.tests.* in each file and figure it out that way.
Attached file [wip] updated patch (obsolete) —
updated WIP patch
Attachment #434387 - Attachment is obsolete: true
Attachment #435564 - Attachment is obsolete: true
Attachment #435634 - Attachment is obsolete: true
Attached patch buildbotcustom changes (obsolete) — Splinter Review
update patch with build and test side implementation to allow for proper multi-file sendchanges.  This patch uses substring matching on the test running side to lower the performance impact, but regular expressions would work well too.  

The changes to the DownloadFile step are entirely optional.  The reason I made them is to deal with exceptions by putting errors in the log.  I think not handling the exception is the way to go, especially with the exceptions being mailed to us.  

I have tested that this still works with a single file sendchange as well as mutli file send changes (2 and 3 files) and all situations work.  I have only added the tests archive to the sendchange because we don't always generate symbols.  I could generate the list outside of the SendChangeStep, but that isn't what I have tested and isn't needed for this to land properly.
Attachment #437124 - Attachment is obsolete: true
Attachment #437377 - Flags: review?(catlee)
Comment on attachment 437377 [details] [diff] [review]
buildbotcustom changes

>+            retval = {'packageUrl': ''}

I think this should default to returning {}.  Same with all the other instances
of this code.

Maybe refactor out all the get_url functions into one if they're all the same?

>+        def get_url(build, include_substr='', exclude_substrs=[]):
>+            '''Given a build object, figure out which files have the include_substr
>+            in them, then exclude files that have one of the exclude_substrs. This
>+            function uses substring pattern matching instead of regular expressions
>+            as it meets the need without incurring as much overhead.'''
>+            potential_files=[]
>+            for file in build.source.changes[-1].files:
>+                if include_substr in file:
>+                    potential_files.append(file)
>+            #Not sure if I should assert or use an if ...: raise ValueError as was 
>+            #done in get_*URL functions
>+            #if len(potential_files) is 0:
>+            #    raise ValueError("No files in sendchange are valid")

Please remove this if it's going to be commented out.

>+            assert len(potential_files) > 0, 'No files in unittest sendchange are valid'
>+            for substring in exclude_substrs:
>+                for f in potential_files:
>+                    if substring in f:
>+                        while f in potential_files: #this check might be overkill
>+                            potential_files.remove(f)
>+            for file in potential_files[1:]: #dupes aren't ambiguous
>+                if file == potential_files[0]:
>+                    potential_files.remove(file)
>+            assert len(potential_files) == 1, 'Ambiguous unittest sendchange!'
>+            #if len(potential_files) is not 1:
>+            #    raise ValueError("Ambiguous sendchange file list!")

Remove these lines too.

I think this function could be simplified a bunch if you prevented duplicates
from being inserted into potential_files in the first place.

>         else:
>-            self.setCommand(["wget"] + self.wget_args + ["-N", url])
>+            self.setCommand(['echo', 'AutomationError: %s'% str(self.failure_reason)])

Can we do something like SendChagneStep does here and return
self.finished(FAILURE) after adding an errormessage?

>+    def createSummary(self, log):
>+        #It might be useful to parse logs for things like timeouts, dns failure, etc
>+        if self.failure_reason:
>+            self.addCompleteLog('automation-failure',
>+                ''.join(('AutomationError: ', str(self.failure_reason))))
>+

What purpose does this additional log serve?
Attachment #437377 - Flags: review?(catlee) → review-
(In reply to comment #30)
> (From update of attachment 437377 [details] [diff] [review])
> >+            retval = {'packageUrl': ''}
> 
> I think this should default to returning {}.  Same with all the other instances
> of this code.

done

> 
> Maybe refactor out all the get_url functions into one if they're all the same?

done.  it is now a global function in the factory module.

> 
> Please remove this if it's going to be commented out.

done
 
> I think this function could be simplified a bunch if you prevented duplicates
> from being inserted into potential_files in the first place.

done
 
> Can we do something like SendChagneStep does here and return
> self.finished(FAILURE) after adding an errormessage?

That's what I was trying to do, but didn't know how.  I like this better :)

This patch has been tested.  I did touch the CC* factories, so I should probably get review from the CC* users.  Who would I ask for review?
Attachment #437377 - Attachment is obsolete: true
Attachment #437496 - Flags: review?(catlee)
Comment on attachment 437496 [details] [diff] [review]
automation changes for zip file tests package updated

All good, except:

>-        url = self.url_fn(self.build)
>+        try:
>+            url = self.url_fn(self.build)
>+        except Exception, e:
>+            self.addCompleteLog("errors", str(e))
>+            return self.finished(FAILURE)
>+

Could you do something like "Automation Error: %s" % str(e) instead?
Attachment #437496 - Flags: review?(catlee) → review+
(In reply to comment #32)
> (From update of attachment 437496 [details] [diff] [review])
> All good, except:
> 
> >-        url = self.url_fn(self.build)
> >+        try:
> >+            url = self.url_fn(self.build)
> >+        except Exception, e:
> >+            self.addCompleteLog("errors", str(e))
> >+            return self.finished(FAILURE)
> >+
> 
> Could you do something like "Automation Error: %s" % str(e) instead?

yes, will do
Comment on attachment 437496 [details] [diff] [review]
automation changes for zip file tests package updated

http://hg.mozilla.org/build/buildbotcustom/rev/e02144446979

Checked in with the Automation Error: prefix
Attachment #437496 - Flags: checked-in+
pm/pm02 reconfiged.
This has worked.  There were issues when we tried to switch to tests.zip so that was backed out and we are back to tests.tar.bz2
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
No longer blocks: 548768
Product: mozilla.org → Release Engineering
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: