Closed
Bug 962011
Opened 10 years ago
Closed 10 years ago
l10n testruns don't run on mac
Categories
(Mozilla QA Graveyard :: Infrastructure, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: cosmin-malutan, Assigned: cosmin-malutan)
Details
l10n testruns don't run on Mac because the downloaded build gets a literal "$DATE" prepended to its filename, which causes the testrun script to treat the path to the build as an empty variable. I checked the mm-osx-108-1 node, where for example under mozilla-aurora_l10n/builds we have "$DATE-mozilla-aurora-firefox-28.0a2.hu.mac.dmg". When I tried to download the build manually with mozdownload, replacing every parameter with its value except $DATE, it worked fine: it did not add "$DATE" to the name of the build.
Assignee | ||
Comment 1•10 years ago
Here is a log from jenkins:
> 97% |############################################ | ETA: 00:00:00 28.80 M/s
> 98% |############################################# | ETA: 00:00:00 28.85 M/s
> 99% |############################################# | ETA: 00:00:00 28.85 M/s
>100% |##############################################| Time: 00:00:02 28.99 M/s
>11:28:52 [mozilla-aurora_l10n] $ mozmill-env-$ENV_PLATFORM/run testrun_l10n --repository=mozmill-tests --junit=report.xml --workspace=data --report=$REPORT_URL builds/
>11:28:54 hdiutil: attach failed - No such file or directory
>11:28:54 *** Installing build: /Users/mozauto/jenkins/workspace/mozilla-aurora_l10n/builds/$DATE-mozilla-aurora-firefox-28.0a2.ru.mac.dmg
>11:28:54 Traceback (most recent call last):
>11:28:54 File "/Users/mozauto/jenkins/workspace/mozilla-aurora_l10n/mozmill-env-mac/bin/testrun_l10n", line 8, in <module>
>11:28:54 load_entry_point('mozmill-automation==2.0.3', 'console_scripts', 'testrun_l10n')()
>11:28:54 File "/Users/mozauto/jenkins/workspace/mozilla-aurora_l10n/mozmill-env-mac/python-lib/mozmill_automation/testrun.py", line 751, in l10n_cli
>11:28:54 exec_testrun(L10nTestRun)
>11:28:54 File "/Users/mozauto/jenkins/workspace/mozilla-aurora_l10n/mozmill-env-mac/python-lib/mozmill_automation/testrun.py", line 730, in exec_testrun
>11:28:54 cls().run()
>11:28:54 File "/Users/mozauto/jenkins/workspace/mozilla-aurora_l10n/mozmill-env-mac/python-lib/mozmill_automation/testrun.py", line 357, in run
>11:28:54 print "*** Uninstalling build: %s" % self._folder
>11:28:54 AttributeError: 'L10nTestRun' object has no attribute '_folder'
>11:28:54 Build step 'Invoke XShell command' marked build as failure
>11:28:54 Archiving artifacts
>11:28:54 Recording test results
>11:28:54 IRC notifier plugin: Sending notification to: #automation
Comment 2•10 years ago
This is the wrong excerpt of the log from Jenkins! The problem is not in the testrun script but in mozdownload. How does it work on staging? Have you checked that, Cosmin? I would love it if we could output the real values and not the PARAMETERS, but that looks like a bug in the xshell plugin. Dave, what do you think?
Flags: needinfo?(dave.hunt)
Flags: needinfo?(cosmin.malutan)
Comment 3•10 years ago
You should be able to see this in the XShell log:
https://github.com/jenkinsci/xshell-plugin/blob/master/src/main/java/hudson/plugins/xshell/XShellBuilder.java#L119
You can also see the values of the parameters on the Parameters page for the specified job. We should also consider adding some logging to mozdownload to indicate the target filename.
Flags: needinfo?(dave.hunt)
Assignee | ||
Comment 4•10 years ago
We have the same issue on staging.
Flags: needinfo?(cosmin.malutan)
Comment 5•10 years ago
(In reply to Dave Hunt (:davehunt) from comment #3)
> You should be able to see this in the XShell log:
> https://github.com/jenkinsci/xshell-plugin/blob/master/src/main/java/hudson/plugins/xshell/XShellBuilder.java#L119
> you can also see the values of the parameters in the Parameters page for the
> specified job.
I know that you can see the parameters there, but you don't know what xshell is doing or how it calls the process, so this is not helpful. Activating the log for xshell didn't give me any clue either, given that the parameters have not even been replaced by their values.
http://mm-ci-staging.qa.scl3.mozilla.com:8080/log/xshell/
(In reply to Cosmin Malutan from comment #4)
> We have the same issue on staging.
So please start the Jenkins JNLP client from within a sourced mozmill-env on that box. With that you can add print statements to the mozdownload .py files and print the command line. That way we will know how mozdownload gets called.
Assignee | ||
Comment 6•10 years ago
As soon as the OptionParser instance in mozdownload/scraper.py returns the arguments, the date option is "$DATE". So by the time we instantiate the TinderboxScraper class the date argument is "$DATE", and it does not default to None.
https://github.com/mozilla/mozdownload/blob/master/mozdownload/scraper.py#L964
As a workaround we can remove the --date=$DATE argument from:
https://github.com/mozilla/mozmill-ci/blob/master/jenkins-master/jobs/mozilla-aurora_l10n/config.xml#L99
This will make the l10n testruns run with the latest build, as they normally should and as they did on Windows and Ubuntu. I'm not sure why we pass a date argument that we expect to be None, only to fall back to the default of None.
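The parsing behaviour described in this comment can be reproduced outside mozdownload. The sketch below is only an illustration and not mozdownload's actual option definition: the `--date` option with a default of `None` and the helper name `parse_date` are assumptions made for the example. It shows that once the placeholder reaches the parser unexpanded, optparse passes the literal string through and the `None` default is never used.

```python
from optparse import OptionParser

# Hypothetical stand-in for the parser in mozdownload/scraper.py.
parser = OptionParser()
parser.add_option("--date", dest="date", default=None)


def parse_date(argv):
    """Return the value optparse stores for --date given a raw argv list."""
    options, _args = parser.parse_args(argv)
    return options.date


print(repr(parse_date([])))                # None: the default applies
print(repr(parse_date(["--date=$DATE"])))  # '$DATE': placeholder was not expanded
print(repr(parse_date(["--date="])))       # '': placeholder expanded to nothing
```

Only the first case yields `None`, so any code that checks `if date is None` will treat an unexpanded placeholder as a real date value.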
Comment 7•10 years ago
(In reply to Cosmin Malutan from comment #6)
> As soon as the OptionParser instance in mozdownload/scraper.py returns the
> arguments the date option will be "$DATE". At the time we instantiate the
> TinderboxScraper class the date argument will be "$DATE" so it won't default
> to None.
So Jenkins does not replace it with the real value then. That means we call mozdownload with a literal `--date=$DATE`. Cosmin, please give us the exact command which gets executed. You can output it directly in mozmill-env-mac/run.sh
> As a workaround we can remove the --date=$DATE argument from:
> https://github.com/mozilla/mozmill-ci/blob/master/jenkins-master/jobs/mozilla-aurora_l10n/config.xml#L99
> This will determine l10n testruns to run with the latest build, exactly as
> it should usually and as it ran on windows and ubuntu. I'm not sure why we
> give a date argument that we expect to be None so we will default to None.
Right, that was a solution I also proposed to Dave, but he was not happy with it. As of now we will most likely never run tests for older tinderbox builds, which also do not exist! So we should be totally safe in removing this option. But let's give Dave a chance to reply.
Flags: needinfo?(dave.hunt)
Comment 8•10 years ago
The primary reason support for tinderbox builds was added was so we could find regression ranges for endurance issues. These issues are time consuming to replicate and often environment specific, so running on Jenkins was desirable. It's admittedly been rarely used, but my preference would be to find out why this is only failing on one environment and come up with a fix/workaround. I don't feel strongly enough to block removing this feature.
Flags: needinfo?(dave.hunt)
Assignee | ||
Updated•10 years ago
Assignee: nobody → cosmin.malutan
Assignee | ||
Comment 9•10 years ago
>06:32:22 mozdownload --type=tinderbox --branch=mozilla-aurora --platform=mac --locale=ru --build-id=20140120004001 --date=$DATE --retry-attempts=10 --retry-delay=30 --directory=builds
>06:32:22 [mozilla-aurora_l10n] $ mozmill-env-$ENV_PLATFORM/run mozdownload --type=$BUILD_TYPE --branch=mozilla-aurora --platform=$PLATFORM --locale=$LOCALE --build-id=$BUILD_ID --date=$DATE --retry-attempts=10 --retry-delay=30 --directory=builds
>06:32:23 INFO | Downloading from: https://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-aurora-l10n/firefox-28.0a2.ru.mac.dmg
Here is the log from jenkins where I echo the command:
mozdownload --type=tinderbox --branch=mozilla-aurora --platform=mac --locale=ru --build-id=20140120004001 --date=$DATE --retry-attempts=10 --retry-delay=30 --directory=builds
When I executed this command in the environment, it downloaded the latest build without a "$DATE" string attached, as I said in the first comment.
So I guess it is the xshell plugin that behaves differently in Jenkins.
Comment 10•10 years ago
Is this only affecting Mac? Is it all Mac versions? In comment 0 you mentioned that we have a file named $DATE-mozilla-aurora-firefox-28.0a2.hu.mac.dmg, which implies mozdownload gets a literal "$DATE" value, so something must be escaping the variable expansion.
From the stack trace it looks like mozmill-automation has an issue with installing a DMG with $DATE in the name. Perhaps ultimately this is a mozmill-automation issue (though I agree it would be better not to have files with dollar signs in them, and it is worth finding out why this happens).
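To make the point about variable expansion concrete, here is a small sketch of the two ways a process can be started. It does not claim to reproduce what the XShell plugin does on any platform. It only shows that a `$DATE` placeholder is expanded when a POSIX shell interprets the command line, and arrives untouched when the argument is handed to the process directly. The helper names and the empty `DATE` value are assumptions for the demonstration.

```python
import os
import subprocess
import sys
from shlex import quote

# Tiny child program that prints the first argument it receives, verbatim.
SCRIPT = "import sys; print(sys.argv[1])"

# Environment in which DATE is defined but empty (an assumption for this demo).
ENV = dict(os.environ, DATE="")


def received_without_shell(arg):
    """Pass arg directly as an argv entry; no shell is involved."""
    out = subprocess.check_output([sys.executable, "-c", SCRIPT, arg], env=ENV)
    return out.decode().strip()


def received_with_shell(arg):
    """Build a command line and let /bin/sh expand variables in arg (POSIX only)."""
    command = "{} -c {} {}".format(quote(sys.executable), quote(SCRIPT), arg)
    out = subprocess.check_output(command, shell=True, env=ENV)
    return out.decode().strip()


print(received_without_shell("--date=$DATE"))  # --date=$DATE
print(received_with_shell("--date=$DATE"))     # --date=
```

The first result matches the filename seen on the Mac nodes, which is consistent with the argument reaching mozdownload without ever passing through shell expansion.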
Comment 11•10 years ago
(In reply to Dave Hunt (:davehunt) from comment #10)
> It looks from the stack trace that mozmill-automation has an issue with
> installing a DMG with $DATE in the name. Perhaps ultimately this is a
> mozmill-automation issue (though I agree it would be better not to have
> files with dollar signs in them, and worth finding out why this happens).
No. hdiutil fails when you are trying to install such a DMG. There is nothing we can do. It has to be fixed higher up the stack.
Comment 12•10 years ago
Well, if it's a valid filename then it could potentially happen in other use cases. Shouldn't we sanitise the filename for hdiutil?
Flagging Cosmin for needinfo so my question about affected Mac versions in comment 10 is not lost.
Flags: needinfo?(cosmin.malutan)
Assignee | ||
Comment 13•10 years ago
Dave, the l10n testruns ran only on OS X 10.9, so I had to trigger testruns on all nodes, and all of them are affected.
Flags: needinfo?(cosmin.malutan)
Comment 14•10 years ago
Thanks, good to know this is a general Mac issue. Should be easy enough to replicate and debug then. You have a local CI, right?
Assignee | ||
Comment 15•10 years ago
I found the root problem here: the build_filename method in mozdownload is overridden by each subclass (TinderboxScraper, DailyScraper, ReleaseScraper), and the implementation in the TinderboxScraper class does not handle this case. If we enhance the method in TinderboxScraper:
https://github.com/mozilla/mozdownload/blob/master/mozdownload/scraper.py#L727
so that it checks the type of self.timestamp, it will work fine, and it will still work when a date is given. If this solution sounds good to you, I will file an issue on GitHub for it.
> try:
>     timestamp = self.date.strftime('%Y-%m-%d')
> except:
>     timestamp = time.strftime('%Y-%m-%d', time.gmtime())
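The fallback proposed in this comment can be tried in isolation. The following is a minimal sketch and not the code in mozdownload: the function name `filename_date_prefix` is hypothetical, and it catches only `AttributeError` (raised when the value has no `strftime`, as with a plain string) instead of using a bare `except`.

```python
import time
from datetime import datetime


def filename_date_prefix(date):
    """Return a YYYY-MM-DD prefix, falling back to today's UTC date.

    `date` is expected to be a datetime-like object. Anything without a
    strftime method (None, or a literal string such as "$DATE") triggers
    the fallback.
    """
    try:
        return date.strftime('%Y-%m-%d')
    except AttributeError:
        return time.strftime('%Y-%m-%d', time.gmtime())


print(filename_date_prefix(datetime(2014, 1, 20)))  # 2014-01-20
print(filename_date_prefix("$DATE"))                # today's date in UTC
```

With this shape a literal "$DATE" can no longer leak into the filename, while a real date still produces the expected prefix.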
Flags: needinfo?(hskupin)
Flags: needinfo?(dave.hunt)
Comment 16•10 years ago
Nice find, Cosmin. It's hard for me to follow where your proposed solution should go. It might be better to open a PR on GitHub so everyone can follow the code.
Comment 17•10 years ago
Needinfo but I can't see a question..? I agree with Andrei that a pull request or diff will make your comment much clearer. I'm not sure I currently understand the issue or proposed fix, but it's great to hear this is something we can fix in mozdownload.
Flags: needinfo?(dave.hunt)
Comment 18•10 years ago
Do we know why this only happens for OS X and not for Linux and Windows?
Flags: needinfo?(hskupin)
Assignee | ||
Comment 19•10 years ago
(In reply to Dave Hunt (:davehunt) from comment #17)
> Needinfo but I can't see a question..?
I created issue https://github.com/mozilla/mozdownload/issues/196
(In reply to Henrik Skupin (:whimboo) from comment #18)
> Do we know why this only happens for OS X and not for Linux and Windows?
Not sure, but probably on Linux or Windows an empty environment variable is passed as an empty string, whereas on Mac the name of the variable is passed instead, at least in this particular case.
Comment 20•10 years ago
(In reply to Cosmin Malutan from comment #19)
> (In reply to Henrik Skupin (:whimboo) from comment #18)
> > Do we know why this only happens for OS X and not for Linux and Windows?
> Not sure but probably if on Linux or Window an environment variable is empty
> it will send an "empty" string, on mac it sends the name of the variable, in
> this particular case at least.
As said earlier this also sounds like a bug in Jenkins. You might want to have a look at the JIRA database for existing reports.
Assignee | ||
Comment 21•10 years ago
Here is an issue filed against the XShell plugin:
https://issues.jenkins-ci.org/browse/JENKINS-20478
We could wait for this to be fixed or go with the mozdownload fix. Thanks
Comment 22•10 years ago
(In reply to Cosmin Malutan from comment #19)
> > Do we know why this only happens for OS X and not for Linux and Windows?
> Not sure but probably if on Linux or Window an environment variable is empty
> it will send an "empty" string, on mac it sends the name of the variable, in
> this particular case at least.
That is something I have already requested in comment 7 (January 22nd). So please finally report back your results.
This problem has to be fixed ASAP because it causes huge issues for localizers. Hardly any of them have OS X available for testing. So let's get this fixed ASAP. I'm raising the priority to P1.
Priority: -- → P1
Comment 23•10 years ago
(In reply to Dave Hunt (:davehunt) [Unavailable until at least 10th February] from comment #8)
> The primary reason support for tinderbox builds was added was so we could
> find regression ranges for endurance issues. These issues are time consuming
But this does not apply to l10n tinderbox builds! Those are always overwritten once a new build is up. We do not store a history of builds as we do for en-US. So for the l10n testrun it doesn't make sense to fold in the DATE parameter at all. I totally agree for other jobs like endurance tests, but those are not l10n related and we will only execute them for en-US builds.
So I do not see why we shouldn't remove the $DATE parameter, or at least the --date option for the mozdownload call. For now I have temporarily patched the production instance for the latter, so we can at least run tests on OS X.
Updated•10 years ago
Status: NEW → ASSIGNED
Assignee | ||
Comment 24•10 years ago
(In reply to Henrik Skupin (:whimboo) from comment #22)
> That is something I have already requested in comment 7 (January 22nd). So
> finally please report back your results.
>
> This problem has to be fixed ASAP because it causes huge issues for
> localizers. Nearly none of them have OS X available for testing. So lets get
> this fixed ASAP. I'm raising the priority to P1.
As I said, I couldn't find out why it fails only on Mac, though I found a similar issue. I worked on this most of the day and I still haven't found a reliable way to get the timestamp from a datetime object; such a method is not implemented in Python 2.7. I tried:
> timestamp = datetime.fromtimestamp(float(self.date), self.timezone)
> totalseconds = (timestamp - datetime(1970, 1, 1)).total_seconds()
This fails, probably because I can't get datetime(1970, 1, 1) with a timezone, and without that it fails on Travis CI. I suggest we try datetime.fromtimestamp, and if no error is thrown we assign self.date to self.timestamp as before.
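For the timestamp question raised here, one approach that works on Python 2.7 as well as Python 3 is `calendar.timegm` applied to `datetime.utctimetuple()`. The sketch below is offered as one possible option, not as what mozdownload ended up using. The `FixedOffset` class, the helper name `to_epoch_seconds` and the sample values are illustrative assumptions. It also shows why the subtraction in the comment fails: Python refuses to subtract a naive datetime from a timezone-aware one.

```python
import calendar
from datetime import datetime, timedelta, tzinfo


class FixedOffset(tzinfo):
    """Minimal fixed-offset timezone that also works on Python 2.7."""

    def __init__(self, minutes):
        self._offset = timedelta(minutes=minutes)

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "fixed"


def to_epoch_seconds(dt):
    """Seconds since 1970-01-01 UTC; naive datetimes are taken to be UTC."""
    return calendar.timegm(dt.utctimetuple())


# Illustrative values: the same instant expressed at UTC-8 and as naive UTC.
aware = datetime(2014, 1, 20, 0, 40, 1, tzinfo=FixedOffset(-8 * 60))
same_instant_utc = datetime(2014, 1, 20, 8, 40, 1)

# Mixing aware and naive values is what raises in the subtraction approach.
try:
    aware - datetime(1970, 1, 1)
    mixed_subtraction_failed = False
except TypeError:
    mixed_subtraction_failed = True

print(to_epoch_seconds(aware) == to_epoch_seconds(same_instant_utc))  # True
print(mixed_subtraction_failed)                                       # True
```

Because `utctimetuple()` normalises aware values to UTC before the conversion, no timezone-aware epoch constant is needed.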
Comment 25•10 years ago
(In reply to Henrik Skupin (:whimboo) from comment #23)
> But this does not apply to l10n tinderbox builds! Those are always
> overwritten once a new build is up. We do not store a history of builds as
> for en-US. So for the l10n testrun it doesn't make sense to fold in the DATE
> parameter at all. I totally agree for other jobs like endurance tests, but
> those are not l10n related and we will only execute for en-US builds.
>
> So I do not see why we shouldn't remove the $DATE parameter or at least the
> --date option for the mozdownload call. For now I temporarily patched the
> production instance for the latter, so we can at least run tests on OS X.
Sounds fine to me to remove the DATE parameter/argument from the l10n jobs until the underlying issue is resolved. Another option could be to provide a default value for this parameter with special meaning in mozdownload, such as 'latest'.
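The 'latest' idea could look roughly like the sketch below. This is purely hypothetical: mozdownload has no such helper as far as this bug shows, and the name `normalize_date_option` is made up for the example. Treating an empty string or a value that still starts with `$` as unset goes beyond what was suggested in the comment and is an extra assumption.

```python
def normalize_date_option(value):
    """Map 'no specific date' spellings to None; return real values unchanged.

    Hypothetical helper: 'latest' is the sentinel suggested in this bug.
    Empty strings and unexpanded "$NAME" placeholders are treated the same
    way as an extra safety net (an assumption, not agreed behaviour).
    """
    if value is None:
        return None
    text = value.strip()
    if not text or text.lower() == "latest" or text.startswith("$"):
        return None
    return text


print(normalize_date_option("latest"))      # None
print(normalize_date_option("$DATE"))       # None
print(normalize_date_option("2014-01-20"))  # 2014-01-20
```

A Jenkins job could then always pass `--date=$DATE` with `latest` as the parameter's default, and the downloader would behave as if no date had been given.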
Comment 26•10 years ago
This bug no longer has any remaining action items and can be closed. Mozmill-CI has been updated, and mozdownload is about to get an update too via a separate GitHub issue.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Updated•6 years ago
Product: Mozilla QA → Mozilla QA Graveyard