Closed Bug 858621 Opened 12 years ago Closed 11 years ago

Run jit-tests from test package

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ted, Assigned: dminor)

References

(Depends on 1 open bug)

Details

Attachments

(1 file, 1 obsolete file)

We should run the jit-tests from the test package instead of during "make check".
Depends on: 903620
Product: mozilla.org → Release Engineering
Depends on: 906525
Assignee: nobody → dminor
Depends on: 908906
I have completed the changes to the desktop scripts and have started working on the remote versions.
Status: NEW → ASSIGNED
Depends on: 912004
This is working locally and I have an Ash run here: https://tbpl.mozilla.org/?tree=Ash&rev=6489d9abf96c.
Attachment #799689 - Flags: review?(aki)
Comment on attachment 799689 [details] [diff] [review]
Patch to add jit-tests support to mozharness scripts and configs.

I'm guessing your test run on ash shows that this doesn't break anything else, but doesn't actually run the jit-tests, correct?
Should be good enough to land.


>+    def query_jsshell_url(self):
>+        if self.jsshell_url:
>+            return self.jsshell_url
>+        if not self.installer_url:
>+            self.fatal("Can't figure out jsshell without an installer_url!")
>+
>+        last_slash = self.installer_url.rfind('/')
>+        base_url = self.installer_url[:last_slash]
>+
>+        for suffix in INSTALLER_SUFFIXES:
>+            if self.installer_url.endswith(suffix):
>+                no_suffix = self.installer_url[:-len(suffix)]
>+                last_dot = no_suffix.rfind('.')
>+                platform = no_suffix[last_dot + 1:]
>+
>+                self.jsshell_url = base_url + '/jsshell-' + platform + '.zip'
>+                return self.jsshell_url
>+        else:
>+            self.fatal("Can't figure out symbols_url from installer_url %s!" % self.installer_url)

I think the fatal() message is wrong.
Can you add a docstring to this method explaining what this is for?

>@@ -352,7 +369,10 @@ class PandaTest(TestingMixin, MercurialScript, VirtualenvMixin, MozpoolMixin, Bu
>         #applies to mochitest, reftest, jsreftest
>         # TestingMixin._download_and_extract_symbols() will set
>         # self.symbols_path when downloading/extracting.
>-        hostnumber = int(self.mozpool_device.split('-')[1])
>+        hostnumber = 0
>+        mozpool_device_list = self.mozpool_device.split('-')
>+        if len(mozpool_device_list) == 2:
>+            hostnumber = int(mozpool_device_list[1])

I'm guessing you hit an error that led to this change... what happened?


I only have comment/log message nits, so I'm r+ing.
However, I'm flagging :kmoir for feedback/FYI, especially on the panda changes.
Attachment #799689 - Flags: feedback?(kmoir)
Thanks for the review. Yes, the Ash run just shows that this should avoid breakage when it lands.

The hostnumber change helps when running from my local system. The mozpool_device in production has a number in its name, but I use the ip address of my panda when testing locally, which breaks this part of the code. I can leave this change out, but it will help me (and possibly others) when testing locally in the future.
Comment on attachment 799689 [details] [diff] [review]
Patch to add jit-tests support to mozharness scripts and configs.

Hm, I think I failed to set the flag.
Attachment #799689 - Flags: review?(aki) → review+
Comment on attachment 799689 [details] [diff] [review]
Patch to add jit-tests support to mozharness scripts and configs.

lgtm
Attachment #799689 - Flags: feedback?(kmoir) → feedback+
With review comments addressed.
Attachment #799689 - Attachment is obsolete: true
Blocks: 912997
Comment on attachment 800148 [details] [diff] [review]
Patch to add jit-tests support to mozharness scripts and configs.

Interdiff looks good.
Attachment #800148 - Flags: review+
Comment on attachment 800148 [details] [diff] [review]
Patch to add jit-tests support to mozharness scripts and configs.

https://hg.mozilla.org/build/mozharness/rev/bd35e3e2b331
Attachment #800148 - Flags: checked-in+
in production
dminor: I'm getting this warning from pyflakes:

mozharness/mozilla/testing/testbase.py:96: redefinition of unused 'platform' from line 10

Looks like we should use a different name, since platform is an imported python built-in module ?  Do you have a preference? pf ?
Flags: needinfo?(dminor)
We use 'pltfrm' elsewhere in the same file, so maybe that. 'pf' would be fine too.
Flags: needinfo?(dminor)
Depends on: 919145
Depends on: 929472
Now that Bug 929569 and Bug 929125 are resolved, it Looks like the latest runs on Cedar are down to intermittent failures.

I think the next steps are either to disable the intermittent tests, or to enable jit-tests on try so we can debug them.
Terrence, any preference?
Depends on: 929569, 929125, 916742, 918934, 917309
Flags: needinfo?(terrence)
Lets enable them hidden on Try so that these failing tests have non-zero visibility. This should be good incentive for us to take a harder look at the recent spate of intermittent failures we've been experiencing.
Flags: needinfo?(terrence)
Depends on: 931874
Depends on: 939274
Depends on: 959155
Depends on: 959156
Depends on: 959158
Depends on: 959164
Depends on: 961212
Depends on: 967467
jit-tests are currently running during make check. They already have those various issues that are "blocking" this bug. Why are we waiting? The net benefit is that this would free build slaves for other jobs, allowing us to use less build slaves, which in turn would make builds faster because the more slaves we have the slower the builds are (see http://glandium.org/blog/?p=3170)

So can we please deploy this now and stop doing jit-tests during make check?
No? https://tbpl.mozilla.org/?tree=Cedar&jobname=jit shows they're not green.
(In reply to Mike Hommey [:glandium] from comment #15)
> jit-tests are currently running during make check. They already have those
> various issues that are "blocking" this bug. Why are we waiting? The net
> benefit is that this would free build slaves for other jobs, allowing us to
> use less build slaves, which in turn would make builds faster because the
> more slaves we have the slower the builds are (see
> http://glandium.org/blog/?p=3170)
> 
> So can we please deploy this now and stop doing jit-tests during make check?

I've been actively working on fixing a few consistent failures over the past couple of weeks and I have one left to fix for the desktop builds at which point we can remove the jit-tests from make check. If I don't get anywhere today, I'm sure we can disable it and move forward.

Since the android jit-tests are not currently part of 'make check' they can be greened up as a separate activity.
Dan has hit (and filed and fixed) a lot of issues running the jit-tests on the test slaves vs. the build slaves. Unsurprisingly, these tests have some issues about the environment they're running in.
Depends on: 968797
If it's possible to prioritize, I believe Windows is more important than other platforms.

Windows EC2 instances cost more than Linux.

JIT tests run slower on Windows than *NIX platforms because of new process overhead.

In EC2, it's likely costing 3-5x more to run JIT tests on Windows than on Linux.
We're not running anything windows on ec2.
Well, we are, but it's only on the Date twig.
we're only running windows tests on ec2 on date at this point
Depends on: 969986
Depends on: 969988
Depends on: 969991
Depends on: 973900
(In reply to Dan Minor [:dminor] from comment #17)
> (In reply to Mike Hommey [:glandium] from comment #15)
> > jit-tests are currently running during make check. They already have those
> > various issues that are "blocking" this bug. Why are we waiting? The net
> > benefit is that this would free build slaves for other jobs, allowing us to
> > use less build slaves, which in turn would make builds faster because the
> > more slaves we have the slower the builds are (see
> > http://glandium.org/blog/?p=3170)
> > 
> > So can we please deploy this now and stop doing jit-tests during make check?
> 
> I've been actively working on fixing a few consistent failures over the past
> couple of weeks and I have one left to fix for the desktop builds at which
> point we can remove the jit-tests from make check. If I don't get anywhere
> today, I'm sure we can disable it and move forward.
> 
> Since the android jit-tests are not currently part of 'make check' they can
> be greened up as a separate activity.

'Today' was a few weeks ago. What's the eta now?
(In reply to Taras Glek (:taras) from comment #23)
> 'Today' was a few weeks ago. What's the eta now?

That bug was fixed, but as usual, a new test machine only failure has appeared, see bug 906525.
I filed bug 973900 to schedule these. There are some blockers there as well.
(In reply to Dan Minor [:dminor] from comment #24)
> (In reply to Taras Glek (:taras) from comment #23)
> > 'Today' was a few weeks ago. What's the eta now?
> 
> That bug was fixed, but as usual, a new test machine only failure has
> appeared, see bug 906525.
> I filed bug 973900 to schedule these. There are some blockers there as well.

I'd like to see an ETA here.
We had to pull dminor off this temporarily because of emergency b2g certification work. Most of this is now in releng's court, aside from bug 906525, which shu is looking at right now. dminor is needinfoing releng where required. I don't think we can give an ETA right now due to this being worked on by several groups, but I'll monitor it.
Filed bug 988522 to disable the failing tests until bug 906525 is resolved, and kmoir is looking at 973900 to do the scheduling.

We'll probably want to run these in parallel with the make check for a few days to make sure no serious problems emerge before removing this from make check, so we can probably get that is wrapped up in two weeks.
Depends on: 988532
No longer depends on: 906525
Another intermittent:
https://tbpl.mozilla.org/php/getParsedLog.php?id=37168546&tree=Mozilla-Inbound
All four of those intermittents are timeouts on Windows; far as I know, everyone but me has been ignoring timeouts in jittests on Windows while run on the build since I started the chain at bug 970063, and I don't see any reason to believe these are any different than those. Other than the obvious difference, you haven't yet learned to ignore them unless and until I decide to file them ;)
Until now I've been ignoring check-test failures on the most part; I was attempting to 'do the right thing' here ;-)
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: