Closed Bug 992461 Opened 6 years ago Closed 6 years ago

Intermittent asm.js\testHeapAccess.js | Timeout (code -1, args "--ion-eager --ion-parallel-compile=off --ion-check-range-analysis --no-sse3")

Categories

(Core :: JavaScript Engine: JIT, defect)

31 Branch
x86_64
Windows 8.1
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla34
Tracking Status
firefox32 --- fixed
firefox33 --- fixed
firefox34 --- fixed
firefox-esr24 --- unaffected
firefox-esr31 --- fixed

People

(Reporter: KWierso, Assigned: luke)

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

https://tbpl.mozilla.org/php/getParsedLog.php?id=37306162&tree=Mozilla-Inbound
WINNT 6.2 mozilla-inbound debug test jittest on 2014-04-04 16:46:47 PDT for push bf1c135138c4

slave: t-w864-ix-092




16:57:49     INFO -  TEST-PASS | tests\jit-test\jit-test\tests\asm.js\testNeuter.js | Success (code 0, args "--baseline-eager --no-fpu")
16:57:49     INFO -  TEST-PASS | tests\jit-test\jit-test\tests\asm.js\testNeuter.js | Success (code 0, args "--baseline-eager")
16:58:25     INFO -  TEST-PASS | tests\jit-test\jit-test\tests\asm.js\testNeuter.js | Success (code 0, args "--no-baseline --no-ion")
16:58:25     INFO -  TEST-PASS | tests\jit-test\jit-test\tests\asm.js\testParallelCompile.js | Success (code 0, args "")
16:58:25     INFO -  TEST-PASS | tests\jit-test\jit-test\tests\asm.js\testParallelCompile.js | Success (code 0, args "--ion-eager --ion-parallel-compile=off")
16:58:25     INFO -  TEST-PASS | tests\jit-test\jit-test\tests\asm.js\testMathLib.js | Success (code 0, args "--no-baseline --no-ion")
16:58:25     INFO -  TEST-PASS | tests\jit-test\jit-test\tests\asm.js\testParallelCompile.js | Success (code 0, args "--baseline-eager --no-fpu")
16:58:25     INFO -  TIMEOUT - asm.js\testHeapAccess.js
16:58:25  WARNING -  TEST-UNEXPECTED-FAIL | tests\jit-test\jit-test\tests\asm.js\testHeapAccess.js | Timeout (code -1, args "")
16:58:25     INFO -  INFO exit-status     : -1
16:58:25     INFO -  INFO timed-out       : True
16:58:25     INFO -  TEST-PASS | tests\jit-test\jit-test\tests\asm.js\testRangeAnalysis.js | Success (code 0, args "")
16:58:25     INFO -  TIMEOUT - asm.js\testHeapAccess.js
16:58:25  WARNING -  TEST-UNEXPECTED-FAIL | tests\jit-test\jit-test\tests\asm.js\testHeapAccess.js | Timeout (code -1, args "--ion-eager --ion-parallel-compile=off")
16:58:25     INFO -  INFO exit-status     : -1
16:58:25     INFO -  INFO timed-out       : True
16:58:25     INFO -  TIMEOUT - asm.js\testHeapAccess.js
16:58:25  WARNING -  TEST-UNEXPECTED-FAIL | tests\jit-test\jit-test\tests\asm.js\testHeapAccess.js | Timeout (code -1, args "--ion-eager --ion-parallel-compile=off --ion-check-range-analysis --no-sse3")
16:58:25     INFO -  INFO exit-status     : -1
16:58:25     INFO -  INFO timed-out       : True
16:58:25     INFO -  TEST-PASS | tests\jit-test\jit-test\tests\asm.js\testRangeAnalysis.js | Success (code 0, args "--ion-eager --ion-parallel-compile=off")
16:58:25     INFO -  TEST-PASS | tests\jit-test\jit-test\tests\asm.js\testRangeAnalysis.js | Success (code 0, args "--baseline-eager --no-fpu")
Luke, any idea why this test is timing out on Windows 8?
Flags: needinfo?(luke)
testHeapAccess does a ton of out-of-bounds accesses (testing the correct handling), and so one idea is that vectored exception handling on Windows 8 is really slow.

Is Windows 8 running on real hardware or in a VM?  Presumably, exception handling could be really really slow on a VM.
Flags: needinfo?(luke)
(In reply to Luke Wagner [:luke] from comment #20)
> Is Windows 8 running on real hardware or in a VM?  Presumably, exception
> handling could be really really slow on a VM.

Ryan can you answer this?

If the test is just doing something that's expected to be slow, maybe we could split it up.
Flags: needinfo?(ryanvm)
Currently, all Windows tests run on real hardware.
Flags: needinfo?(ryanvm)
Is there any way to get log output from jittests?  Ideally, I'd like to land a print() in testHeapAccess.js that reported the time the test took so we could see what the normal time is in Win8 and compare that to the timeout limit.
At this point, I'm not even sure who knows/owns the jit-test harness. Honestly, I think jandem's done the most patching of it of late ;)
(In reply to Luke Wagner [:luke] from comment #23)
> Is there any way to get log output from jittests?  Ideally, I'd like to land
> a print() in testHeapAccess.js that reported the time the test took so we
> could see what the normal time is in Win8 and compare that to the timeout
> limit.

I don't think we have that atm, but it should be possible to dump stdout/stderr in tests/lib/jittests.py or add some logging to run_timeout_cmd there. Then we can push to Try and see what happens on different platforms.
Oh I think you want to modify print_tinderbox in tests/lib/jittests.py

if "SOME MAGIC STRING" in res.out:
    for line in res.out.splitlines():
        print("INFO stdout          > " + line.strip())

Or something.
Sigh, this is getting annoying. Here's a Try push to log the time of this test, as I suggested in comment 25:

https://tbpl.mozilla.org/?tree=Try&rev=9f028913a8fa

Let's see what we get on different platforms.
Some timings from the Try push, all are opt builds:

OS X:     ~180 ms
Linux 64: ~350 ms

Win XP:  3035,  3020,  2810,  2857,  2822,  2882
Win 7 : 67692, 67646, 67493, 67774, 67807, 67581
Win 8 : 74474, 74633, 70863, 70983, 71279, 71378

So on Windows 7+ this test takes at least a minute. Windows 8 is even slower and apparently the test sometimes takes more than the 90 seconds timeout...

Luke, can you take it from here?
Flags: needinfo?(luke)
Wow, thanks for doing those measurements.  I just tried on Win7 and the test only takes a few seconds to complete so I guess this just means Win 7 and 8 are on emulators and those emulators don't appreciate SIGSEGV.

I wonder what the right fix is here.  I could add a platform field to getBuildCongfig(), but it seems wrong to disable the test for all of windows just b/c we run windows tests on emulators.  Instead, I was thinking I could just make the tests silently succeed if more than, say, 10s elapsed.  Sound good Jan?
Flags: needinfo?(luke)
What do you mean by "emulators" here? Note that none of our Windows or OSX machines run on VMs. Only Linux-based platforms.
I see, then it's really weird that Win7/8 are running slow slowly here and also that I'm not able to reproduce.  testZOOB.js also does a bunch of compilation and heap-access testing, so I pushed a patch to try to see what it's time looks like:

  https://tbpl.mozilla.org/?tree=Try&rev=11b7f64e4825
Weird, testZOOB runs in <1s on Win7/Win8.  This patch prints the time at 25 checkpoints throughout testHeapAccess to see if there is just part of the test taking all the time.  This is a pretty plain test, the only thing perhaps irregular is that it allocates a bunch of page-sized ArrayBuffers (but it's on x86, so it's not even doing the memory-mapping stuff).
Wow, the times are totally evenly distributed.  This test breaks the timing down to finer granularity:
 https://tbpl.mozilla.org/?tree=Try&rev=5d527ebb0ccc
Results shows that the compilation step (not linking, not buffer creation) is taking ~200ms for *each* test case.  testZOOB shows only a few ms.  One difference is testZOOB disables caching, so let's see if that speeds up testHeapAccess:
  https://tbpl.mozilla.org/?tree=Try&rev=f6221012303a
If it does, we can just flip the default for caching in the shell to 'off' and it might be a nice general test-suite time reduction.
(In reply to Luke Wagner [:luke] from comment #119)
> Results shows that the compilation step (not linking, not buffer creation)
> is taking ~200ms for *each* test case.  testZOOB shows only a few ms.  One
> difference is testZOOB disables caching, so let's see if that speeds up
> testHeapAccess:

Thanks for tracking this down! With caching enabled, do we start a new process for each test case? jit-tests run in parallel, maybe that's putting more pressure on the kernel somehow.
That seems to fix it!

(In reply to Jan de Mooij [:jandem] from comment #120)
> Thanks for tracking this down! With caching enabled, do we start a new
> process for each test case? jit-tests run in parallel, maybe that's putting
> more pressure on the kernel somehow.

With caching enabled (and with --js-cache=$(DIR) passed to the shell, which jit_tests.py does), every asm.js compilation writes the compiled module to a file (in $(DIR)) and does the equivalent of an fsync (which I expect is what's slow here, either b/c Windows or b/c this machine is doing more stuff concurrently).
Assignee: nobody → luke
Status: NEW → ASSIGNED
Attachment #8472357 - Flags: review?(benj)
Comment on attachment 8472357 [details] [diff] [review]
change-caching-default

Review of attachment 8472357 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, thanks for doing that.
Attachment #8472357 - Flags: review?(benj) → review+
https://hg.mozilla.org/mozilla-central/rev/0475b6c98bb5
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Can we nominate this for aurora/beta/esr31 uplift please? :)
Comment on attachment 8472357 [details] [diff] [review]
change-caching-default

[Approval Request Comment]
This is a js-shell- and test-only change.  It doesn't affect FF.  It'll make the sheriffs' lives easier.

Risk to taking this patch (and alternatives if risky): zero
Attachment #8472357 - Flags: approval-mozilla-esr31?
Attachment #8472357 - Flags: approval-mozilla-beta?
Attachment #8472357 - Flags: approval-mozilla-aurora?
Flags: needinfo?(luke)
Attachment #8472357 - Flags: approval-mozilla-esr31?
Attachment #8472357 - Flags: approval-mozilla-esr31+
Attachment #8472357 - Flags: approval-mozilla-beta?
Attachment #8472357 - Flags: approval-mozilla-beta+
Attachment #8472357 - Flags: approval-mozilla-aurora?
Attachment #8472357 - Flags: approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.