463487 - TM: 20x slowdown with jit on JS raytracer

Reporter

Description

•

17 years ago

Attached file JS shell testcase to be run with -b 10000000 — Details

Testing with today's tracemonkey browser on the URL in the URL field, I get a 3500ms render time without jit, and a slow script dialog (so at least 10s) with jit. I stripped out the HTML-dependent stuff, and the resulting js shell testcase runs in the following times, based on command-line args: (none) 3200ms -j 2600ms -b 10000000 3500ms -j -b 10000000 62000ms (no, I did _not_ accidentally put an extra zero there!) It's not a very minimal testcase, but it's a start, at least. Clearly gc is screwing us over somehow...

Boris Zbarsky [:bzbarsky]

Reporter

Comment 1

•

17 years ago

Note: this was spun off of bug 460964. I can't get a verbose trace without -b because of bug 463490. With -b 10000000, I waited long enough to get a 300MB-some log file, at which point: ~% grep Abort ~/log.txt | sort | uniq -c | sort -nr 443 Abort recording (line 164, pc 67): No compatible inner tree. 432 Abort recording (line 185, pc 442): No compatible inner tree. 431 Abort recording (line 91, pc 118): Inner tree not suitable for calling. 102 Abort recording (line 65, pc 27): Inner tree not suitable for calling. 2 Abort recording (line 229, pc 102): No compatible inner tree. 1 Abort recording (line 88, pc 94): No compatible inner tree. ~% grep "recording starting from" ~/log.txt | wc -l 5413 Which is not that different from what we see in bug 463478 in some ways.

Flags: blocking1.9.1?

Keywords: regression

Robert Sayre

Updated

•

16 years ago

Flags: blocking1.9.1? → blocking1.9.1+

Robert Sayre

Updated

•

16 years ago

Priority: -- → P2

Brian Crowder

Comment 2

•

16 years ago

Can this be closed out? crowdermac:src crowder$ ./js ~/raytrace-test.js 4345 crowdermac:src crowder$ ./js -j ~/raytrace-test.js 3022

Boris Zbarsky [:bzbarsky]

Reporter

Comment 3

•

16 years ago

That matches comment 0, yes. Please either run the original testcase in the browser, or run this one in the shell with -b, as comment 0 says.

Brian Crowder

Comment 4

•

16 years ago

Ah, right. This still reproduces identically.

David Mandelin [:dmandelin]

Comment 5

•

16 years ago

I get these results now: guest-225:opt dmandelin$ dist/bin/js -b 10000000 ../2ray.js 3788 guest-225:opt dmandelin$ dist/bin/js -j -b 10000000 ../2ray.js 6000

David Mandelin [:dmandelin]

Comment 6

•

16 years ago

I still get long and unpredictable times in the browser, though.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 7

•

16 years ago

Looks like in shell the testcase behavior changed as of changeset 8f8a4b924c32 (in tracemonkey). This matches the range in which the behavior of bug 463478 changed. Was this a real fix for this problem, or just a change in GC behavior that hides it? And if the latter, would that explain why browser still has a problem?

Graydon Hoare :graydon

Assignee

Updated

•

16 years ago

Assignee: general → graydon

Graydon Hoare :graydon

Assignee

Updated

•

16 years ago

Depends on: 476653

Graydon Hoare :graydon

Assignee

Comment 8

•

16 years ago

After the landing of bug 476653, this *appears* to no longer give wildly unpredictable time for me: I get a little instability, but the range is 400ms (!) to 1400 at the limits, with a pretty stable common case more around 900-1200. For comparison, the most recent chrome I'm testing side-by-side here clocks 800-900, so we're pretty competitive on this one. Our non-JIT time is more like 4500. We're also beating chrome on the linked animation test. This is win32 though. I'm going to wait to close this bug since I thought I saw perturbation on the linux system. Will want to do 3-platform tests to be sure.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 9

•

16 years ago

It'd be really nice if someone could answer my question from comment 7.....

Brendan Eich [:brendan]

Comment 10

•

16 years ago

bz: that would be nice. We need to understand the variability in timing in comment 8, *and* any browser-only behavior change or slowdown. Fun. /be

Graydon Hoare :graydon

Assignee

Comment 11

•

16 years ago

Some positive results: it runs at full speed on mac and linux in the browser as well, both the simple test and the animation test. At very good speed (~900ms simple, ~10ms/frame animation, regularly). In the shell, the statistics are also promising: recorder: started(32), aborted(10), completed(95), different header(0), trees trashed(0), slot promoted(0), unstable loop variable(20), breaks(3), returns(2), unstableInnerCalls(8) monitor: triggered(50059), exits(50059), type mismatch(0), global mismatch(0) Wrt. bz's comment #7, rev 8f8a4b924c32 was an OOM-handling bug; I have no idea why it would have affected performance, but I also don't really understand why you were passing the -b 10000000 flag in the first place, nor why either of those would cause the recorder to fail to find suitable inner trees too often. None of those are happening now, and what I was blocked on here until yesterday was yet another serious systemic bug, unrelated still. In general I tend to think what we had was a large stack (still nonempty, in the case of bug 463478) of *confounding* cross-the-JIT correctness errors, any one of which independently caused this test to fall down in a creatively different way, and so there is no single "thing to fix" here, nor any single thing we might be guilty of papering over. Each layer in the stack of confounding bugs was papering over the layer beneath it, and while it is difficult to prove when we've got to the bottom of that stack -- I'll admit, I can't prove that -- it is even more futile to look back up the stack and ask "was that the bug we were after?" We were after *all* of them. The story here is "keep knocking down correctness bugs until it starts behaving well". These larger raytracers seem to explore the breadth of the JIT much more than the microbenchmarks, and so quite a few systemic JIT bugs seem able to foul them up. There *is* remaining time-variability in this case -- it's most stable on mac, big surprise, though on all platforms it tends to stick near a mean of 900ms when run repeatedly -- but I'd suggest if not closing this bug and declaring we reached the bottom of the stack, then at least downgrading it to a non-blocking state. The JIT makes performance on this case 10x faster than interp *in all builds* now, and it runs w/o error. It's just not perfectly noiseless when running in a browser. That alone seems low priority, as it's hardly something new. Many perf tests have noise in a browser.

Brendan Eich [:brendan]

Comment 12

•

16 years ago

If we want to use this bug to cover lower-priority variability/noise shown well by the raytracer, the summary should change. /be

Boris Zbarsky [:bzbarsky]

Reporter

Comment 13

•

16 years ago

> I have no idea why it would have affected performance Yes, that was my problem too. > but I also don't really understand why you were passing the -b 10000000 flag > in the first place To trigger gc in the shell. Without that, the shell never GCed, and GC was key to triggering this bug in the first place. That's why browser was showing the problem but shell without -b was not. I assume current testing in the shell is using whatever the equivalent of -b is with the watchdog thread, right? GC would affect ability to find inner trees because when we gc we flush all cached trees, no? I'm fine with closing this bug out if we're reasonably sure that things are OK based on browser testing. If we want a new bug on the noise, we should file a followup, in my opinion, but I bet that the noise is due to the exact timing of GC in relation to the test run. Some experimentation in the shell might be able to prove or disprove that...

Robert Sayre

Comment 14

•

16 years ago

Please file appropriate followups, don't morph this bug.

Status: NEW → RESOLVED

Closed: 16 years ago

Resolution: --- → WORKSFORME