Closed Bug 910782 Opened 11 years ago Closed 11 years ago

indirect-goto-based interpreter

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla28

People

(Reporter: sunfish, Assigned: sunfish)

References

Details

(Whiteboard: [qa-])

Attachments

(19 files, 5 obsolete files)

interp-cleanup-comments.patch 11 years ago Dan Gohman [:sunfish] 2.06 KB, patch	jorendorff : review+	Details \| Diff \| Splinter Review
interp-cleanup-end-case.patch 11 years ago Dan Gohman [:sunfish] 3.14 KB, patch	terrence : review+	Details \| Diff \| Splinter Review
interp-cleanup-empty-cases.patch 11 years ago Dan Gohman [:sunfish] 4.23 KB, patch	luke : review+	Details \| Diff \| Splinter Review
interp-cleanup-macros.patch 11 years ago Dan Gohman [:sunfish] 7.02 KB, patch	luke : review+	Details \| Diff \| Splinter Review
interp-cleanup-branch.patch 11 years ago Dan Gohman [:sunfish] 3.27 KB, patch		Details \| Diff \| Splinter Review
interp-cleanup-backedge.patch 11 years ago Dan Gohman [:sunfish] 2.02 KB, patch	luke : review+	Details \| Diff \| Splinter Review
interp-cleanup-outer-for.patch 11 years ago Dan Gohman [:sunfish] 6.87 KB, patch	luke : review+	Details \| Diff \| Splinter Review
interp-cleanup-switch-mask.patch 11 years ago Dan Gohman [:sunfish] 7.70 KB, patch		Details \| Diff \| Splinter Review
interp-prep-do-macros.patch 11 years ago Dan Gohman [:sunfish] 8.83 KB, patch		Details \| Diff \| Splinter Review
interp-prep-advance-n.patch 11 years ago Dan Gohman [:sunfish] 7.32 KB, patch		Details \| Diff \| Splinter Review
interp-prep-scoping.patch 11 years ago Dan Gohman [:sunfish] 2.84 KB, patch	luke : review+	Details \| Diff \| Splinter Review
interp-indirect-goto.patch 11 years ago Dan Gohman [:sunfish] 22.06 KB, patch		Details \| Diff \| Splinter Review
interp-cleanup-branch.patch 11 years ago Dan Gohman [:sunfish] 4.79 KB, patch	luke : review+	Details \| Diff \| Splinter Review
interp-cleanup-switch-mask.patch 11 years ago Dan Gohman [:sunfish] 7.70 KB, patch	luke : review+	Details \| Diff \| Splinter Review
interp-prep-do-macros.patch 11 years ago Dan Gohman [:sunfish] 8.81 KB, patch	luke : review+	Details \| Diff \| Splinter Review
interp-prep-advance-and-dispatch.patch 11 years ago Dan Gohman [:sunfish] 7.33 KB, patch	luke : review+	Details \| Diff \| Splinter Review
interp-cleanup-case-macro-names.patch 11 years ago Dan Gohman [:sunfish] 44.87 KB, patch	luke : review+	Details \| Diff \| Splinter Review
interp-cleanup-cases.patch 11 years ago Dan Gohman [:sunfish] 3.97 KB, patch	luke : review+	Details \| Diff \| Splinter Review
interp-cleanup-var.patch 11 years ago Dan Gohman [:sunfish] 2.11 KB, patch	luke : review+	Details \| Diff \| Splinter Review
interp-indirect-goto.patch 11 years ago Dan Gohman [:sunfish] 23.43 KB, patch	luke : review+	Details \| Diff \| Splinter Review
interp-indirection.patch 11 years ago Dan Gohman [:sunfish] 100.83 KB, patch	luke : review+	Details \| Diff \| Splinter Review
interp-cleanup-interpret.patch 11 years ago Dan Gohman [:sunfish] 4.44 KB, patch	luke : review+	Details \| Diff \| Splinter Review
interp-micro-mask.patch 11 years ago Dan Gohman [:sunfish] 1.37 KB, patch	luke : review+	Details \| Diff \| Splinter Review
interp-absolute.patch 11 years ago Dan Gohman [:sunfish] 2.27 KB, patch	luke : review+	Details \| Diff \| Splinter Review

Dan Gohman [:sunfish]

Assignee

Description

•

11 years ago

For a tangentially-related experiment, I (re-)implemented an indirect-goto-based interpreter loop. I'm aware of some reasons why this may not be desirable upstream, so I don't mind if this isn't accepted, but I'm sending it upstream in case someone finds it interesting. It speeds up the interpreter by about 10-15%, but this is typically only noticeable when the JITs are disabled.

Along the way, I also made several cleanup patches, which I split out in case they are desirable even if the main indirect-goto patch is not.

Some differences between this new interp-indirect-goto.patch and the code removed in bug 799777 include:
 - It gets rid of the global (to the interpreter loop) len and op variables, and significantly reduces the scope of the switchOp variable.
 - The table contains offsets instead of absolute addresses, which is more PIC/PIE friendly (though it is slightly slower).
 - There's only one table; the interrupt mechanism is simpler.
 - The indirect-goto code shares more with the switch code; there's only one #ifdef.

Dan Gohman [:sunfish]

Assignee

Comment 1

•

11 years ago

Attached patch interp-cleanup-comments.patch — Details — Splinter Review

Assignee: general → sunfish

Dan Gohman [:sunfish]

Assignee

Comment 2

•

11 years ago

Attached patch interp-cleanup-end-case.patch — Details — Splinter Review

Dan Gohman [:sunfish]

Assignee

Comment 3

•

11 years ago

Attached patch interp-cleanup-empty-cases.patch — Details — Splinter Review

Dan Gohman [:sunfish]

Assignee

Comment 4

•

11 years ago

Attached patch interp-cleanup-macros.patch — Details — Splinter Review

Dan Gohman [:sunfish]

Assignee

Comment 5

•

11 years ago

Attached patch interp-cleanup-branch.patch (obsolete) — Details — Splinter Review

Dan Gohman [:sunfish]

Assignee

Comment 6

•

11 years ago

Attached patch interp-cleanup-backedge.patch — Details — Splinter Review

Dan Gohman [:sunfish]

Assignee

Comment 7

•

11 years ago

Attached patch interp-cleanup-outer-for.patch — Details — Splinter Review

Dan Gohman [:sunfish]

Assignee

Comment 8

•

11 years ago

Attached patch interp-cleanup-switch-mask.patch (obsolete) — Details — Splinter Review

Dan Gohman [:sunfish]

Assignee

Comment 9

•

11 years ago

Attached patch interp-prep-do-macros.patch (obsolete) — Details — Splinter Review

Dan Gohman [:sunfish]

Assignee

Comment 10

•

11 years ago

Attached patch interp-prep-advance-n.patch (obsolete) — Details — Splinter Review

Dan Gohman [:sunfish]

Assignee

Comment 11

•

11 years ago

Attached patch interp-prep-scoping.patch — Details — Splinter Review

Dan Gohman [:sunfish]

Assignee

Comment 12

•

11 years ago

Attached patch interp-indirect-goto.patch (obsolete) — Details — Splinter Review

This patch is the one the actually implements the indirect goto feature.

Dan Gohman [:sunfish]

Assignee

Comment 13

•

11 years ago

Comment on attachment 797337 [details] [diff] [review]
interp-cleanup-comments.patch

Looks like the patch in bug 877599 accidentally re-introduced an old comment. This patch removes it again, plus one other unneeded comment.

Attachment #797337 - Flags: review?(jorendorff)

Dan Gohman [:sunfish]

Assignee

Comment 14

•

11 years ago

Comment on attachment 797338 [details] [diff] [review]
interp-cleanup-end-case.patch

Some fairly straight-forward cleanups.

Attachment #797338 - Flags: review?(terrence)

Dan Gohman [:sunfish]

Assignee

Updated

•

11 years ago

Whiteboard: [leave open]

Terrence Cole [:terrence]

Comment 15

•

11 years ago

Comment on attachment 797338 [details] [diff] [review]
interp-cleanup-end-case.patch

Review of attachment 797338 [details] [diff] [review]:
-----------------------------------------------------------------

Nice! \o/

r=me

Attachment #797338 - Flags: review?(terrence) → review+

Luke Wagner [:luke]

Comment 16

•

11 years ago

We removed the indirect-goto code last time because it didn't seem to help performance and it was nice to avoid the extra macro goop.  Do you see any interesting cases where this improves performance?  (All the preparatory interp cleanup patches of course would be nice to land.)

Dan Gohman [:sunfish]

Assignee

Comment 17

•

11 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/14d93040175b

Dan Gohman [:sunfish]

Assignee

Comment 18

•

11 years ago

(In reply to Luke Wagner [:luke] from comment #16)
> We removed the indirect-goto code last time because it didn't seem to help
> performance and it was nice to avoid the extra macro goop.  Do you see any
> interesting cases where this improves performance?  (All the preparatory
> interp cleanup patches of course would be nice to land.)

I don't have any general-interest cases where this improves performance, but I haven't really looked. If I decide to propose the indirect-goto patch for review, I'll definitely collect some more data to support it. For now I'll just work on landing the cleanups.

Luke Wagner [:luke]

Comment 19

•

11 years ago

(In reply to Dan Gohman [:sunfish] from comment #18)
Sounds great!

Dan Gohman [:sunfish]

Assignee

Updated

•

11 years ago

Attachment #797339 - Flags: review?(luke)

Dan Gohman [:sunfish]

Assignee

Updated

•

11 years ago

Attachment #797340 - Flags: review?(luke)

Luke Wagner [:luke]

Updated

•

11 years ago

Attachment #797339 - Flags: review?(luke) → review+

Luke Wagner [:luke]

Updated

•

11 years ago

Attachment #797340 - Flags: review?(luke) → review+

Dan Gohman [:sunfish]

Assignee

Updated

•

11 years ago

Attachment #797342 - Flags: review?(luke)

Dan Gohman [:sunfish]

Assignee

Comment 20

•

11 years ago

Attached patch interp-cleanup-branch.patch — Details — Splinter Review

Attachment #797341 - Attachment is obsolete: true

Attachment #798641 - Flags: review?(luke)

Dan Gohman [:sunfish]

Assignee

Comment 21

•

11 years ago

Comment on attachment 797343 [details] [diff] [review]
interp-cleanup-outer-for.patch

This diff looks pretty crazy, it's mostly indentation changes. The regular cases in the interpreter loop body are left-justified, so this patch left-justifies the special cases for consistency, but feel free to disagree with that choice :-).

Attachment #797343 - Flags: review?(luke)

Dan Gohman [:sunfish]

Assignee

Comment 22

•

11 years ago

Attached patch interp-cleanup-switch-mask.patch — Details — Splinter Review

Attachment #797344 - Attachment is obsolete: true

Attachment #798647 - Flags: review?(luke)

Dan Gohman [:sunfish]

Assignee

Comment 23

•

11 years ago

Comment on attachment 797347 [details] [diff] [review]
interp-prep-scoping.patch

I named this patch with "prep", but it's really a cleanup. InvokeState contains a RootedScript, so it's nice to give it a more explicit lifetime.

Attachment #797347 - Flags: review?(luke)

Ed Morley [:emorley]

Comment 24

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/14d93040175b
https://hg.mozilla.org/mozilla-central/rev/7ce923b4b074
https://hg.mozilla.org/mozilla-central/rev/c913196e053f

Jason Orendorff [:jorendorff]

Comment 25

•

11 years ago

Comment on attachment 797337 [details] [diff] [review]
interp-cleanup-comments.patch

Review of attachment 797337 [details] [diff] [review]:
-----------------------------------------------------------------

embarrassing

Attachment #797337 - Flags: review?(jorendorff) → review+

Luke Wagner [:luke]

Updated

•

11 years ago

Attachment #797342 - Flags: review?(luke) → review+

Luke Wagner [:luke]

Updated

•

11 years ago

Attachment #797343 - Flags: review?(luke) → review+

Luke Wagner [:luke]

Updated

•

11 years ago

Attachment #797347 - Flags: review?(luke) → review+

Luke Wagner [:luke]

Comment 26

•

11 years ago

Comment on attachment 798641 [details] [diff] [review]
interp-cleanup-branch.patch

Review of attachment 798641 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/Interpreter.cpp
@@ +1303,5 @@
>  
>  #define BRANCH(n)                                                             \
>      JS_BEGIN_MACRO                                                            \
> +        int32_t nlen = (n);                                                   \
> +        regs.pc += (nlen);                                                    \

does nlen need the parens?

Attachment #798641 - Flags: review?(luke) → review+

Luke Wagner [:luke]

Updated

•

11 years ago

Attachment #798647 - Flags: review?(luke) → review+

Dan Gohman [:sunfish]

Assignee

Comment 27

•

11 years ago

(In reply to Luke Wagner [:luke] from comment #26)
> Comment on attachment 798641 [details] [diff] [review]
> interp-cleanup-branch.patch
> 
> Review of attachment 798641 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/vm/Interpreter.cpp
> @@ +1303,5 @@
> >  
> >  #define BRANCH(n)                                                             \
> >      JS_BEGIN_MACRO                                                            \
> > +        int32_t nlen = (n);                                                   \
> > +        regs.pc += (nlen);                                                    \
> 
> does nlen need the parens?

Nope.

https://hg.mozilla.org/integration/mozilla-inbound/rev/ebb5b0609fe6
https://hg.mozilla.org/integration/mozilla-inbound/rev/6df80a4fdb06
https://hg.mozilla.org/integration/mozilla-inbound/rev/d09951d9e0c0
https://hg.mozilla.org/integration/mozilla-inbound/rev/7792dc26b3e1
https://hg.mozilla.org/integration/mozilla-inbound/rev/0e876da27431
https://hg.mozilla.org/integration/mozilla-inbound/rev/bf286f1d5489

Ed Morley [:emorley]

Comment 28

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/ebb5b0609fe6
https://hg.mozilla.org/mozilla-central/rev/6df80a4fdb06
https://hg.mozilla.org/mozilla-central/rev/d09951d9e0c0
https://hg.mozilla.org/mozilla-central/rev/7792dc26b3e1
https://hg.mozilla.org/mozilla-central/rev/0e876da27431
https://hg.mozilla.org/mozilla-central/rev/bf286f1d5489

Dan Gohman [:sunfish]

Assignee

Comment 29

•

11 years ago

Attached patch interp-prep-do-macros.patch — Details — Splinter Review

Attachment #797345 - Attachment is obsolete: true

Dan Gohman [:sunfish]

Assignee

Comment 30

•

11 years ago

Attached patch interp-prep-advance-and-dispatch.patch — Details — Splinter Review

Attachment #797346 - Attachment is obsolete: true

Dan Gohman [:sunfish]

Assignee

Comment 31

•

11 years ago

Attached patch interp-cleanup-case-macro-names.patch — Details — Splinter Review

Dan Gohman [:sunfish]

Assignee

Comment 32

•

11 years ago

Attached patch interp-cleanup-cases.patch — Details — Splinter Review

Dan Gohman [:sunfish]

Assignee

Comment 33

•

11 years ago

Attached patch interp-cleanup-var.patch — Details — Splinter Review

Dan Gohman [:sunfish]

Assignee

Comment 34

•

11 years ago

Attached patch interp-indirect-goto.patch — Details — Splinter Review

Attachment #797348 - Attachment is obsolete: true

Dan Gohman [:sunfish]

Assignee

Comment 35

•

11 years ago

Attached patch interp-indirection.patch — Details — Splinter Review

This patch moves the interpreter's FrameGuard, FrameRegs, and opMask variables into the InterpreterActivation, allowing InterpreterActivation's methods  and other code to access them directly rather than through an extra pointer indirection.

Dan Gohman [:sunfish]

Assignee

Comment 36

•

11 years ago

Attached patch interp-cleanup-interpret.patch — Details — Splinter Review

Dan Gohman [:sunfish]

Assignee

Comment 37

•

11 years ago

Attached patch interp-micro-mask.patch — Details — Splinter Review

Dan Gohman [:sunfish]

Assignee

Comment 38

•

11 years ago

For straight-line code, parser speed dwarfs interpreter speed. For loops above the baseline threshold, baseline/Ion take over for the win. But there is a small window around and below the baseline threshold where code runs in the interpreter where these patches often provide 10-15% speedups.

I've added several additional cleanup patches, and have further simplified the switch/goto macros, such that the maintenance burden of the threaded interpreter code should be quite low. There's only one #if, and it's pretty small and simple.

Also worth noting is bug 922440, where there is discussion of increasing the thresholds for the JITs on b2g, if not outright disabling them, either of which would put significantly more pressure on the interpreter. I haven't measured the performance of this patch on b2g, but a speedup is not unlikely.

Dan Gohman [:sunfish]

Assignee

Comment 39

•

11 years ago

Comment on attachment 821818 [details] [diff] [review]
interp-cleanup-case-macro-names.patch

This is just a simple rename; CASE instead of BEGIN_CASE because it really is just a case (and it's just a label in the indirect-goto version).

Attachment #821818 - Flags: review?(luke)

Dan Gohman [:sunfish]

Assignee

Comment 40

•

11 years ago

Comment on attachment 821819 [details] [diff] [review]
interp-cleanup-cases.patch

This adds some missing opcodes, and makes brace usage consistent.

Attachment #821819 - Flags: review?(luke)

Dan Gohman [:sunfish]

Assignee

Comment 41

•

11 years ago

Comment on attachment 821821 [details] [diff] [review]
interp-cleanup-var.patch

This cleans up what appears to be an obsolete optimization.

Attachment #821821 - Flags: review?(luke)

Dan Gohman [:sunfish]

Assignee

Comment 42

•

11 years ago

Comment on attachment 821748 [details] [diff] [review]
interp-prep-do-macros.patch

Ok. This patch is not strictly a cleanup; it's the first step towards re-enabling the threaded interpreter.

For SunSpider on a 2012 Nexus 7 (quad-core Cortex-A9) with --no-baseline, I see a pretty clear 10% speedup overall with this patch series. Some benchmarks see more, some less, but I don't see any slowdowns. With baseline enabled, I don't see a speedup, though I don't see any slowdowns either. I see similar results on an Intel laptop and an Intel desktop machine.

I am proposing this now because:
 - With all the cleanups, the complexity of this code is significantly lower than what was removed in bug 799777, and I claim the maintenance burden is significantly reduced.
 - In situations where the interpreter is used, either because the JIT is expensive, possibly including bug 922440, or not available for some reason, it's a pretty clear 10%-ish win on every platform I'm able to test on, excluding Windows where the optimization is currently unavailable.

Attachment #821748 - Flags: review?(luke)

Tom S [:evilpie]

Comment 43

•

11 years ago

I think somebody was using SpiderMonkey on iOS, which forces them to just use the interpreter. This would probably make them quite happy.

Luke Wagner [:luke]

Updated

•

11 years ago

Attachment #821818 - Flags: review?(luke) → review+

Luke Wagner [:luke]

Updated

•

11 years ago

Attachment #821748 - Flags: review?(luke) → review+

Luke Wagner [:luke]

Updated

•

11 years ago

Attachment #821819 - Flags: review?(luke) → review+

Luke Wagner [:luke]

Comment 44

•

11 years ago

Comment on attachment 821821 [details] [diff] [review]
interp-cleanup-var.patch

Hehe, real hot.

Attachment #821821 - Flags: review?(luke) → review+

Luke Wagner [:luke]

Comment 45

•

11 years ago

(In reply to Dan Gohman [:sunfish] from comment #42)
I'm down.

Dan Gohman [:sunfish]

Assignee

Comment 46

•

11 years ago

Comment on attachment 821817 [details] [diff] [review]
interp-prep-advance-and-dispatch.patch

This is the next preparatory patch. Instead of setting len and then doing ADVANCE_AND_OP_OP(), this passes the length to the macro. It also renames the macro to ADVANCE_AND_DISPATCH.

Attachment #821817 - Flags: review?(luke)

Dan Gohman [:sunfish]

Assignee

Comment 47

•

11 years ago

Comment on attachment 821822 [details] [diff] [review]
interp-indirect-goto.patch

This is the main indirect goto patch. A few notes:

It eliminates the "op" variable, which was just a copy of *regs.pc. In theory this was an optimization, but it doesn't appear to be very important, and it does make the code more complex, since op and *regs.pc have to be manually kept in sync.

It changes the increment of regs.pc from being in a single location at advanceAndDoOp at the top of the interpreter loop to being duplicated at every ADVANCE_AND_DISPATCH. This is theoretically a minor pessimization for the case where a switch is used instead of indirect gotos, but this doesn't appear to be very significant.

Attachment #821822 - Flags: review?(luke)

Luke Wagner [:luke]

Updated

•

11 years ago

Attachment #821817 - Flags: review?(luke) → review+

Dan Gohman [:sunfish]

Assignee

Comment 48

•

11 years ago

Comment on attachment 821827 [details] [diff] [review]
interp-indirection.patch

This patch folds the FrameGuard into InterpreterActivation and moves the interpreter's FrameRegs into InterpreterActivation too. And it fold in the opMask variable as well. The main effect of this is to eliminate InterpreterActivation's pointers to the FrameRegs, FrameGuard, and opMask, and FrameGuard's pointers. More code can access the data it needs without that indirection, and this results in about a 1% speedup of interpreter speed on Intel.

The main downside is the replacement of regs by the REGS macro in most of the interpreter body.

Attachment #821827 - Flags: review?(luke)

Dan Gohman [:sunfish]

Assignee

Comment 49

•

11 years ago

Comment on attachment 821829 [details] [diff] [review]
interp-cleanup-interpret.patch

This just reorganized a few things at the top of the Interpreter function.

Attachment #821829 - Flags: review?(luke)

Dan Gohman [:sunfish]

Assignee

Comment 50

•

11 years ago

Comment on attachment 821830 [details] [diff] [review]
interp-micro-mask.patch

This is a little optimization to finish things off. jsbytecode is a uint8_t, and this makes it easier for compilers to eliminate the zero-extend to size_t after or-ing the mask in. Since this happens after each opcode before the dispatch to the next opcode, this is actually noticeable.

Attachment #821830 - Flags: review?(luke)

Luke Wagner [:luke]

Comment 51

•

11 years ago

Comment on attachment 821822 [details] [diff] [review]
interp-indirect-goto.patch

This is much nicer than it was before.

Attachment #821822 - Flags: review?(luke) → review+

Luke Wagner [:luke]

Comment 52

•

11 years ago

Comment on attachment 821827 [details] [diff] [review]
interp-indirection.patch

Nice!

Attachment #821827 - Flags: review?(luke) → review+

Luke Wagner [:luke]

Updated

•

11 years ago

Attachment #821829 - Flags: review?(luke) → review+

Luke Wagner [:luke]

Updated

•

11 years ago

Attachment #821830 - Flags: review?(luke) → review+

Luke Wagner [:luke]

Comment 53

•

11 years ago

I was looking at this all a bit more and I had two few questions:

- Why did you decide to store offsets instead of absolute addresses of labels?  Even if x86 can (I assume) fold in the displacement of &LABEL(JSOP_NOP), it still seems like an extra 4-byte immediate per jump instruction.  Is there a downside to storing the addresses of the labels directly?  It seems like it requires sizeof(void*) instead of sizeof(uint32_t) which could help on 64-bit.  Another possibility is if the compiler was able to do a better job baking in the constant offsets than constant addresses of labels, but I wouldn't think so.

- By having 'offsets' be a static const, is the compiler able to bake in a constant address?  It seems like, in the best case, DISPATCH_TO would be a single
  jmp *(ADDRESS_OF_OFFSETS + op<<2)
instruction.  Is this what you see in the generated code?

Dan Gohman [:sunfish]

Assignee

Comment 54

•

11 years ago

(In reply to Luke Wagner [:luke] from comment #53)
> I was looking at this all a bit more and I had two few questions:
> 
> - Why did you decide to store offsets instead of absolute addresses of
> labels?  Even if x86 can (I assume) fold in the displacement of
> &LABEL(JSOP_NOP), it still seems like an extra 4-byte immediate per jump
> instruction.  Is there a downside to storing the addresses of the labels
> directly?  It seems like it requires sizeof(void*) instead of
> sizeof(uint32_t) which could help on 64-bit.  Another possibility is if the
> compiler was able to do a better job baking in the constant offsets than
> constant addresses of labels, but I wouldn't think so.

Absolute addresses in static initializers require load-time relocations in PIC mode. See the remark about shared libraries in GCC's manual:

[0] http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html

I'll add this info as a comment.

> - By having 'offsets' be a static const, is the compiler able to bake in a
> constant address?  It seems like, in the best case, DISPATCH_TO would be a
> single
>   jmp *(ADDRESS_OF_OFFSETS + op<<2)
> instruction.  Is this what you see in the generated code?

Offsets actually make it slightly slower at runtime. On x64, the sequence with offsets is:

  movslq (%r12,%rax,4),%rax
  addq   %r14,%rax
  jmpq   *%rax

It's easy enough to switch to absolute addresses if you prefer.

Luke Wagner [:luke]

Comment 55

•

11 years ago

(In reply to Dan Gohman [:sunfish] from comment #54)
> It's easy enough to switch to absolute addresses if you prefer.

I don't have any a priori preferences, I was just wanting to understand the choices.  Thanks, I fogot about PIC code.  Assuming we just eat the load-time cost, though, is there any noticeable speed difference using absolute addresses?

>   movslq (%r12,%rax,4),%rax
>   addq   %r14,%rax
>   jmpq   *%rax

I assume %r12 is the address of 'offsets' and %r14 is the LABEL(JSOP_NOP)?  If so, are these kept in registers or do most cases have to fill these registers from the stack or somewhere else?

Dan Gohman [:sunfish]

Assignee

Comment 56

•

11 years ago

(In reply to Luke Wagner [:luke] from comment #55)
> (In reply to Dan Gohman [:sunfish] from comment #54)
> > It's easy enough to switch to absolute addresses if you prefer.
> 
> I don't have any a priori preferences, I was just wanting to understand the
> choices.  Thanks, I fogot about PIC code.  Assuming we just eat the
> load-time cost, though, is there any noticeable speed difference using
> absolute addresses?

In some ad-hoc testing, absolute addresses appear to be about a 1% speedup at runtime.

> >   movslq (%r12,%rax,4),%rax
> >   addq   %r14,%rax
> >   jmpq   *%rax
> 
> I assume %r12 is the address of 'offsets' and %r14 is the LABEL(JSOP_NOP)? 

Yep.

> If so, are these kept in registers or do most cases have to fill these
> registers from the stack or somewhere else?

It looks like they're in registers most of the time. They get rematerialized in a few places, but most of the time they're preserved.

Dan Gohman [:sunfish]

Assignee

Comment 57

•

11 years ago

Attached patch interp-absolute.patch — Details — Splinter Review

Here's a patch which changes the interpreter to use absolute addresses.

On x86-64 Linux for example, it increases the number of R_X86_64_RELATIVE relocations in libmozjs-27.0a.so from 25709 to 25965. I hadn't realized it was already that high.  Given that, the added cost is possibly not very significant, so the 1% runtime speedup may be worth it.

Attachment #822913 - Flags: review?(luke)

Luke Wagner [:luke]

Comment 58

•

11 years ago

(In reply to Dan Gohman [:sunfish] from comment #56)
Thanks for looking into this!

> It looks like they're in registers most of the time. They get rematerialized
> in a few places, but most of the time they're preserved.

Wow, I'm impressed.  I mean, to know that 'offsets' and 'LABEL(JSOP_NOP)' are in registers at the beginning of a case, the compiler has to know that they are in registers, and the same registers at end of every case.  (Of course they are, since each indirect goto is using 'offsets' and 'LABEL(JSOP_NOP)', but still :)

Luke Wagner [:luke]

Comment 59

•

11 years ago

Comment on attachment 822913 [details] [diff] [review]
interp-absolute.patch

Review of attachment 822913 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/Interpreter.cpp
@@ +1220,4 @@
>  # define OPPAD(v)                                                             \
> +    (v) == EnableInterruptsPseudoOpcode ?                                     \
> +    LABEL(EnableInterruptsPseudoOpcode) :                                     \
> +    LABEL(default),

SM style is ? and : at the beginning of the line, aligned with the condition.

Alternatively, what if you added a third OPINT (in keeping with the 5-char macros :) that was only used for EnableInterruptsPseudoOpcode to avoid the ternary operator (which I realize should be const-evaluated; it might just read a bit better).

Attachment #822913 - Flags: review?(luke) → review+

Dan Gohman [:sunfish]

Assignee

Comment 60

•

11 years ago

(In reply to Luke Wagner [:luke] from comment #59)
> Comment on attachment 822913 [details] [diff] [review]
> interp-absolute.patch
> 
> Review of attachment 822913 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/vm/Interpreter.cpp
> @@ +1220,4 @@
> >  # define OPPAD(v)                                                             \
> > +    (v) == EnableInterruptsPseudoOpcode ?                                     \
> > +    LABEL(EnableInterruptsPseudoOpcode) :                                     \
> > +    LABEL(default),
> 
> SM style is ? and : at the beginning of the line, aligned with the condition.

Fixed. I also put parens around the whole thing.

> Alternatively, what if you added a third OPINT (in keeping with the 5-char
> macros :) that was only used for EnableInterruptsPseudoOpcode to avoid the
> ternary operator (which I realize should be const-evaluated; it might just
> read a bit better).

I considered this, but I also like keeping the Interpreter's implementation details out of jsopcode.tbl as much as possible. OPPAD is intrusion enough :-}.

remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/0f80724faca5
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/88e167fa6f4b
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/d4322df6c15e
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/c5186c7a40ac
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/af637e56ce12
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/fb2bf717ab24
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/953dc75f2bb9
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/a2074887deb6
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/9a5e42743781
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/9540960e67a2
~                                                                                        
~                                                                                        
~                                                                                        
~                                                                                        
~                                                                                        
~                                                                                        
~                                                                                        
~                                                                                        
~

Whiteboard: [leave open]

Phil Ringnalda (:philor)

Comment 61

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/0f80724faca5
https://hg.mozilla.org/mozilla-central/rev/88e167fa6f4b
https://hg.mozilla.org/mozilla-central/rev/d4322df6c15e
https://hg.mozilla.org/mozilla-central/rev/c5186c7a40ac
https://hg.mozilla.org/mozilla-central/rev/af637e56ce12
https://hg.mozilla.org/mozilla-central/rev/fb2bf717ab24
https://hg.mozilla.org/mozilla-central/rev/953dc75f2bb9
https://hg.mozilla.org/mozilla-central/rev/a2074887deb6
https://hg.mozilla.org/mozilla-central/rev/9a5e42743781
https://hg.mozilla.org/mozilla-central/rev/9540960e67a2

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla28

Ryan VanderMeulen [:RyanVM]

Updated

•

11 years ago

Blocks: 450320

Ioana (away)

Updated

•

10 years ago

Whiteboard: [qa-]

You need to log in before you can comment on or make changes to this bug.