Closed Bug 502778 Opened 16 years ago Closed 15 years ago

nanojit: speed up CseFilter

Tracking

()

Status:

RESOLVED FIXED

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

Details

(Whiteboard: fixed-in-nanojit fixed-in-tracemonkey)

Attachments

(2 files, 1 obsolete file)

patch 16 years ago Nicholas Nethercote [inactive] 12.50 KB, patch	edwsmith : review+	Details \| Diff \| Splinter Review
completely overhauled patch 15 years ago Nicholas Nethercote [inactive] 31.64 KB, patch	edwsmith : review+	Details \| Diff \| Splinter Review
SunSpider results for patch v2 15 years ago Nicholas Nethercote [inactive] 3.52 KB, text/plain		Details

Nicholas Nethercote [inactive]

Assignee

Description

•

16 years ago

Attached patch patch (obsolete) — Details — Splinter Review

This patch splits the CseFilter's LInsHashSet into seven, one for each instruction group. This save somewhere between 3--6ms in SS. It also increases the starting size of these LInsHashSets from 64--256, which seems to help another couple of ms, but the difference is small enough that it's difficult to claim it's decisive. The patch also does a little bit of cleaning up: - renames some of the hash*() and find*() functions so that the suffixes match the ins*() functions. - avoids an unnecessary argument to hash() by computing hashcode() inside it - reorders add() so that the call to find() isn't necessary (it adds the item then checks for growth, so the find() within grow() covers the newly added element) - Removes CseReader, which is dead in TM and Tamarin - removes LInsHashSet::replace(); also dead I tried some further changes -- splitting add(), find() and grow() into more specific kinds (addImm(), addImmq(), etc) but it didn't help speed at all.

Attachment #387130 - Flags: review?(edwsmith)

Edwin Smith

Updated

•

16 years ago

Attachment #387130 - Flags: review?(edwsmith) → review+

Edwin Smith

Comment 1

•

16 years ago

Comment on attachment 387130 [details] [diff] [review] patch A comment in CseFilter for the rationale behind N tables vs 1 would be nice, otherwise looks great.

Nicholas Nethercote [inactive]

Assignee

Comment 2

•

15 years ago

Attached patch completely overhauled patch — Details — Splinter Review

The previous patch's effect somehow disappeared in the time since I first wrote it. Here is a new patch. It's more extensive. (Nb: If you read LIR.h changes before LIR.cpp changes it'll make more sense.) - The separation of one hash table into nine is now done within LInsHashSet rather than outside it. This was necessary for some of the subsequent changes. The initial size of each one can now be controlled independently from the others, which is good as some of them get more use than others. - Previously, we had a mixture of specialised and generic operations: findXYZ() was specialised for the instruction kind (Imm, Immq, etc) when inserting, but add()/find()/grow()/equals()/hashcode() were generic, working with any instruction kind. This was a bit sucky because certain operations were effectively coded twice, most notably find(), and if the implementations didn't match then problems would occur. Also, we had to do switches in various places, most notably hashcode() and equals(). The patch gets rid of find(), equals() and hashcode() by making add() and grow() specialised for the instruction kind (simply by taking a LInsKind argument, which then causes the appropriate findXYZ() function to be used). This required introducing additional findXYZ() wrapper functions that take a LInsp argument, and putting them the m_find[] array. - Now handling insImmf() separately from insImmq(). - Made variable naming more consistent: 'ins' for instructions, 'k' for hashtable indices. And used the LInsKind names (eg. "Imm", "Immq", "Call") more consistently (eg. avoided "imm", "call"). As well as improving the code's readability, it's a performance win because the switches are avoided in hashcode() and equals(), and because the hash tables are split and sized appropriately for the instruction kinds which means that grow() is called about 1/2 as often. I'm currently getting about a 4ms SS speedup. This might be optimistic -- some of the tests (eg. 3d-morph) have sped up surprisingly -- but the tests that should speed up have (crypto-md5, 3d-raytrace, date-format-xparb) and I've confirmed with Cachegrind that fewer instructions are being run so it's a clear win.

Attachment #387130 - Attachment is obsolete: true

Attachment #399129 - Flags: review?(edwsmith)

Nicholas Nethercote [inactive]

Assignee

Comment 3

•

15 years ago

Attached file SunSpider results for patch v2 — Details

Nicholas Nethercote [inactive]

Assignee

Updated

•

15 years ago

Blocks: 505662

Edwin Smith

Comment 4

•

15 years ago

Comment on attachment 399129 [details] [diff] [review] completely overhauled patch it would be nice to allow kInitialCaps[kind] to be zero. a growth formula like this might work: newsize = oldsize + oldsize/2 + K where K is a largish initial bump, and the overall 1.5x growth factor has less waste when the tables get big (and the math works nice for an initial size of 0)

Attachment #399129 - Flags: review?(edwsmith) → review+

Nicholas Nethercote [inactive]

Assignee

Comment 5

•

15 years ago

Ed, the current hashing algorithm requires that the table sizes are multiples of two, because they use a bitmask. Keeping multiples of two seems simpler.

Nicholas Nethercote [inactive]

Assignee

Updated

•

15 years ago

Blocks: 524632

Nicholas Nethercote [inactive]

Assignee

Comment 6

•

15 years ago

http://hg.mozilla.org/projects/nanojit-central/rev/98b64e95f3f9

Whiteboard: fixed-in-nanojit

David Anderson [:dvander] - inactive, e-mail if emergency

Comment 7

•

15 years ago

http://hg.mozilla.org/tracemonkey/rev/1f4284d6be93

Whiteboard: fixed-in-nanojit → fixed-in-nanojit fixed-in-tracemonkey

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Updated

•

15 years ago

Depends on: 527288

Rick Reitmaier

Comment 8

•

15 years ago

This code is failing on tamarin when SOFTFLOAT is defined (e.g. under winmo). In LInsHashSet.grow() the loop over oldList there is a LIR_qjoin entry. I would imagine that this is incorrect and we should only have call's in the list. It appears that the SoftFloatFilter is interacting poorly with this new cse filter (i.e. moving SoftFloatFilter downstream of the cse filter cures these symptoms). Will require further investigation as moving the filter is not a solution, it will effectively disable optimization of any SoftFloatFilter changes.

Nicholas Nethercote [inactive]

Assignee

Comment 9

•

15 years ago

Rick, the patch wasn't supposed to change functional behaviour of the CseFilter at all. But it did. Can you try the patch in bug 527288?

Rick Reitmaier

Comment 10

•

15 years ago

still see the issue post-patch.

Rick Reitmaier

Updated

•

15 years ago

Blocks: 527754

Nicholas Nethercote [inactive]

Assignee

Comment 11

•

15 years ago

(In reply to comment #10) > still see the issue post-patch. Bug 527754 is the follow-up bug for the SoftFloat issue.

Robert Sayre

Comment 12

•

15 years ago

http://hg.mozilla.org/mozilla-central/rev/1f4284d6be93

Status: ASSIGNED → RESOLVED

Closed: 15 years ago

Resolution: --- → FIXED

You need to log in before you can comment on or make changes to this bug.

Bugzilla

nanojit: speed up CseFilter

Categories

(Core :: JavaScript Engine, defect)

Tracking

()

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

Details

(Whiteboard: fixed-in-nanojit fixed-in-tracemonkey)

Crash Data

Security

(public)

User Story

Attachments

(2 files, 1 obsolete file)

Description

Updated

Comment 1

Comment 2

Comment 3

Updated

Comment 4

Comment 5

Updated

Comment 6

Comment 7

Updated

Comment 8

Comment 9

Comment 10

Updated

Comment 11

Comment 12

Attachment

General

Description

File Name

Content Type