Closed Bug 1171842 Opened 9 years ago Closed 9 years ago

Use jump table to replace the nested if statements in style struct list

Status:

RESOLVED FIXED

Milestone:

mozilla41

Tracking Flags:

firefox41: tracking ---, status fixed

People

(Reporter: xidorn, Assigned: xidorn)

Details

Attachments

(2 files, 1 obsolete file)

patch (8.40 KB, patch), 9 years ago, Xidorn Quan [:xidorn] UTC+11, dbaron: review+
walktree_pref.zip (3.93 KB, application/zip), 9 years ago, Xidorn Quan [:xidorn] UTC+11
walktree_pref.zip (3.98 KB, application/x-zip-compressed), 9 years ago, Xidorn Quan [:xidorn] UTC+11

Xidorn Quan [:xidorn] UTC+11

Assignee

Description

9 years ago

Currently, we generate a tree of nested if statements to select the compute function for a given SID (style struct ID). We should replace it with an array of compute methods and dispatch through an indirect call.

The main advantage of this change is that it should improve performance. Chains of if statements can have a significantly negative effect on branch prediction, and the method array removes all of those conditions. It should also reduce the size of both the generated code and the final binary.
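A minimal sketch of the two dispatch styles described above, for comparison. Everything here is a hypothetical stand-in: the four-entry `StyleStructID` enum, the `RuleNode` type, and the `Compute*Data` bodies (which just return a tag) are assumptions for illustration, not the generated code from the patch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical style struct IDs; the real list is generated and much longer.
enum StyleStructID : std::size_t { eFont, eColor, eText, eDisplay, eStructCount };

struct RuleNode {
  // Stub compute methods. Each returns a distinct tag so callers can tell
  // which one was selected.
  int ComputeFontData() { return 10; }
  int ComputeColorData() { return 20; }
  int ComputeTextData() { return 30; }
  int ComputeDisplayData() { return 40; }

  int WalkWithIfTree(StyleStructID sid);
  int WalkWithTable(StyleStructID sid);
};

// If-tree style: a binary search over the ID, one conditional branch per level.
int RuleNode::WalkWithIfTree(StyleStructID sid) {
  if (sid < eText) {
    if (sid == eFont) return ComputeFontData();
    return ComputeColorData();
  }
  if (sid == eText) return ComputeTextData();
  return ComputeDisplayData();
}

// Table style: index an array of pointers to member functions and make one
// indirect call, with no conditional branches on the ID.
int RuleNode::WalkWithTable(StyleStructID sid) {
  using ComputeFunc = int (RuleNode::*)();
  static const ComputeFunc kTable[eStructCount] = {
      &RuleNode::ComputeFontData,
      &RuleNode::ComputeColorData,
      &RuleNode::ComputeTextData,
      &RuleNode::ComputeDisplayData,
  };
  assert(sid < eStructCount);  // the table must cover every ID
  return (this->*kTable[sid])();
}
```

Both functions select the same method for every ID; the table order has to match the enum order for that to hold.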

Xidorn Quan [:xidorn] UTC+11

Assignee

Comment 1

9 years ago

Attached patch: patch

Attachment #8615860 - Flags: review?(dbaron)

Xidorn Quan [:xidorn] UTC+11

Assignee

Comment 2

9 years ago

Seems to be the reverse of bug 210550 :)

Jump tables are said to hurt the pipeline and cause cache misses, but shouldn't branch misprediction hurt performance more?

Intel replaced the big switch in the Python bytecode interpreter with a jump table and saw a 15%-20% performance gain, which comes from removing a single condition check. [1]

Hmm, but this is three conditions vs. one indirect function call, and I'm not sure which is better. But as processor pipelines get longer, I suppose branch misprediction should cost more.

[1] http://eli.thegreenplace.net/2012/07/12/computed-goto-for-efficient-dispatch-tables

Summary: Use method array to replace the nested if statements in style struct list → Use jump table to replace the nested if statements in style struct list

Xidorn Quan [:xidorn] UTC+11

Assignee

Comment 3

9 years ago

It seems the main cost of an indirect call is that the processor has to wait until the target is known. But the compiler should be able to reorder the code, and the processor can also execute out of order to mitigate this cost. If they can't, we could simply move the code that fetches the function pointer to the beginning of the WalkRuleTree function.
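A rough sketch of that fallback: load the function pointer first, do work that does not depend on it, then make the indirect call. All names here (`Node`, `ComputeA`, `ComputeB`, `kComputeFuncs`, `WalkSketch`) are hypothetical and much simpler than the real WalkRuleTree; whether the reordering helps at all is something only a measurement can show, since the compiler or the CPU may already schedule the load early.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-ins; none of these names come from the patch.
struct Node {
  Node* parent = nullptr;
  int depth = 0;
};

using ComputeFunc = int (*)(const Node&);

int ComputeA(const Node& n) { return 100 + n.depth; }
int ComputeB(const Node& n) { return 200 + n.depth; }

constexpr std::size_t kCount = 2;
const ComputeFunc kComputeFuncs[kCount] = {ComputeA, ComputeB};

int WalkSketch(const Node& start, std::size_t sid) {
  assert(sid < kCount);

  // Fetch the call target up front, so the load does not sit directly in
  // front of the indirect call.
  const ComputeFunc compute = kComputeFuncs[sid];

  // Work that is independent of the target: walk up to the root node.
  const Node* root = &start;
  while (root->parent) {
    root = root->parent;
  }

  // The target has been available since the top of the function.
  return compute(*root);
}
```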

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 4

9 years ago

Comment on attachment 8615860 (patch)

Please test the performance, and re-request review if it actually *is* faster.

Attachment #8615860 - Flags: review?(dbaron)

Xidorn Quan [:xidorn] UTC+11

Assignee

Comment 5

9 years ago

Attached file: walktree_pref.zip (obsolete)

You can run this perf test by executing "make" in the extracted directory. The column after "original: " is the time for the current if-tree style. The column after "new: " is the time for the jump table style.

On my machine, this test shows that the new method is at least ~17% faster than the if-tree style.

Xidorn Quan [:xidorn] UTC+11

Assignee

Updated

9 years ago

Attachment #8615860 - Flags: review?(dbaron)

Xidorn Quan [:xidorn] UTC+11

Assignee

Comment 6

9 years ago

If I move the rand() call out of the inner loops, either by moving it to the outer loop or by initializing a random table at the very beginning, the test shows the jump table is ~40% faster than the if-tree on my machine.
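A sketch of the "random table at the very beginning" variant, so that random number generation stays out of the timed loop. This is not the code from the attached zip: the four stub functions, `MakeRandomIds`, and `TimeMs` are hypothetical, and it uses `<random>` and `<chrono>` rather than rand().

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

constexpr std::size_t kStructCount = 4;  // hypothetical number of IDs

int F0() { return 1; }
int F1() { return 2; }
int F2() { return 3; }
int F3() { return 4; }

// If-tree dispatch over the ID.
int DispatchIfTree(std::size_t id) {
  if (id < 2) {
    if (id == 0) return F0();
    return F1();
  }
  if (id == 2) return F2();
  return F3();
}

// Table dispatch: one indexed load and one indirect call.
int DispatchTable(std::size_t id) {
  using Fn = int (*)();
  static const Fn kTable[kStructCount] = {F0, F1, F2, F3};
  return kTable[id]();
}

// Generate all random IDs before timing starts.
std::vector<std::size_t> MakeRandomIds(std::size_t n, std::uint32_t seed) {
  std::mt19937 rng(seed);
  std::uniform_int_distribution<std::size_t> dist(0, kStructCount - 1);
  std::vector<std::size_t> ids(n);
  for (auto& id : ids) id = dist(rng);
  return ids;
}

// Time one dispatcher over the pre-generated IDs. The checksum is returned
// through an out-parameter so the calls have a visible result.
template <typename Dispatch>
double TimeMs(const std::vector<std::size_t>& ids, Dispatch dispatch,
              std::uint64_t& checksum) {
  const auto start = std::chrono::steady_clock::now();
  std::uint64_t sum = 0;
  for (std::size_t id : ids) sum += static_cast<std::uint64_t>(dispatch(id));
  const auto end = std::chrono::steady_clock::now();
  checksum = sum;
  return std::chrono::duration<double, std::milli>(end - start).count();
}
```

Calling `TimeMs(ids, DispatchIfTree, sum1)` and `TimeMs(ids, DispatchTable, sum2)` on the same `ids` should give equal checksums; the timings themselves depend on the machine, compiler, and flags.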

Xidorn Quan [:xidorn] UTC+11

Assignee

Comment 7

9 years ago

Attached file: walktree_pref.zip

Attachment #8617104 - Attachment is obsolete: true

Xidorn Quan [:xidorn] UTC+11

Assignee

Comment 8

9 years ago

Note that if LTO is enabled for this perf test, the if-tree version has an advantage: the compiler inlines the target Compute*Data methods and then reduces the whole WalkRuleTree1 method to a no-op, since nothing is actually done in it, whereas it cannot make the same deduction for WalkRuleTree2.

However, if you add any code to the Compute*Data methods, the jump table is still ~10% faster, even though the compiler can inline those methods and thus remove one function call for the if-tree. Since our Compute*Data methods are generally long enough that the compiler usually won't inline them, the jump table is much faster.
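One way to "add any code" to the stubs so that an optimizer cannot reduce the benchmark to a no-op is to give each stub an observable side effect. The sketch below is an assumption about how that could look, not what the attached test does; `gSink`, `ComputeStubA`, `ComputeStubB`, and `RunStubs` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Accesses to a volatile object are observable behavior, so the compiler has
// to keep this work even if it inlines the stubs.
volatile std::uint64_t gSink = 0;

void ComputeStubA() { gSink = gSink + 1; }
void ComputeStubB() { gSink = gSink + 2; }

using StubFunc = void (*)();
const StubFunc kStubs[] = {ComputeStubA, ComputeStubB};
constexpr std::size_t kStubCount = sizeof(kStubs) / sizeof(kStubs[0]);

// Calls every stub `rounds` times through the table and returns the sink.
std::uint64_t RunStubs(std::size_t rounds) {
  gSink = 0;
  for (std::size_t r = 0; r < rounds; ++r) {
    for (std::size_t i = 0; i < kStubCount; ++i) {
      kStubs[i]();
    }
  }
  return gSink;
}
```

The volatile accesses add their own cost to both variants, so this only keeps the comparison honest; it does not model how long the real Compute*Data methods take.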

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Updated

9 years ago

Attachment #8615860 - Flags: review?(dbaron) → review+

Pulsebot

Comment 9

9 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/2b37fea15848

Ryan VanderMeulen [:RyanVM]

Comment 10

9 years ago

https://hg.mozilla.org/mozilla-central/rev/2b37fea15848

Assignee: nobody → quanxunzhen

Status: NEW → RESOLVED

Closed: 9 years ago

status-firefox41: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla41

Categories

(Core :: CSS Parsing and Computation, defect)
