Closed Bug 1366241 Opened 8 years ago Closed 8 years ago

Use of BinarySearch is slow in nsHtml5ElementName::elementNameByBuffer and nsHTML5AttributeName::nameByBuffer

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla56

Project Flags:

Performance Impact

high

Tracking Flags:

Tracking

Status

firefox56

---

fixed

People

(Reporter: smaug, Assigned: wchen)

References

(Depends on 1 open bug)

Details

Attachments

(2 files)

Java changes - Change memory layout of element name and attribute name hashes in HTML parser from sorted to level order BST in order to take advantage of cache during lookup. 8 years ago William Chen [:wchen] 50.54 KB, patch	hsivonen : review+	Details \| Diff \| Splinter Review
cpp changes - Change memory layout of element name and attribute name hashes in HTML parser from sorted to level order BST in order to take advantage of cache during lookup 8 years ago William Chen [:wchen] 89.08 KB, patch		Details \| Diff \| Splinter Review

Olli Pettay [:smaug][bugs@pettay.fi]

Reporter

Description

•

8 years ago

See https://bugzilla.mozilla.org/show_bug.cgi?id=1351857#c16 and comments onward. qf:p1 is perhaps a bit too high priority, but this shouldn't be too difficult to fix and could make us faster than Chrome.

Olli Pettay [:smaug][bugs@pettay.fi]

Reporter

Comment 1

•

8 years ago

... in Bug 1347525

Hsin-Yi Tsai (she/her) [:hsinyi]

Comment 2

•

8 years ago

Henri, would you be able to take this?

Flags: needinfo?(hsivonen)

Henri Sivonen (:hsivonen)

Comment 3

•

8 years ago

(In reply to Hsin-Yi Tsai [:hsinyi] from comment #2) > Henri, would you be able to take this? Yes, but it's unclear to me what should be done here. The right thing to do would be to come up with a perfect hash function, but I'm unaware of suitable code generators. The alternative is to have a cache for making particular benchmarks look good. I guess I'll try to find a suitable code generator.

Flags: needinfo?(hsivonen)

Olli Pettay [:smaug][bugs@pettay.fi]

Reporter

Comment 4

•

8 years ago

It is not just that particular benchmark. It should catch common cases usually. We use similar caches in many places (and have used for a long time) and they tend to work rather well. Perfect hash would be of course the best option.

Henri Sivonen (:hsivonen)

Comment 5

•

8 years ago

Let's see if StackOverflow yields the needed code generator: https://stackoverflow.com/questions/44133839/minimal-perfect-hash-function-code-generator-for-java

Hsin-Yi Tsai (she/her) [:hsinyi]

Updated

•

8 years ago

Priority: -- → P2

Kan-Ru Chen [:kanru] (UTC+9)

Updated

•

8 years ago

Blocks: qf-bugs-upforgrabs

Andrew Overholt [:overholt]

Comment 6

•

8 years ago

Seems like Henri's working on this.

Assignee: nobody → hsivonen

Andrew Overholt [:overholt]

Updated

•

8 years ago

No longer blocks: qf-bugs-upforgrabs

(no longer active)

Comment 7

•

8 years ago

If I'm not mistaken, the ELEMENT_HASHES table has 204 entries which means that it won't fit in a cache line. Doing a binary search over it in the usual way is pretty cache hostile. Over the weekend I did some research over what cache friendly binary search algorithms are available, and it turns out that with a bit smarter arrangement of the entries in the array instead of just sorting them, you can achieve much better cache efficiency rates while performing the binary search, by mutating the access patterns in order to transform them into linear advances into the array instead of randomly jumping around over the array which a normal binary search over the array would achieve. Furthermore, it makes it so that the first few accesses into the array for the first few steps of the binary search happen close to each other, increasing the likelihood that after the first access, the next few will be in the fetched cache line. This blog post does a great job at explaining the concept with some helpful graphs: http://bannalia.blogspot.ca/2015/06/cache-friendly-binary-search.html This recent survey looks very interesting: https://arxiv.org/abs/1509.05053 Based on this, and given that this data is sorted statically, perhaps we should consider switching it to Eytzinger sorting order?

(no longer active)

Updated

•

8 years ago

Depends on: 1371043

Andrew Overholt [:overholt]

Updated

•

8 years ago

Assignee: hsivonen → wchen

William Chen [:wchen]

Assignee

Comment 8

•

8 years ago

I changed the memory layout of the hashes for both element names and attribute names and profiled against the testcases in bug 1347525. Without the patch, the time spent in nsHtml5ElementName::elementNameByBuffer and nsHTML5AttributeName::nameByBuffer is approximately 8% of FragmentOrElement::SetInnerHTMLInternal. After the patch, it's approximately 4%.

William Chen [:wchen]

Assignee

Comment 9

•

8 years ago

Attached patch Java changes - Change memory layout of element name and attribute name hashes in HTML parser from sorted to level order BST in order to take advantage of cache during lookup. — Details — Splinter Review

Attachment #8880240 - Flags: review?(hsivonen)

William Chen [:wchen]

Assignee

Comment 10

•

8 years ago

Attached patch cpp changes - Change memory layout of element name and attribute name hashes in HTML parser from sorted to level order BST in order to take advantage of cache during lookup — Details — Splinter Review

Henri Sivonen (:hsivonen)

Comment 11

•

8 years ago

Comment on attachment 8880241 [details] [diff] [review] cpp changes - Change memory layout of element name and attribute name hashes in HTML parser from sorted to level order BST in order to take advantage of cache during lookup I looks like this is not the intended attachment.

Henri Sivonen (:hsivonen)

Comment 12

•

8 years ago

Comment on attachment 8880240 [details] [diff] [review] Java changes - Change memory layout of element name and attribute name hashes in HTML parser from sorted to level order BST in order to take advantage of cache during lookup. Review of attachment 8880240 [details] [diff] [review]: ----------------------------------------------------------------- Thank you. r+ even if the nit isn't fixed, but I'd prefer the nit to be fixed unless there's a translation reason not to. ::: src/nu/validator/htmlparser/impl/AttributeName.java @@ +297,5 @@ > // XXX deal with offset > @Unsigned int hash = AttributeName.bufToHash(buf, length); > + int[] hashes; > + hashes = AttributeName.ATTRIBUTE_HASHES; > + int index = levelOrderBinarySearch(hashes, hash); Why isn't AttributeName.ATTRIBUTE_HASHES passed directly to levelOrderBinarySearch? Why is int[] hashes declared alone on a separate line? If these oddities are needed for translator quirks, they are OK. Otherwise, it would be nicer to pass AttributeName.ATTRIBUTE_HASHES to levelOrderBinarySearch directly.

Attachment #8880240 - Flags: review?(hsivonen) → review+

Pulsebot

Comment 13

•

8 years ago

Pushed by wchen@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/b40afe5dca11 Change memory layout of element name and attribute name hashes in HTML parser from sorted to level order BST in order to take advantage of cache during lookup. r=hsivonen

William Chen [:wchen]

Assignee

Comment 14

•

8 years ago

(In reply to Henri Sivonen (:hsivonen) from comment #12) > Comment on attachment 8880240 [details] [diff] [review] > Java changes - Change memory layout of element name and attribute name > hashes in HTML parser from sorted to level order BST in order to take > advantage of cache during lookup. > > Review of attachment 8880240 [details] [diff] [review]: > ----------------------------------------------------------------- > > Thank you. r+ even if the nit isn't fixed, but I'd prefer the nit to be > fixed unless there's a translation reason not to. > > ::: src/nu/validator/htmlparser/impl/AttributeName.java > @@ +297,5 @@ > > // XXX deal with offset > > @Unsigned int hash = AttributeName.bufToHash(buf, length); > > + int[] hashes; > > + hashes = AttributeName.ATTRIBUTE_HASHES; > > + int index = levelOrderBinarySearch(hashes, hash); > > Why isn't AttributeName.ATTRIBUTE_HASHES passed directly to > levelOrderBinarySearch? Why is int[] hashes declared alone on a separate > line? If these oddities are needed for translator quirks, they are OK. > Otherwise, it would be nicer to pass AttributeName.ATTRIBUTE_HASHES to > levelOrderBinarySearch directly. Yeah, it's due to a translator quirk.

Carsten Book [:Tomcat]

Comment 15

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/b40afe5dca11

Status: NEW → RESOLVED

Closed: 8 years ago

status-firefox56: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla56

Dave Hunt [:davehunt] [he/him] ⌚BST

Updated

•

3 years ago

Performance Impact: --- → P1

Whiteboard: [qf:p1]

You need to log in before you can comment on or make changes to this bug.