Closed Bug 1061339 - Opened 5 years ago, Closed 5 years ago

Enable AVX2 in libvpx for VS2012+

Categories

(Firefox Build System :: General, defect)

x86_64
Windows 7
defect
Not set

Tracking

(firefox38 fixed)

RESOLVED FIXED
mozilla38
Tracking Status
firefox38 --- fixed

People

(Reporter: dmajor, Assigned: m_kato)

Details

Attachments

(2 files, 4 obsolete files)

This was already done for libyuv in bug 1054811. Let's do the same for libvpx. (I don't see any other uses besides those two libs)
I think the patch is very easy.
Attachment #8482628 - Flags: review?(ted)
I think HAVE_X86_AVX2 is enough
Attachment #8482628 - Attachment is obsolete: true
Attachment #8482628 - Flags: review?(ted)
Attachment #8482755 - Flags: review?(ted)
(In reply to xunxun from comment #2)
> Created attachment 8482755 [details] [diff] [review]
> Enable AVX2 in libvpx for VS2012+
> 
> I think HAVE_X86_AVX2 is enough

Not enough.  You have to generate all of the headers, such as vp9_rtcd_x86-win32-vs11.h, from the original source code:
- Running configure in libvpx-1.3.0 generates vpx_config.* for that platform.
- Running make after configure generates vp9_rtcd.h, vp8_rtcd.h and vpx_scale_rtcd.h for that platform.
- The build then includes these as vp8_rtcd.h, vp9_rtcd.h, vpx_config.asm and vpx_config.h.

Only the vp9_rtcd.h generated for the VS2012+ targets will call the AVX2 code, so your patch alone never calls the AVX2 intrinsics.
Also, vpx_rtcd.h isn't used at the moment; that is bug 1061538.
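
For reference, the regeneration step is roughly the following (a sketch against an upstream libvpx-1.3.0 checkout; the exact configure flags used for the in-tree headers may differ):

$ cd libvpx-1.3.0
$ ./configure --target=x86-win32-vs11 --disable-examples
  # configure writes vpx_config.h / vpx_config.asm / vpx_config.c for this target
$ make
  # make then generates vp8_rtcd.h, vp9_rtcd.h and vpx_scale_rtcd.h for this target
# repeat with the other targets (e.g. --target=x86_64-win64-vs11) and copy the
# generated files into media/libvpx/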
(In reply to Makoto Kato (:m_kato) from comment #4)
> - Running configure in libvpx-1.3.0 generates vpx_config.* for that platform.
> - Running make after configure generates vp9_rtcd.h, vp8_rtcd.h and
> vpx_scale_rtcd.h for that platform.
> - The build then includes these as vp8_rtcd.h, vp9_rtcd.h, vpx_config.asm
> and vpx_config.h.
> 
> Only the vp9_rtcd.h generated for the VS2012+ targets will call the AVX2
> code, so your patch alone never calls the AVX2 intrinsics.

Can you take the bug?

I found that the Mozilla in-tree libvpx 1.3.0 also lacks vp9/encoder/x86/*avx2.c, and since I have no AVX2-capable CPU, I can't test this.
Attachment #8482755 - Attachment is obsolete: true
Attachment #8482755 - Flags: review?(ted)
(In reply to xunxun from comment #6)

> I found that the Mozilla in-tree libvpx 1.3.0 also lacks
> vp9/encoder/x86/*avx2.c, and since I have no AVX2-capable CPU, I can't test this.

Those files are upstream only.
Attached patch Add vs2012+ configuration (obsolete) — Splinter Review
Attachment #8483995 - Flags: review?(giles)
Attachment #8483995 - Flags: review?(giles) → review+
Maybe we should add -arch:AVX2 to the CFLAGS for vp9_loopfilter_intrin_avx2?
Assignee: nobody → m_kato
(In reply to xunxun from comment #9)
> Maybe we should add -arch:AVX2 to the CFLAGS for vp9_loopfilter_intrin_avx2?

It is unnecessary.  Visual Studio can compile AVX2 intrinsics without that flag; gcc requires -mavx2 for them.
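
For illustration, the flag difference looks roughly like this (a sketch, not the real build invocations; vp9_loopfilter_intrin_avx2.c stands in for any of the AVX2 intrinsic files):

$ cl /c vp9_loopfilter_intrin_avx2.c
  # MSVC: AVX2 intrinsics from <immintrin.h> build without any /arch switch
$ gcc -mavx2 -c vp9_loopfilter_intrin_avx2.c
  # gcc: the ISA has to be enabled explicitly or the intrinsics won't compile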
Makefile.in changes require build peer review. Please request post-facto review or back this out.
Flags: needinfo?(m_kato)
Flags: needinfo?(giles)
OK. I will back it out soon.
Flags: needinfo?(m_kato)
Flags: needinfo?(giles)
Comment on attachment 8483995 [details] [diff] [review]
Add vs2012+ configuration

There is no way to detect the Visual Studio version from within yasm, so I want to pass the VC version to the yasm assembler to use as a condition.
Attachment #8483995 - Flags: review?(gps)
Also, DEFINES['xxx'] in moz.build is not applied to YASM's flags.
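
In other words, the version has to reach the assembler as a pre-defined macro on the yasm command line, something like the following (a sketch; the macro name and file name are placeholders, not what the patch actually uses):

$ yasm -f win32 -DVPX_MSVS_VERSION=11 -I media/libvpx/ some_file.asm
  # yasm cannot detect the compiler version by itself, and moz.build DEFINES[]
  # are not forwarded to the assembler, so the define has to go into the
  # assembler flags explicitly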
Comment on attachment 8483995 [details] [diff] [review]
Add vs2012+ configuration

Review of attachment 8483995 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the review latency - I was on holiday.
Attachment #8483995 - Flags: review?(gps) → review+
Now that it has review+, will the patch be committed to m-i/m-c?
The patch can be simplified a lot now that VS2013 is the minimum requirement.
This needs a rebase and re-review.
Attachment #8483995 - Attachment is obsolete: true
Comment on attachment 8565275 [details] [diff] [review]
Part 1. Always use VS2013 target

The minimum requirement for Windows MSVC builds is Visual Studio 2013, so we should use x86-win32-vs12 and x86_64-win64-vs12 as the targets.
Attachment #8565275 - Flags: review?(giles)
Attachment #8565304 - Attachment is obsolete: true
Comment on attachment 8565275 [details] [diff] [review]
Part 1. Always use VS2013 target

Review of attachment 8565275 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the clean update.
Attachment #8565275 - Flags: review?(giles) → review+
Comment on attachment 8565305 [details] [diff] [review]
Part 2. Build AVX code on all Windows builds

Works for me. Adding ted for build peer review.
Attachment #8565305 - Flags: review?(ted)
Attachment #8565305 - Flags: review?(giles)
Attachment #8565305 - Flags: review+
Comment on attachment 8565305 [details] [diff] [review]
Part 2. Build AVX code on all Windows builds

Review of attachment 8565305 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/libvpx/moz.build
@@ +42,5 @@
>          SOURCES += files['X86-64_ASM']
>  
>      # AVX2 only supported on
> +    # Darwin and Windows toolchains right now
> +    if CONFIG['OS_TARGET'] in ('Darwin', 'WINNT'):

You can actually use a set literal for this; not that it makes much difference, but that would be {'Darwin', 'WINNT'}.
Attachment #8565305 - Flags: review?(ted) → review+
To clarify, membership tests on set literals are in general faster than on tuples, since sets use a hash lookup.
(In reply to Ralph Giles (:rillian) from comment #28)
> To clarify, membership tests on set literals are in general faster than on
> tuples, since sets use a hash lookup.

That's a very common misconception that should be beaten to death. A hash lookup in a small hash table is generally slower than a linear lookup in a small array; in Python, it's an order of magnitude slower.
(In reply to Mike Hommey [:glandium] from comment #29)

> That's a very common misconception

$ python -m timeit "'hello' in ('hello', 'goodbye')"
10000000 loops, best of 3: 0.035 usec per loop
$ python -m timeit "'hello' in ('goodbye', 'hello')"
10000000 loops, best of 3: 0.045 usec per loop

$ python -m timeit "'hello' in {'hello', 'goodbye'}"
10000000 loops, best of 3: 0.115 usec per loop
$ python -m timeit "'hello' in {'goodbye', 'hello'}"
10000000 loops, best of 3: 0.115 usec per loop

So the tuple is 2-3 times faster in this test. In fact, there seems to be no effective crossover on my system. Worst case with 10000 strings ('benchmark' isn't in /usr/share/dict/words):

$ python -m timeit "'benchmark' in ( $(for word in $(head -10000 /usr/share/dict/words); do echo "'$word', "; done) 'benchmark')"
10000 loops, best of 3: 115 usec per loop
$ python -m timeit "'benchmark' in { $(for word in $(head -10000 /usr/share/dict/words); do echo "'$word', "; done) 'benchmark'}"
1000 loops, best of 3: 418 usec per loop

I guess the overhead of constructing the data structure dominates when there's only a single lookup.
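
One way to separate the construction cost from the lookup itself would be timeit's -s setup option (commands only; I haven't re-run the numbers):

$ python -m timeit -s "t = ('hello', 'goodbye')" "'hello' in t"
$ python -m timeit -s "s = {'hello', 'goodbye'}" "'hello' in s"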

> that should be beaten to death.

Can we discuss best coding practices using less violent metaphors, please?
Greenish on try.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/ba871c6324bf
https://hg.mozilla.org/mozilla-central/rev/8d7af888331b
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
Product: Core → Firefox Build System