Closed
Bug 623129
Opened 12 years ago
Closed 12 years ago
x86_64 NS_InvokeByIndex_P asm decl to reserve registers is not guaranteed to work as advertised
Categories
(Core :: XPCOM, defect)
Tracking
()
RESOLVED
FIXED
mozilla2.0b12
People
(Reporter: espindola, Assigned: espindola)
References
Details
Attachments
(1 file, 3 obsolete files)
2.54 KB,
patch
|
glandium
:
review+
shaver
:
approval2.0+
|
Details | Diff | Splinter Review |
The inline asm used in NS_InvokeByIndex_P is unsafe according to the rules defined in http://gcc.gnu.org/onlinedocs/gcc/Local-Reg-Vars.html and is likely to not work in clang.
Assignee | ||
Updated•12 years ago
|
Attachment #501248 -
Attachment is patch: true
Attachment #501248 -
Attachment mime type: application/octet-stream → text/plain
Attachment #501248 -
Flags: review?(benjamin)
Assignee | ||
Updated•12 years ago
|
Blocks: clang-macosx
Hardware: x86 → x86_64
Summary: unsafe and unportable inline assembly → x86_64 NS_InvokeByIndex_P asm decl to reserve registers does is not guaranteed to work as advertised
Summary: x86_64 NS_InvokeByIndex_P asm decl to reserve registers does is not guaranteed to work as advertised → x86_64 NS_InvokeByIndex_P asm decl to reserve registers is not guaranteed to work as advertised
Updated•12 years ago
|
Assignee: nobody → respindola
Updated•12 years ago
|
Attachment #501248 -
Flags: review?(benjamin) → review?(mh+mozilla)
Comment 1•12 years ago
|
||
Comment on attachment 501248 [details] [diff] [review] patch I don't see how your patch makes any kind of difference regarding the url your give.
Assignee | ||
Comment 2•12 years ago
|
||
In that document it is written that This option does not guarantee that GCC will generate code that has this variable in the register you specify at all times. You may not code an explicit reference to this register in the assembler instruction template part of an asm statement and assume it will always refer to this variable. However, using the variable as an asm operand guarantees that the specified register is used for the operand. The old asm statement does nothing and only in that nothingness are the variables guaranteed to be bound to the selected registers. The current code therefore depends on undefined behavior. We would like to warn about this (http://llvm.org/bugs/show_bug.cgi?id=8881) but it is not trivial to do so.
Updated•12 years ago
|
Attachment #501248 -
Flags: review?(mh+mozilla) → review+
Assignee | ||
Updated•12 years ago
|
Attachment #501248 -
Flags: approval2.0?
Updated•12 years ago
|
Attachment #501248 -
Flags: approval2.0? → approval2.0+
Assignee | ||
Updated•12 years ago
|
Keywords: checkin-needed
Comment 3•12 years ago
|
||
I think this should be pushed to try before landing.
Keywords: checkin-needed
Assignee | ||
Comment 4•12 years ago
|
||
So, I am not 100% sure what to make of this. The try is http://tbpl.mozilla.org/?tree=MozillaTry&rev=80d91338450e It shows a "make check" failure on linux x86-64 that I cannot reproduce. I copied the exact mozconfig and make check finishes just fine.
Assignee | ||
Comment 5•12 years ago
|
||
Attachment #501248 -
Attachment is obsolete: true
Assignee | ||
Updated•12 years ago
|
Attachment #506264 -
Flags: review?(mh+mozilla)
Assignee | ||
Comment 6•12 years ago
|
||
The new attached patch passes the try servers: http://tbpl.mozilla.org/?tree=MozillaTry&rev=0c60908ab6ec the differences are: *) the unused caller saved registers r10 and r11 to the clobber list. *) Move the last load to the asm itself *) Change the return to use the old style "=a" constraint. It looks like gcc only likes register variables on inputs. *) Add volatile. It is not clear why this is needed. The only difference is the produced assembly is that without it gcc would move the call after the restore of callee saved register. Maybe the function we are calling doesn't save them? If that is the case I can try adding those registers to the clobber list instead.
Comment 7•12 years ago
|
||
(In reply to comment #6) > The new attached patch passes the try servers: > > http://tbpl.mozilla.org/?tree=MozillaTry&rev=0c60908ab6ec > > the differences are: > *) the unused caller saved registers r10 and r11 to the clobber list. That's assuming these are the registers the compiler will be using, which you can't know for sure. I don't even see why you'd need these in the clobber list. > *) Move the last load to the asm itself > *) Change the return to use the old style "=a" constraint. It looks like gcc > only likes register variables on inputs. > *) Add volatile. It is not clear why this is needed. The only difference is the > produced assembly is that without it gcc would move the call after the restore > of callee saved register. Maybe the function we are calling doesn't save them? > If that is the case I can try adding those registers to the clobber list > instead. Volatile makes the instruction unmoveable, maybe without volatile, gcc makes an unsmart optimization, and can't finish with a tail-call optimization. The methods called by NS_InvokeByIndex_P are standard c++ functions, so registers preservation is their job.
Assignee | ||
Comment 8•12 years ago
|
||
We are calling an unknown function, we have to assume that any caller saved registers can be modified. That is why I added r10 and r11 to the clobber list. Volatile makes the statement unmovable. I think I found the real problem, gcc was scheduling the call after the cleanup. Declaring the call as using the stack pointer solves the problem.
Assignee | ||
Comment 9•12 years ago
|
||
Attachment #506264 -
Attachment is obsolete: true
Attachment #506264 -
Flags: review?(mh+mozilla)
Assignee | ||
Comment 10•12 years ago
|
||
The corresponding try: http://tbpl.mozilla.org/?tree=MozillaTry&rev=a49a97e792b6
Assignee | ||
Updated•12 years ago
|
Attachment #506413 -
Flags: review?(mh+mozilla)
Assignee | ||
Comment 11•12 years ago
|
||
I finally looked at which xmm registers were being used. Turn out that exactly the ones used by the ABI, so we don't need inline asm at all! http://tbpl.mozilla.org/?tree=MozillaTry&rev=14b6d1703f74
Attachment #506413 -
Attachment is obsolete: true
Attachment #506413 -
Flags: review?(mh+mozilla)
Assignee | ||
Updated•12 years ago
|
Attachment #507330 -
Flags: review?(mh+mozilla)
Comment 12•12 years ago
|
||
Comment on attachment 507330 [details] [diff] [review] Don't use inline asm clever.
Attachment #507330 -
Flags: review?(mh+mozilla) → review+
Assignee | ||
Updated•12 years ago
|
Attachment #507330 -
Flags: approval2.0?
Comment 13•12 years ago
|
||
Gavin, Shaver, do you guys want to approve this patch, please?
Comment on attachment 507330 [details] [diff] [review] Don't use inline asm Any perf impact here? (There are some performance tests somewhere under xpconnect.) a=shaver if they come back clean.
Attachment #507330 -
Flags: approval2.0? → approval2.0+
Assignee | ||
Comment 15•12 years ago
|
||
The performance tests are in js/src/xpconnect/tests? How do I run them? I don't expect any performance difference, but all I can be sure so far is that the new file is a bit smaller: __TEXT __DATA __OBJC others dec hex 933 0 0 0 933 3a5 xptcinvoke_x86_64_unix.o-new 1136 0 0 0 1136 470 xptcinvoke_x86_64_unix.o-old
Comment 16•12 years ago
|
||
If this now has r and a2.0, why isn't this marked with checkin-needed??
OS: Mac OS X → Windows 7
Updated•12 years ago
|
OS: Windows 7 → Linux
Assignee | ||
Comment 17•12 years ago
|
||
I couldn't figure out which performance tests Mike Shaver was asking for.
Comment 18•12 years ago
|
||
(In reply to comment #17) > I couldn't figure out which performance tests Mike Shaver was asking for. Talos runs, compared with <http://perf.snarkfest.net/compare-talos/>. So you should push this patch to the try, enable talos runs in trychooser (and perhaps disable the rest of the test suites), then ping the buildduty to ask for several talos runs on your try push, and then use the compare-talos tool to see what performance impact, if any, your patch has.
Updated•12 years ago
|
Keywords: checkin-needed
Assignee | ||
Comment 19•12 years ago
|
||
The try is at http://tbpl.mozilla.org/?tree=MozillaTry&rev=8b4fd7413fc6 How do I "ping the buildduty to ask for several talos runs"?
Clearing checkin-needed until the perf stuff is sorted out.
Keywords: checkin-needed
Sorry, the perf I was asking about was a invocation-time test that I can't find, and may not exist. I'm OK with this landing on the assumption that fewer instructions will run faster, and backing out if not. (I doubt we'll see anything move; I wish this was a hot part of our code!)
Keywords: checkin-needed
Comment 22•12 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/236989b3a807
Status: NEW → RESOLVED
Closed: 12 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.0b12
You need to log in
before you can comment on or make changes to this bug.
Description
•