Closed Bug 623129 Opened 9 years ago Closed 9 years ago

x86_64 NS_InvokeByIndex_P asm decl to reserve registers is not guaranteed to work as advertised

Categories

(Core :: XPCOM, defect)

x86_64
Linux
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla2.0b12

People

(Reporter: espindola, Assigned: espindola)

References

Details

Attachments

(1 file, 3 obsolete files)

Attached patch patch (obsolete) — Splinter Review
The inline asm used in NS_InvokeByIndex_P is unsafe according to the rules defined in http://gcc.gnu.org/onlinedocs/gcc/Local-Reg-Vars.html and is likely to not work in clang.
Attachment #501248 - Attachment is patch: true
Attachment #501248 - Attachment mime type: application/octet-stream → text/plain
Attachment #501248 - Flags: review?(benjamin)
Hardware: x86 → x86_64
Summary: unsafe and unportable inline assembly → x86_64 NS_InvokeByIndex_P asm decl to reserve registers does is not guaranteed to work as advertised
Summary: x86_64 NS_InvokeByIndex_P asm decl to reserve registers does is not guaranteed to work as advertised → x86_64 NS_InvokeByIndex_P asm decl to reserve registers is not guaranteed to work as advertised
Assignee: nobody → respindola
Attachment #501248 - Flags: review?(benjamin) → review?(mh+mozilla)
Comment on attachment 501248 [details] [diff] [review]
patch

I don't see how your patch makes any kind of difference regarding the url your give.
In that document it is written that

This option does not guarantee that GCC will generate code that has this variable in the register you specify at all times. You may not code an explicit reference to this register in the assembler instruction template part of an asm statement and assume it will always refer to this variable. However, using the variable as an asm operand guarantees that the specified register is used for the operand.

The old asm statement does nothing and only in that nothingness are the variables guaranteed to be bound to the selected registers. The current code therefore depends on undefined behavior.

We would like to warn about this (http://llvm.org/bugs/show_bug.cgi?id=8881) but it is not trivial to do so.
Attachment #501248 - Flags: review?(mh+mozilla) → review+
Attachment #501248 - Flags: approval2.0? → approval2.0+
I think this should be pushed to try before landing.
Keywords: checkin-needed
So, I am not 100% sure what to make of this. The try is

http://tbpl.mozilla.org/?tree=MozillaTry&rev=80d91338450e

It shows a "make check" failure on linux x86-64 that I cannot reproduce. I copied the exact mozconfig and make check finishes just fine.
Attached patch updated patch (obsolete) — Splinter Review
Attachment #501248 - Attachment is obsolete: true
The new attached patch passes the try servers:

http://tbpl.mozilla.org/?tree=MozillaTry&rev=0c60908ab6ec

the differences are:
*) the unused caller saved registers r10 and r11 to the clobber list.
*) Move the last load to the asm itself
*) Change the return to use the old style "=a" constraint. It looks like gcc only likes register variables on inputs.
*) Add volatile. It is not clear why this is needed. The only difference is the produced assembly is that without it gcc would move the call after the restore of callee saved register. Maybe the function we are calling doesn't save them? If that is the case I can try adding those registers to the clobber list instead.
(In reply to comment #6)
> The new attached patch passes the try servers:
> 
> http://tbpl.mozilla.org/?tree=MozillaTry&rev=0c60908ab6ec
> 
> the differences are:
> *) the unused caller saved registers r10 and r11 to the clobber list.

That's assuming these are the registers the compiler will be using, which you can't know for sure. I don't even see why you'd need these in the clobber list.

> *) Move the last load to the asm itself
> *) Change the return to use the old style "=a" constraint. It looks like gcc
> only likes register variables on inputs.
> *) Add volatile. It is not clear why this is needed. The only difference is the
> produced assembly is that without it gcc would move the call after the restore
> of callee saved register. Maybe the function we are calling doesn't save them?
> If that is the case I can try adding those registers to the clobber list
> instead.

Volatile makes the instruction unmoveable, maybe without volatile, gcc makes an unsmart optimization, and can't finish with a tail-call optimization. The methods called by NS_InvokeByIndex_P are standard c++ functions, so registers preservation is their job.
We are calling an unknown function, we have to assume that any caller saved registers can be modified. That is why I added r10 and r11 to the clobber list.

Volatile makes the statement unmovable. I think I found the real problem, gcc was scheduling the call after the cleanup. Declaring the call as using the stack pointer solves the problem.
Attached patch new patch (obsolete) — Splinter Review
Attachment #506264 - Attachment is obsolete: true
Attachment #506264 - Flags: review?(mh+mozilla)
I finally looked at which xmm registers were being used. Turn out that exactly the ones used by the ABI, so we don't need inline asm at all!

http://tbpl.mozilla.org/?tree=MozillaTry&rev=14b6d1703f74
Attachment #506413 - Attachment is obsolete: true
Attachment #506413 - Flags: review?(mh+mozilla)
Comment on attachment 507330 [details] [diff] [review]
Don't use inline asm

clever.
Attachment #507330 - Flags: review?(mh+mozilla) → review+
Gavin, Shaver, do you guys want to approve this patch, please?
Comment on attachment 507330 [details] [diff] [review]
Don't use inline asm

Any perf impact here?  (There are some performance tests somewhere under xpconnect.)

a=shaver if they come back clean.
Attachment #507330 - Flags: approval2.0? → approval2.0+
The performance tests are in js/src/xpconnect/tests? How do I run them? I don't expect any performance difference, but all I can be sure so far is that the new file is a bit smaller:

__TEXT	__DATA	__OBJC	others	dec	hex
933	0	0	0	933	3a5	xptcinvoke_x86_64_unix.o-new
1136	0	0	0	1136	470	xptcinvoke_x86_64_unix.o-old
If this now has r and a2.0, why isn't this marked with checkin-needed??
OS: Mac OS X → Windows 7
OS: Windows 7 → Linux
I couldn't figure out which performance tests Mike Shaver was asking for.
(In reply to comment #17)
> I couldn't figure out which performance tests Mike Shaver was asking for.

Talos runs, compared with <http://perf.snarkfest.net/compare-talos/>.  So you should push this patch to the try, enable talos runs in trychooser (and perhaps disable the rest of the test suites), then ping the buildduty to ask for several talos runs on your try push, and then use the compare-talos tool to see what performance impact, if any, your patch has.
Keywords: checkin-needed
The try is at

http://tbpl.mozilla.org/?tree=MozillaTry&rev=8b4fd7413fc6

How do I "ping the buildduty to ask for several talos runs"?
Clearing checkin-needed until the perf stuff is sorted out.
Keywords: checkin-needed
Sorry, the perf I was asking about was a invocation-time test that I can't find, and may not exist.  I'm OK with this landing on the assumption that fewer instructions will run faster, and backing out if not. (I doubt we'll see anything move; I wish this was a hot part of our code!)
Keywords: checkin-needed
http://hg.mozilla.org/mozilla-central/rev/236989b3a807
Status: NEW → RESOLVED
Closed: 9 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.0b12
You need to log in before you can comment on or make changes to this bug.