Closed Bug 529348 Opened 15 years ago Closed 15 years ago

Improve xptcinvoke on windows ce arm

Categories

(Core :: XPCOM, defect)

ARM
Windows CE
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
status1.9.2 --- beta5-fixed
fennec 1.0a4-wm+ ---

People

(Reporter: blassey, Assigned: blassey)

References

Details

(Keywords: mobile)

Attachments

(1 file, 1 obsolete file)

Attached patch patch (obsolete) — Splinter Review
This patch improves start up by 2.1s in my stop watch testing. First run is improved by 2.766 seconds (only did one first run with and without though, so less statistically reliable).
Attachment #412906 - Flags: review?(vladimir)
Does it still cause crashes with the JIT enabled on some devices?
(In reply to comment #1) > Does it still cause crashes with the JIT enabled on some devices? No, I figured that out. The problem seems to have been not restoring the permanent registers when exiting asmXPTC_InvokeByIndex, specifically r7. The patch now handles that correctly as far as I can tell.
Comment on attachment 412906 [details] [diff] [review] patch >+ mov r6, r0 >+ mov r7, r1 >+ mov r8, r2 >+ mov r9, r3 >+ mov r10, r4 this was for debugging, pretty sure it should be removed
I'm not convinced. You don't need to restore the permanent registers unless you actually use them; bear in mind that memory accesses are expensive, so don't stack anything unless you have to. ---- Also, in the same function, you replaced my "blx ip" with "mov lr, pc; mov pc, ip". Why is this? You would want to do this for compatibility with ARMv4(T), but we don't support that in the JIT anyway so go with the (faster) BLX variant. A-class processors will perform return stack optimizations _if_ you use a proper branch-with-link instruction (BL or BLX). If you use "mov pc, ip", you're losing performance. (Similarly, you have to return properly, but you have many more options there; popping off the stack directly into PC is fine.)
(In reply to comment #4) > I'm not convinced. You don't need to restore the permanent registers unless you > actually use them; bear in mind that memory accesses are expensive, so don't > stack anything unless you have to. > I had read (on the internet.... stupid internet) that stmdb instructions took the same amount of time no matter how many registers you were dumping to the stack, which is why I felt at liberty to dump everything. I wouldn't argue if you told me this was completely wrong. > Also, in the same function, you replaced my "blx ip" with "mov lr, pc; mov pc, > ip". Why is this? You would want to do this for compatibility with ARMv4(T), > but we don't support that in the JIT anyway so go with the (faster) BLX > variant. > > A-class processors will perform return stack optimizations _if_ you use a > proper branch-with-link instruction (BL or BLX). If you use "mov pc, ip", > you're losing performance. (Similarly, you have to return properly, but you > have many more options there; popping off the stack directly into PC is fine.) I had 3 reasons for switching from blx to the mov's. First, our winmo emulators default to ARMv4, so there is some value in keeping compatibility since you can still run mozilla code on them with the jit off. Second, I thought blx switched the processor to THUMB mode, and I'm not sure that we want to do that (could be completely wrong on that). Finally, I just have it in my mind that branches are expensive in general, apparently I was wrong about that as well.
We really don't want to use movs instead of blx, even ignoring the winmo emulator.. add some #ifdefs for that, if necessary. The perf cost of not using blx isn't worth it. blx /can/ switch the CPU to thumb mode, if the low bit of the target address is set; it won't be for any of our addresses. Also, you're right, branches are in general expensive, but "mov pc, ip" is also a brach, but one that the cpu can't predict as well, thus making it even more expensive :-)
(In reply to comment #5) > I had read (on the internet.... stupid internet) that stmdb instructions took > the same amount of time no matter how many registers you were dumping to the > stack, which is why I felt at liberty to dump everything. I wouldn't argue if > you told me this was completely wrong. For the record: The STM instruction is the same _size_ regardless of the number of registers you list; that may be what the internet was referring to. (Indeed, _all_ ARM instructions are 32 bits long.) However, every register you save (or load for LDM/POP) is still a memory access.
Attached patch patch v.2Splinter Review
This patch removes the extraneous mov's and only stores 3 registers (plus sp and lr) on the stack. It also uses blx.
Assignee: nobody → bugmail
Attachment #412906 - Attachment is obsolete: true
Attachment #413650 - Flags: review?(vladimir)
Attachment #412906 - Flags: review?(vladimir)
Comment on attachment 413650 [details] [diff] [review] patch v.2 Looks fine -- >+ stmdb sp!, {r4 - r6, r12, lr} ; we're using regisers 4, 5 and 6. save them typo -- "registers" >+ mov r6, r0 ; store that Maybe put 'that' in quotes like below and put something like (the target 'this') after for clarity?
Attachment #413650 - Flags: approval1.9.2?
pushed http://hg.mozilla.org/mozilla-central/rev/b2106fb015d3 Also, I got the wrong bug number in the checking comment. Wish there was a way to fix that.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
This didn't need approval to land?
Sam, this was blocking fennec, although I see now that the flag is cleared. Resetting it.
tracking-fennec: --- → 1.0a4-wm+
These bugs landed after b4 was cut. Moving flag out.
(In reply to comment #13) > Sam, this was blocking fennec, although I see now that the flag is cleared. > Resetting it. Thanks Brad. I didn't see it when I looked at the bug, which is why I asked. I was also wrong about this bug landing after b4. It made the cut.
It didn't actually make the cut, since we built off af0e2b9566cc rather than the tip.
Attachment #413650 - Flags: approval1.9.2?
Comment on attachment 413650 [details] [diff] [review] patch v.2 a192=beltzner
Attachment #413650 - Flags: approval1.9.2+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: