Closed
Bug 226619
Opened 21 years ago
Closed 21 years ago
Peculiar code in DivideBCBorderSize
Categories
(Core :: Layout: Tables, defect)
Core
Layout: Tables
Tracking
()
RESOLVED
FIXED
People
(Reporter: MatsPalmgren_bugz, Assigned: MatsPalmgren_bugz)
References
()
Details
Attachments
(1 file)
629 bytes,
patch
|
bzbarsky
:
review+
bzbarsky
:
superreview+
|
Details | Diff | Splinter Review |
From nsTableFrame.cpp:
static void
DivideBCBorderSize(nscoord aPixelSize,
nscoord& aSmallHalf,
nscoord& aLargeHalf)
{
aSmallHalf = aPixelSize / 2;
aLargeHalf = ((aSmallHalf + aSmallHalf) < aPixelSize) ? aSmallHalf + 1 :
aSmallHalf;
}
I think it would be more readable (and faster) as:
aSmallHalf = aPixelSize / 2;
aLargeHalf = aPixelSize - aSmallHalf;
sr=dbaron
Assignee | ||
Comment 2•21 years ago
|
||
![]() |
||
Comment 3•21 years ago
|
||
Comment on attachment 136205 [details] [diff] [review]
Patch rev. 1
r=bzbarsky, marking the sr=dbaron.
Would be good to not lose this before 1.7a opens...
Attachment #136205 -
Flags: superreview+
Attachment #136205 -
Flags: review+
fix checked in
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•21 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Updated•21 years ago
|
Assignee: table → mats.palmgren
Status: REOPENED → NEW
Assignee | ||
Updated•21 years ago
|
Status: NEW → RESOLVED
Closed: 21 years ago → 21 years ago
Resolution: --- → FIXED
Comment 5•21 years ago
|
||
Why not convert this function into a macro (or at least an 'inline') call?
In most of the cases only aSmallHalf or aLargeHalf is used:
2770 DivideBCBorderSize(propData->mTopBorderWidth, smallHalf, largeHalf);
2771 border.top += NSToCoordRound(p2t * (float)smallHalf);
2772
2773 DivideBCBorderSize(propData->mRightBorderWidth, smallHalf, largeHalf);
2774 border.right += NSToCoordRound(p2t * (float)largeHalf);
Why not:
#define DIV_SMALL(aPixelSize ) ((aPixelSize) / 2)
#define DIV_LARGE(aPixelSize ) ((aPixelSize) - DIV_SMALL(aPixelSize) )
This prevents function calls, and the code overhead of the function itself.
The DIV_SMALL compiles to one instruction, the DIV_LARGE to two cpu instructions
which is still way less than a function call with three arguments.
Comment 6•20 years ago
|
||
On -O2 -GL builds, largeHalf gets optimized out if it isn't needed. But we
don't compile that way on Windows so Inline has merit. I think that
aSmallHalf = aPixelSize >> 1;
aLargeHalf = aSmallHalf + (aPixelSize & 1);
is better code as you don't have the baggage associated with division. The
code generated (-O2 -GL) with the divide and subtract is a bit odd:
movzx edi, BYTE PTR [ecx+17]
....
mov eax, edi (clock 1)
cdq (clock 2)
sub eax, edx (clock 3)
sar eax, 1 (clock 4)
sub edi, eax (clock 5)
The shift+add should look like:
movzx edi, BYTE PTR [ecx+17]
....
mov eax, edi (clock 1)
sar eax, 1 (clock 2)
and edi, 1 (clock 2)
add eax, edi (clock 3)
The first can't do instruction parallelism due to dependencies from instruction
to instruction. The latter can do parallelism because instructions 2 and 3
aren't dependent.
(In reply to comment #5)
> Why not convert this function into a macro (or at least an 'inline') call?
>
> In most of the cases only aSmallHalf or aLargeHalf is used:
> 2770 DivideBCBorderSize(propData->mTopBorderWidth, smallHalf, largeHalf);
> 2771 border.top += NSToCoordRound(p2t * (float)smallHalf);
> 2772
> 2773 DivideBCBorderSize(propData->mRightBorderWidth, smallHalf, largeHalf);
> 2774 border.right += NSToCoordRound(p2t * (float)largeHalf);
>
> Why not:
> #define DIV_SMALL(aPixelSize ) ((aPixelSize) / 2)
> #define DIV_LARGE(aPixelSize ) ((aPixelSize) - DIV_SMALL(aPixelSize) )
>
> This prevents function calls, and the code overhead of the function itself.
> The DIV_SMALL compiles to one instruction, the DIV_LARGE to two cpu instructions
> which is still way less than a function call with three arguments.
Comment 7•20 years ago
|
||
Summary of some testing:
No changes to the code let to 10 instructions, 2 calls (excluding the float
calculations for the following code:
nscoord smallHalf, largeHalf;
DivideBCBorderSize(propData->mTopBorderWidth, smallHalf, largeHalf);
border.top += NSToCoordRound(p2t * (float)smallHalf);
DivideBCBorderSize(propData->mRightBorderWidth, smallHalf, largeHalf);
border.right += NSToCoordRound(p2t * (float)largeHalf);
Variations tested:
Added inline:
15 instr, 0 calls (excl. fmul till __ftol)
With 'inline', and the smallHalf=x>>1, smallHalf+x&1 trick:
14 instructions, 0 calls (excl. fmul till __ftol)
With macros DIV_SMALL/DIV_LARGE:
#define DIV_SMALL(aPixelSize ) ((aPixelSize) >> 1)
#define DIV_LARGE(aPixelSize ) ((aPixelSize) - DIV_SMALL(aPixelSize) )
Code becomes:
border.top += NSToCoordRound(p2t *
(float)DIV_SMALL(propData->mTopBorderWidth));
border.right += NSToCoordRound(p2t *
(float)DIV_LARGE(propData->mRightBorderWidth));
13 instructions, 0 calls (excl. fmul till __ftol)
Alternative Macro's:
#define DIV_SMALL(aPixelSize ) ((aPixelSize) >> 1)
#define DIV_LARGE(aPixelSize ) (DIV_SMALL(aPixelSize) + ((aPixelSize)&1) )
14 instructions, 0 calls (excl. fmul till __ftol)
Probably the easiest change with the most results is to add 'inline' to the
DivideBCBorderSize function. Changing the calculations as per last comment,
results then in 14 instructions.
(Using macros one could further reduce to 13, but than means changes over a lot
of places (in one file however)).
So 'inline' please, less code per call and overall, and faster as well...
You need to log in
before you can comment on or make changes to this bug.
Description
•