Last Comment Bug 265084 - Need to deal with integer overflows in reflow (consider making nscoords be floats)
: Need to deal with integer overflows in reflow (consider making nscoords be fl...
Status: NEW
:
Product: Core
Classification: Components
Component: Layout (show other bugs)
: Trunk
: All All
: -- normal with 7 votes (vote)
: ---
Assigned To: Nobody; OK to take it and work on it
:
Mentors:
: 183986 748257 (view as bug list)
Depends on:
Blocks: 293940 391879 411319 552412 575011 265027 266360 266445 323381
  Show dependency treegraph
 
Reported: 2004-10-19 08:32 PDT by Robert O'Callahan (:roc) (email my personal email if necessary)
Modified: 2016-08-19 20:41 PDT (History)
37 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
prototype nscoord implementation with test code (6.75 KB, patch)
2004-10-20 23:35 PDT, Robert O'Callahan (:roc) (email my personal email if necessary)
no flags Details | Diff | Splinter Review
revised testcase (7.28 KB, patch)
2004-10-21 07:46 PDT, Robert O'Callahan (:roc) (email my personal email if necessary)
no flags Details | Diff | Splinter Review
starting point (18.69 KB, patch)
2004-10-29 16:11 PDT, Robert O'Callahan (:roc) (email my personal email if necessary)
bzbarsky: review+
bzbarsky: superreview+
Details | Diff | Splinter Review
fix NSTwipsToIntPixels (930 bytes, patch)
2004-11-02 19:30 PST, Robert O'Callahan (:roc) (email my personal email if necessary)
bzbarsky: review+
bzbarsky: superreview+
Details | Diff | Splinter Review
remove nscoord from imglib interfaces and implementation (43.20 KB, patch)
2004-11-07 20:52 PST, Robert O'Callahan (:roc) (email my personal email if necessary)
tor: review+
tor: superreview+
Details | Diff | Splinter Review
quiet compiler (492 bytes, patch)
2004-11-08 18:59 PST, timeless
bzbarsky: review+
bzbarsky: superreview+
Details | Diff | Splinter Review
A stab at what might be needed (280.11 KB, application/x-gzip)
2007-03-28 18:14 PDT, Jeremy Lea
no flags Details

Description Robert O'Callahan (:roc) (email my personal email if necessary) 2004-10-19 08:32:35 PDT
I basically haven't worried integer overflows in reflow, on the grounds that
it's hard to do anything right, but it looks like we might have to. (I thought
we already had a bug on this but I can't find it.)

One idea is to make nscoord a class with methods and operators, and have it do
internal overflow checking. I don't like this because it would eat performance
and codesize. It also introduces potential failures everywhere that we don't
know how to deal with, so it doesn't really solve the problem.

A better idea, I think, is to clamp incoming nscoord values from the style
system, and have reflow *in some places* use methods to add and subtract
coordinates while clamping them to some range, say +/- 2^28 twips.
Comment 1 Bernd 2004-10-19 08:42:57 PDT
what I like about this is that we could in debug mode catch NS_UNCONSTRAINEDSIZE
arithmetics
Comment 2 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-10-19 09:39:19 PDT
That's a good point.

Maybe the class is the way to go, with default + and - operations out-of-line
functions that do clamping and are defined appropriately on NS_MAXSIZE. Then we
try to reduce the size/speed performance hit by replacing selected calls to + or
- with 'unsafe' functions that are inlined adds/subtracts. Or we could factor
out common idioms (e.g., discovering the offset between a frame and its
ancestor) into specialized optimized functions.
Comment 3 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-10-19 21:03:28 PDT
Does anyone recall why we don't just use floating point coordinates in layout? I
thought I knew a good reason, but I can't remember it right now. It would fix
the overflow problem rather cleanly.
Comment 4 Boris Zbarsky [:bz] (TPAC) 2004-10-19 21:46:49 PDT
Weren't there issues with round-off?  Not that we don't have plenty of those
with integer coordinates...
Comment 5 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-10-19 21:55:21 PDT
Assuming IEEE 'float' (a safe bet, I think), we get a 23 bit mantissa (not
including sign), which means we can represent +/- 2^23 twips precisely, so we'd
be precise up to about 500,000 pixels. I'm assuming we'd make sure our units
code continues to round everything to the nearest twip (so we *don't* say 1cm =
3210.33191371113302 twips or whatever, just 3210).

There may be residual issues with intermediate computations and so forth, but I
suspect fixing them is probably a lot easier than dealing with the possibility
of overflow everywhere.
Comment 6 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-10-19 21:56:27 PDT
In other words, if we do it right, we should be confident we're not changing
behaviour for pages with coordinates limited to the range of +/- 500K pixels.
Comment 7 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-10-19 22:02:36 PDT
There might be performance issues, but it could cut both ways:
-- integer operations are faster than floating point on old chips, but
-- not necessarily so on newer chips, plus
-- floating point uses different functional units, potentially freeing up
integer units to do other work in parallel
-- floating point values go in floating point registers, potentially relieving
register pressure in the general purpose registers
-- we might add floating point/integer conversions, which can be expensive, but
NSIntPixelsToTwips and related functions already do int->float->int conversion,
so we might reduce conversions as well
-- I have no idea about code size. On x86, FP instructions are slightly longer,
but we might gain a little from reducing spill code.
Comment 8 Boris Zbarsky [:bz] (TPAC) 2004-10-19 22:05:57 PDT
Have to be careful when dividing and multiplying.  Right now those naturally
round to the nearest twip...

In any case, the only other counter-argument I've heard when I've asked this
question is that floating-point math is slower.  I'm not sure how true that is
on current processors and how the cost would compare to that of doing overflow
checks.

(Note that I'm just repeating what I recall being told when I asked this very
question.... I don't know enough about this sort of issue myself.)
Comment 9 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-10-20 05:19:47 PDT
> Have to be careful when dividing and multiplying.  Right now those naturally
> round to the nearest twip...

Good point ... that might motivate wrapping the float in a class.

One advantage of using floats is that we could make NS_MAXSIZE be IEEE Positive
Infinity, which has exactly the properties we want (e.g., Infinity - N == Infinity).
Comment 10 Christian :Biesinger (don't email me, ping me on IRC) 2004-10-20 06:19:19 PDT
(In reply to comment #6)
> In other words, if we do it right, we should be confident we're not changing
> behaviour for pages with coordinates limited to the range of +/- 500K pixels.

when using double (PRFloat64?), that range would increase by a lot; although
that may not matter... I don't know the perf issues with that...
Comment 11 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-10-20 06:21:26 PDT
That would make us use a *lot* more memory.

One issue we'd run into with floats is what to do when you have to convert an
nscoord back into a float, which happens in many places. I guess this is where
overflow checking has to happen. A lot better than doing it everywhere in
layout, I think.
Comment 12 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-10-20 23:35:51 PDT
Created attachment 162805 [details] [diff] [review]
prototype nscoord implementation with test code
Comment 13 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-10-21 00:03:16 PDT
Here are some issues I ran into, with gcc 3.3 -O3:

-- it's not good at following optimizations through int-to-float conversion. For
example I originally had something like
PRBool operator<(nscoord a, PRInt32 b) { return a < nscoord(b); }
then I have test code that does "nscoord x; ... (x < 0)". This was being turned
into a runtime int-to-float conversion, which is baaaad. I found I could get
around this (in gcc 3.3 at least) by throwing out all my PRInt32 methods and
letting gcc use the float methods instead, doing the int-float conversion
implicitly for me. That made the conversion happen at compile time.
Unfortunately it created ambiguities while I still had an implicit
convert-to-PRInt32 operator in nscoord --- operator<(nscoord, float) and
operator<(PRInt32, int) are ambiguous when applied to (nscoord, int),
apparently. So I had to take out the implicit convert-to-PRInt32 operator.

-- I originally had the construct-nscoord-from-float code do truncf on the
incoming value to ensure it was never fractional. But gcc 3.3 was not doing
constant propagation through truncf so we ended up with a lot of runtime truncfs
on constants. So I made it an assertion.

-- Adding -funsafe-math-optimizations made a big difference, shortening
instruction paths by a few instructions. I don't know if we can safely turn this
on, either just in layout or globally. According to the gcc manual it may cause
gcc to link with something special that sets the state of the floating point
units (e.g., changing exception behaviour). I don't know what it actually does
on the platforms that matter to us.

-- nscoords can be passed by value (good) but return is always via an indirect
pointer (bad), But not too bad since we return points and rects a lot more than
single coords AFAIK, and they're already going indirectly.

-- No constant folding over nscoords is being performed. That probably doesn't
matter to us.

-- I tried tree-ssa. In gcc 3.3 the code it produces is entirely incorrect :-)

Here's some annotated code:
nscoord do_const1(nscoord x) {
  nscoord z(10);
  return z;
}
becomes
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %eax
        movl    $0x41200000, %ecx
        movl    %ecx, (%eax)
        leave
        ret     $4

void foobar1(nscoord x) {
  if (x > 0) {
    printf("ZOUNDS!\n");
  }
}
becomes
        pushl   %ebp
        movl    %esp, %ebp
        flds    8(%ebp)
        fcomps  .LC14
        fnstsw  %ax
        testb   $69, %ah
        je      .L41
        leave
        ret
.L41:
        movl    $.LC15, 8(%ebp)
        leave
        jmp     puts
.LC14:
        .long   0

nscoord fooadder(nscoord x, nscoord y, nscoord z) {
  return x + y + z;
}
becomes
        pushl   %ebp
        movl    %esp, %ebp
        flds    16(%ebp)
        fadds   12(%ebp)
        movl    8(%ebp), %eax
        fadds   20(%ebp)
        fstps   (%eax)
        leave
        ret     $4

nscoord total_blah(int count, nscoord* elems, PRBool* extras, nscoord delta) {
  nscoord sum;
  for (int i = 0; i < count; ++i) {
    sum += elems[i];
    if (extras[i]) {
      sum += delta;
    }
  }
  return sum;
}
becomes
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %esi
        pushl   %ebx
        xorl    %ecx, %ecx
        movl    12(%ebp), %ebx
        cmpl    %ebx, %ecx
        movl    8(%ebp), %eax
        movl    20(%ebp), %esi
        flds    24(%ebp)
        jge     .L66
        movl    16(%ebp), %edx
.L63:
        flds    (%edx)
        fadds   (%eax)
        cmpl    $0, (%esi,%ecx,4)
        fsts    (%eax)
        je      .L67
        fadd    %st(1), %st
        fstps   (%eax)
.L58:
        incl    %ecx
        addl    $4, %edx
        cmpl    %ebx, %ecx
        jl      .L63
.L66:
        fstp    %st(0)
        popl    %ebx
        popl    %esi
        leave
        ret     $4
.L67:
        fstp    %st(0)
        jmp     .L58

I will try gcc 3.4 soon.

In summary I think things are not *too* bad with these settings. It would be
very important to retry this experiment with VC++ of course. There's still the
question of how much work it would be to convert all the code over, especially
without an implicit coord-to-PRInt32 conversion. 
Comment 14 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-10-21 00:10:30 PDT
There is another problem in the last function: 'sum' is not kept in a register.
Comment 15 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-10-21 00:14:26 PDT
I wonder if gcc thinks that the function result pointer could be aliased with
'elems', so it insists on storing sum every time.
Comment 16 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-10-21 06:07:41 PDT
gcc 3.4.2:

-- -funsafe-math-optimizations is less needed because they use a different code
sequence for conditionals on floating point values, but it still helps
-- keeps 'sum' in a register in total_blah() (it still stores to the memory
location every time through the loop, but that hardly matters)

The other comments still hold, except
> -- No constant folding over nscoords is being performed.
I was wrong, even gcc 3.3 is doing this. Maybe not as aggressively as for
integers, it's hard to tell.
Comment 17 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-10-21 07:46:35 PDT
Created attachment 162852 [details] [diff] [review]
revised testcase

This thing actually runs. In gcc 3.4 the code in main() actually turns out
pretty well (it inlines total_blah). Accessing positive infinity in foobar3
looks okay, but I suspect could be better.

	pushl	%ebp
	movl	$2139095040, %edx
	movl	%esp, %ebp
	subl	$8, %esp
	movl	%edx, -4(%ebp)
	flds	8(%ebp)
	flds	-4(%ebp)
	fcompp
	fnstsw	%ax
	sahf
	je	.L64
	leave
	ret
.L64:
	movl	$.LC17, (%esp)
	call	puts
	leave
	ret
Comment 18 Mike Shaver (:shaver -- probably not reading bugmail closely) 2004-10-21 08:25:02 PDT
-funsafe-math-optimizations is known to break the JS engine, so we shouldn't be
turning it on globally, and possibly not even in layout unless we're careful to
reset the FPU mode before entering JS code.
Comment 19 neil@parkwaycc.co.uk 2004-10-21 09:34:49 PDT
There are some cases that multiply nscoords so you might consider putting in a
non-truncating version to multiply nscoords with each other (rather than with a
float whose decimal portion you keep but you want to truncate the result).
Comment 20 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-10-21 09:46:40 PDT
Does it actually change the FPU mode, on x86, or just the generated code? Where
would this be documented?
Comment 21 Mike Shaver (:shaver -- probably not reading bugmail closely) 2004-10-21 10:12:28 PDT
Sorry, I'm thinking of -ffast-math (see bug 261618 comment 4), which may not
reset the mode, indeed.
Comment 22 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-10-21 10:24:40 PDT
-ffast-math does several things include turn on -funsafe-math-optimizations. Hrm.
Comment 23 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-10-21 10:28:59 PDT
But it looks like in that bug, the problem was -ffinite-math-only, which we
definitely don't want here. I'll make a build with -funsafe-math-optimizations
and see how it goes.
Comment 24 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-10-21 14:06:10 PDT
It seems that at our current optimization levels, not all calls through the
overloaded nscoord operators get inlined. We could twiddle the code and
optimization levels to try to get closer to what we want, but that's very
fragile. Currently I think the nscoord wrapper class is not an option. I'm going
to have a look at just typedefing nscoord to float and dealing with the fallout.
Comment 25 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-10-29 16:11:30 PDT
Created attachment 163907 [details] [diff] [review]
starting point

I'd like to try doing this incrementally. Reserving nscoords for twips
coordinates and using PRInt32 for pixel counts is good cleanup anyway, IMHO.
I'd like to get nscoord separated enough from PRInt32 that eventually flipping
the switch between PRInt32 and float for nscoord is easy.

This patch is the first step. It introduces the switch and makes
nsUnitConversion and other facilities depend on the switch. It also introduces
nsIntRect, nsIntPoint, nsIntSize and nsIntMargin structures to be used for
rects, points, sizes and margins that are measured in pixels not twips.

After this I will submit patches converting various modules and interfaces to
use the right type for the right units. It's easy to see that as long as the
switch is commented off, changing between nsFoo and nsIntFoo, or PRInt32 and
nscoord, doesn't change any behavior.
Comment 26 Boris Zbarsky [:bz] (TPAC) 2004-11-01 21:17:51 PST
Comment on attachment 163907 [details] [diff] [review]
starting point

>Index: gfx/public/nsCoord.h

>+inline nscoord NSCoordMultiply(nscoord aCoord, float aVal) {
>+#ifdef NS_COORD_IS_FLOAT

Want to VERIFY_COORD here?  Likewise for the other methods that take nscoord
that you introduced?

>+inline PRInt32 NSCoordToInt(nscoord aCoord) {
>+  if (aCoord < -2147483647.5f) {
...
>+  } else if (aCoord > 2147483646.5f) {

Some documentation of what these magic values are would be good (can we get
them by casting PR_INT32_MIN & company to floats?).

Also, I didn't check that the numbers you have there have low enough precision
to be represented as a float without rounding.	I assume you checked that.

>Index: xpcom/ds/nsUnitConversion.h

> inline nscoord NSToCoordRound(float aValue)
> {
>+#ifdef NS_COORD_IS_FLOAT
>+  return floorf(aValue + ROUND_CONST_FLOAT);
>+#else

I guess we're assuming we can't use rint() or nearbyintf() or any thing like
that?

> inline PRInt32 NSToIntFloor(float aValue)
> {
>+#ifdef NS_COORD_IS_FLOAT
>+  return floorf(aValue);

Why does this depend on what nscoord is?  There are no nscoords in sight here.

> inline PRInt32 NSToIntCeil(float aValue)

Same.

> inline nscoord NSFloatPixelsToTwips(float aPixels, float aTwipsPerPixel)
> {
>+#ifdef NS_COORD_IS_FLOAT
>+  nscoord r = aPixels * aTwipsPerPixel;
>+#else
>+  nscoord r = NSToCoordRound(aPixels * aTwipsPerPixel);
>+#endif
>+  VERIFY_COORD(r);

So... if aPixels is some random float (which it could be, here), and
aTwipsPerPixel is some integer-valued float (which we enforce), you get some
random float in r.  There's no reason it'll be a valid coord (integer-valued
float).  I think you need NSToCoordRound() in both cases... or floorf, or
something.

r+sr=bzbarsky with these issues fixed.
Comment 27 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-11-02 04:09:24 PST
(In reply to comment #26)
> (From update of attachment 163907 [details] [diff] [review])
> >Index: gfx/public/nsCoord.h
> 
> >+inline nscoord NSCoordMultiply(nscoord aCoord, float aVal) {
> >+#ifdef NS_COORD_IS_FLOAT
> 
> Want to VERIFY_COORD here?  Likewise for the other methods that take nscoord
> that you introduced?

Sure.

> >+inline PRInt32 NSCoordToInt(nscoord aCoord) {
> >+  if (aCoord < -2147483647.5f) {
> ...
> >+  } else if (aCoord > 2147483646.5f) {
> 
> Some documentation of what these magic values are would be good (can we get
> them by casting PR_INT32_MIN & company to floats?).

Yeah, (float)PR_INT32_MAX + 0.5f and (float)PR_INT32_MIN - 0.5f

> Also, I didn't check that the numbers you have there have low enough precision
> to be represented as a float without rounding.	I assume you checked
> that.

They cannot be represented exactly as floats. So yeah, this could blow up in
some edges cases. Hmm. Turns out the largest 32-bit signed integer representable
as a float is 2147483520, and the smallest is -2147483648. So I'll turn it into
> and < tests for those two values.

> >Index: xpcom/ds/nsUnitConversion.h
> 
> > inline nscoord NSToCoordRound(float aValue)
> > {
> >+#ifdef NS_COORD_IS_FLOAT
> >+  return floorf(aValue + ROUND_CONST_FLOAT);
> >+#else
> 
> I guess we're assuming we can't use rint() or nearbyintf() or any thing like
> that?

I looked at a copy of VC++'s math.h, it's rather impoverished. They don't have
many ...f functions.

> > inline PRInt32 NSToIntFloor(float aValue)
> > {
> >+#ifdef NS_COORD_IS_FLOAT
> >+  return floorf(aValue);
> 
> Why does this depend on what nscoord is?  There are no nscoords in sight here.
> 
> > inline PRInt32 NSToIntCeil(float aValue)
> 
> Same.

Mistake. I'll remove it.

> > inline nscoord NSFloatPixelsToTwips(float aPixels, float aTwipsPerPixel)
> > {
> >+#ifdef NS_COORD_IS_FLOAT
> >+  nscoord r = aPixels * aTwipsPerPixel;
> >+#else
> >+  nscoord r = NSToCoordRound(aPixels * aTwipsPerPixel);
> >+#endif
> >+  VERIFY_COORD(r);
> 
> So... if aPixels is some random float (which it could be, here), and
> aTwipsPerPixel is some integer-valued float (which we enforce), you get some
> random float in r.  There's no reason it'll be a valid coord (integer-valued
> float).  I think you need NSToCoordRound() in both cases... or floorf, or
> something.

For some reason I assumed that aPixels was an integral number of pixels. I'll
NSToCoordRound in both cases.
Comment 28 Boris Zbarsky [:bz] (TPAC) 2004-11-02 08:30:33 PST
> So I'll turn it into > and < tests for those two values.

Sounds good, with a comment about why those values were chosen.
Comment 29 Nicholas Allen 2004-11-02 09:19:10 PST
Is there a reason to allow coords to go out to 2^32ish given that roundoff
errors start happening past 2^24?
Comment 30 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-11-02 09:33:17 PST
We certainly want to allow nscoords to be any size. That's the main point of
moving to floats. We might get underflow errors in some situations of the form A
+ B == A, but probably the only damage they will cause is possible infinite
loops in some algorithms, which we can detect and prevent quite easily ... much
more easily than doing integer overflow checking everywhere.

The only time you should need to convert an nscoord to an integer is when you're
converting to a device pixel value. So actually we should make the
twips-to-ints/points conversions use NSCoordToInt and there should be few or no
other callers to NSCoordToInt. In almost all cases gigantic pixel offsets will
either be clipped to some known bound or run into some implementation-specific
limit on the size of widgets or whatever.
Comment 31 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-11-02 18:50:39 PST
checked in.
Comment 32 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-11-02 19:30:29 PST
Created attachment 164342 [details] [diff] [review]
fix NSTwipsToIntPixels

NSTwipsToIntPixels needs to go through NSCoordToInt to make sure out-of-bounds
values become the correct integer values. This doesn't change the behaviour for
PRInt32 nscoords.
Comment 33 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-11-02 19:32:47 PST
Comment on attachment 164342 [details] [diff] [review]
fix NSTwipsToIntPixels

minor fixup in nsUnitConversion
Comment 34 Bernd 2004-11-02 21:17:02 PST
Robert will
http://lxr.mozilla.org/seamonkey/source/layout/html/table/src/nsTableFrame.cpp#314
use your magic code?
Comment 35 Boris Zbarsky [:bz] (TPAC) 2004-11-02 21:18:18 PST
Comment on attachment 164342 [details] [diff] [review]
fix NSTwipsToIntPixels

r+sr=bzbarsky
Comment 36 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-11-02 21:57:36 PST
bernd: that should continue to just work
Comment 37 Hixie (not reading bugmail) 2004-11-04 21:18:00 PST
(What's the relationship between this and the pixel bug? bug 177805.)
Comment 38 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-11-05 05:33:17 PST
They're orthogonal, although the cleanup following here may make that bug easier
to fix.
Comment 39 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-11-07 20:52:24 PST
Created attachment 165092 [details] [diff] [review]
remove nscoord from imglib interfaces and implementation

libpr0n code always deals with raw pixel counts, because that's what images are
about. Make that explicit by converting nscoord and nscoord-based structures to
PRInt32.
Comment 40 timeless 2004-11-08 18:59:25 PST
Created attachment 165232 [details] [diff] [review]
quiet compiler

the current NSCoordToFloat causes this:
r:\mozilla\rel-i586-pc-msvc\dist\include\gfx\nsCoord.h(153) : warning C4244:
'return' : conversion from 'nscoord' to 'float', possible loss of data
for each and every file that includes nsCoord.h. This is very annoying.
Comment 41 Boris Zbarsky [:bz] (TPAC) 2004-11-08 20:54:18 PST
roc, I assume you'll fix the non-imagelib implementations of imgIDecoderObserver
and imgIContainerObserver to use nsIntRect in a separate patch?
Comment 42 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-11-09 11:54:08 PST
yeah
Comment 43 tor 2004-11-11 16:02:27 PST
Comment on attachment 165092 [details] [diff] [review]
remove nscoord from imglib interfaces and implementation

Since you're not making binary compatiblity changes, the UUIDs for
the interfaces can remain the same.  Once nscoord is changed, though,
everything using that type or derived types will need to change id.

It appears that gfxImageFrame.{h,cpp} were missed in this sweep.
Comment 44 Robert O'Callahan (:roc) (email my personal email if necessary) 2004-11-14 20:21:12 PST
Checked in patch 165092
Comment 45 Sergei Dolgov 2005-01-21 12:01:13 PST
Sorry, for lameness, but here is the question:
can we get some additional benefit from those changes, if platform uses float
natively?
(In BeOS, for example, all drawing happens if floating coordinate system, all
native bounds are reported in float format too. So in the port permanent
coordinate transformation from float to int and back is happening at each step)
Comment 46 Robert O'Callahan (:roc) (email my personal email if necessary) 2005-03-21 15:01:04 PST
Yes. Cairo uses floats natively too. I think we'll get some benefit by avoiding
integer-float conversions (which are relatively expensive on modern processors).
Comment 47 timeless 2005-06-01 16:22:49 PDT
Comment on attachment 165232 [details] [diff] [review]
quiet compiler

2004-11-08 20:57:
mozilla/gfx/public/nsCoord.h	1.7
Comment 48 Jeremy Lea 2007-03-28 18:14:47 PDT
Created attachment 259974 [details]
A stab at what might be needed

I've been watching my computer compute a lot in the last while, so I thought I would take a stab at looking into this bug...

The attachment is a rather large patch which is what is needed to get the TRUNK building with a struct instead of a typedef (sorry I had to gzip it).  It's a bit larger than it needs to be since I made the constructor explicit, so it needs lots of "nscoord(0)".  Anyway - these are the issues and things which I found:

1. You can't put a struct (with a constructor) in a union.  There are a few places where this is being done to save memory.  There's also one place where the nscoord is being squeezed into 29bits, which wont work for a float.

2. There are a number of different constants being used for invalid coords.  It would be best to define PR_INT32_MIN as NSCOORD_INVALID and use that instead of -1, 0xdeadbeef, NSCOORD_MIN and a few others.  There are also a few other magic values, including one in an enum, which doesn't work as a struct.

3. The patch has lots of rounding problems.  Particularly, I replaced all Coord/2 calls with NSCoordDivide(Coord,2), which rounds...  But this does indictate a number of places where there are problems and wrong assumptions.  For floats the divisions are a problem if you want to keep nscoords as integers.  I made a few mistakes also when I was starting out.  The next thing I plan on doing is going over the patch and fixing all of the rounding.

4. A few more helper functions could really be useful.  Particularly ones which convert a AppUnit nsRect into a Pixel nsIntRect, and back.  Also for nsSize, nsMargin, nsPoint, etc.  Other ones are a float ratio, a multiply/divide scale fuction, and functions to snap to pixel boundaries.  I'll work on those while looking at the rounding.

5. There are lots of functions which take 4 separate args when they could take a rect, or two for a point/size.

6. I found a few bugs in coord transforms.  I'll open separate bugs on those since the fixes are fairly obvious.

7. The mCursor bounds seems to be used as both AppUnits and pixels?  I've not finished digging into this.  But this is a great place to hide bugs...

8. The Thebes region code should be rewritten to use cairo/pixman regions?

9. A seperate type is needed to for integer device pixels.  This is so conversions from CSS pixels to device pixels are ensured.  This also means gfxIntRect, etc.

roc, I hope you don't mind the bug spam.
  -Jeremy
Comment 49 Robert O'Callahan (:roc) (email my personal email if necessary) 2007-03-28 20:03:23 PDT
WOW!!! This is awesome!

I haven't looked at it all, but here's a couple of questions:

-    nsRect   windowRect(aX, aY, aWidth, aHeight);
+    nsRect   windowRect = nsRect(nscoord(aX), nscoord(aY), nscoord(aWidth), nscoord
(aHeight));

Why have you made changes like this --- converting direct construction to copy construction?

The big question is this: have you verified that with reasonable optimization levels, say -O2 on gcc, the code generated by your nscoord is at least as good as what we currently have? In particular if nscoord values aren't placed in registers we're dead in the water. See the comments above in this bug. We would also need to test MSVC++, although I can do that for you if necessary.

If that works out, I'm gung-ho about this patch. A patch of this size will rot fast so I'll try to make this process go as fast as I can.

> The attachment is a rather large patch which is what is needed to get the TRUNK
> building with a struct instead of a typedef (sorry I had to gzip it).  It's a
> bit larger than it needs to be since I made the constructor explicit, so it
> needs lots of "nscoord(0)".

You *might* be able to hack something up such as

  nscoord& operator=(struct __something_we_never_define* aDummy) {
     _v = 0;
     return *this;
  }

Hopefully that means you can write "nscoord x = 0;" but not "nscoord x = 1;".

I think it was a good call to not have a general implicit conversion.

> Anyway - these are the issues and things which I found:
> 
> 1. You can't put a struct (with a constructor) in a union.  There are a few
> places where this is being done to save memory.  There's also one place where
> the nscoord is being squeezed into 29bits, which wont work for a float.

OK, we should probably just make those to PRUint32 and convert backwards and forwards.

> 2. There are a number of different constants being used for invalid coords.  It
> would be best to define PR_INT32_MIN as NSCOORD_INVALID and use that instead of
> -1, 0xdeadbeef, NSCOORD_MIN and a few others.  There are also a few other magic
> values, including one in an enum, which doesn't work as a struct.

OK, that sounds fixable.

> 3. The patch has lots of rounding problems.  Particularly, I replaced all
> Coord/2 calls with NSCoordDivide(Coord,2), which rounds...

We probably want a truncating division instead, to preserve existing behaviour.

> But this does
> indictate a number of places where there are problems and wrong assumptions. 
> For floats the divisions are a problem if you want to keep nscoords as
> integers.  I made a few mistakes also when I was starting out.  The next thing
> I plan on doing is going over the patch and fixing all of the rounding.

It'd be best to not change rounding behaviour in this patch --- it's already big enough :-). Just preserve it and if you think the behaviour might be wrong, add an XXX comment --- say "XXX ROUNDING" --- that we can search on later.

> 4. A few more helper functions could really be useful.  Particularly ones which
> convert a AppUnit nsRect into a Pixel nsIntRect, and back.  Also for nsSize,
> nsMargin, nsPoint, etc.  Other ones are a float ratio, a multiply/divide scale
> fuction, and functions to snap to pixel boundaries.  I'll work on those while
> looking at the rounding.

Yeah, that sounds great if it shrinks the size of this patch. If you find that it isn't shrinking the size of this patch, don't do it now.

> 5. There are lots of functions which take 4 separate args when they could take
> a rect, or two for a point/size.

Ditto. Also, if there are changes of that sort that we can make right away, that would simplify this patch, you could split them off and do them now. Also, this patch changes a bunch of things from nsSize/Rect/Point to nsIntSize/Rect/Point --- we could split those out and land them early to reduce the size of the patch.

> 6. I found a few bugs in coord transforms.  I'll open separate bugs on those
> since the fixes are fairly obvious.

Great

> 7. The mCursor bounds seems to be used as both AppUnits and pixels?  I've not
> finished digging into this.  But this is a great place to hide bugs...

Yeah, which is why I'd like to have this patch.

> 8. The Thebes region code should be rewritten to use cairo/pixman regions?

I don't think so --- pixman regions are limited to 16-bit coordinates, which is really lame. Anyway, nsIRegion and its implementations are very little used in the current codebase, and with a bit of effort we could make them go away completely, so we should not spend unnecessary effort on them.

> 9. A seperate type is needed to for integer device pixels.  This is so
> conversions from CSS pixels to device pixels are ensured.  This also means
> gfxIntRect, etc.
> 
> roc, I hope you don't mind the bug spam.

I'd like to receive this kind of "bug spam" every day :-).

Even if we don't switch to floating-point layout, having this struct would be very useful for catching/debugging integer overflows, which is becoming more important.
Comment 50 neil@parkwaycc.co.uk 2007-03-29 00:28:47 PDT
(In reply to comment #49)
> > The attachment is a rather large patch which is what is needed to get the TRUNK
> > building with a struct instead of a typedef (sorry I had to gzip it).  It's a
> > bit larger than it needs to be since I made the constructor explicit, so it
> > needs lots of "nscoord(0)".
> 
> You *might* be able to hack something up such as
> 
>   nscoord& operator=(struct __something_we_never_define* aDummy) {
>      _v = 0;
>      return *this;
>   }
> 
> Hopefully that means you can write "nscoord x = 0;" but not "nscoord x = 1;".

No, that's an explicit constructor.

Without reading the patch I'm guessing that Jeremy's referring to passing a number as a parameter to a method that expects an nscoord. If you write a constructor along the lines of your assignment operator then you would be able to write foo(0) instead of foo(nscoord(0)).
Comment 51 Jeremy Lea 2007-03-30 15:09:17 PDT
One more thing I forgot to mention before.  This patch would bitrot a lot of the debug code, because you can't pass a struct to printf...  I've converted all of the current debug code, but I'm not going to try enable every possible debug option and fix everyone's code...

(In reply to comment #49)
> Why have you made changes like this --- converting direct construction to copy
> construction?

Mostly because I fixed the first compile error, then found that the function signature was wrong, then fixed it, then didn't fix the first fix right... ;-)

> The big question is this: have you verified that with reasonable optimization
> levels, say -O2 on gcc, the code generated by your nscoord is at least as good
> as what we currently have? In particular if nscoord values aren't placed in
> registers we're dead in the water. See the comments above in this bug. We would
> also need to test MSVC++, although I can do that for you if necessary.

I haven't tested this at all...  My first priority has been to get it to compile, and to try to work out the various ins and outs of whats being done where, and to look at some of the rounding issues.

However, if you really wanted to be evil, then the simple solution here is to only use the struct in the debug case, and put in a bunch of asserts about not overflowing, or using unconstrained sizes and those sorts of things.  But use a typedef for the optimized builds.  Then throw lots of big coords and other junk at a debug build via the test suit, and put some real checks in at the places where coords can actually overflow, since these are probably limited.  I suspect the list consists of CSS->appunit conversions, long lines of text, and some places where inner boxes are added up to get the top width (tables and frames).  The various paths should show up quickly in assert backtraces, if a quick look through bugzilla doesn't show them already...

> If that works out, I'm gung-ho about this patch. A patch of this size will rot
> fast so I'll try to make this process go as fast as I can.

I'm happy to keep this up to date.  It's a build that I'm not using and it's on a machine that is mostly idle...  And I'll have some time to kill while I wait for some big finite element models to run...

>   nscoord& operator=(struct __something_we_never_define* aDummy) {
>      _v = 0;
>      return *this;
>   }

This works great!  Thanks for the tip.

> > 1. You can't put a struct (with a constructor) in a union.  There are a few
> > places where this is being done to save memory.  There's also one place where
> > the nscoord is being squeezed into 29bits, which wont work for a float.
> 
> OK, we should probably just make those to PRUint32 and convert backwards and
> forwards.

This would be another advantage to using the evil trick above.  You could use the typedef in a union, or have it be a struct for debug builds, where memory isn't an issue.  There are four of these.  Two in the style system, where they appear to be memory savers (nsROCSSPrimativeValue and nsStyleStruct), one in the table code (BCVerticalSeg), although that's not really saving much, and one introduced recently in the the table computations which appears to just be saving stack space.

> Yeah, that sounds great if it shrinks the size of this patch. If you find that
> it isn't shrinking the size of this patch, don't do it now.

These will shrink it considerably, and make it easier to read...

> Ditto. Also, if there are changes of that sort that we can make right away,
> that would simplify this patch, you could split them off and do them now. Also,
> this patch changes a bunch of things from nsSize/Rect/Point to
> nsIntSize/Rect/Point --- we could split those out and land them early to reduce
> the size of the patch.

I think that this probably the best place to start with landing this.  I'll go through my big patch and make up some smaller patches which just have these changes, which at the moment are noops, and then they can be landed before any playing with the coordinate stuff.

Regards,
  -Jeremy
Comment 52 Robert O'Callahan (:roc) (email my personal email if necessary) 2007-03-30 17:34:31 PDT
> However, if you really wanted to be evil, then the simple solution here is to
> only use the struct in the debug case

I suppose that's an option. It would be nice to avoid though, because there's a risk of making debug build behaviour diverge from opt build behaviour in subtle ways. (That risk is already present for other reasons, of course, so this is not a deal-breaker.)
Comment 53 Jeremy Lea 2007-04-04 16:38:06 PDT
Sorry I've been so quite on this...  I've been working on a big issue I uncovered by driving a few more of the forced conversions down.  The problem is that all  of the SVG code uses the nsRect mRect which is a member of nsIFrame, and the nsIFrame methods - but it uses pixels in all of its code.  This is really unhealthy.  I'm trying to figure out what to do about this...  Should the SVG code convert all of these to nscoords or should the SVG code be made to use a nsIntRect and leave the nsFrame interfaces alone?  I'm not sure.  The first means lots of conversions.  The second means that the nsIFrame interfaces wont be correct (well less correct).

I've tracked down the one issue with cursor coordinates.  I'll try to work on a separate patch for that soon.  The coordinates are being placed into an nsRect in pixels and fixed up later in the event handler.

Regards,
  -Jeremy
Comment 54 Robert O'Callahan (:roc) (email my personal email if necessary) 2007-04-04 17:20:53 PDT
An ugly but maybe good-enough solution might be to define a method in SVG frames like

    nsIntRect& GetSVGRect() {
        return *NS_REINTERPRET_CAST(nsIntRect*, &mRect);
    }

and use that everywhere in SVG
Comment 55 Robert O'Callahan (:roc) (email my personal email if necessary) 2007-04-04 17:22:02 PDT
tor, see last couple of comments.
Comment 56 tor 2007-04-05 15:05:17 PDT
I'm not sure I see the problem - SVG doesn't currently participate in the normal reflow mechanism, so we reused the mRect for our purposes on SVG leaves (and svg:fO) instead of having our own rect and wasting memory.  With your patch it appears that nsRect can still store the data we need, and the number of conversions needed in the SVG code appear small at first glance.
Comment 57 Jeremy Lea 2007-04-09 16:32:01 PDT
(In reply to comment #56)
> I'm not sure I see the problem - SVG doesn't currently participate in the
> normal reflow mechanism, so we reused the mRect for our purposes on SVG leaves
> (and svg:fO) instead of having our own rect and wasting memory.  With your
> patch it appears that nsRect can still store the data we need, and the number
> of conversions needed in the SVG code appear small at first glance.
> 

If (when?) nscoord is changed to a float, then you might encounter problems with float<->int conversion (besides wasting CPU cycles).  Looking through the uses, the mRect is mostly passed to InvalidateRect, which converts it to app units and passes it to Invalidate.  So you can just do the conversion up front, and not abuse the mRect...
Comment 58 tor 2007-05-03 09:37:38 PDT
Bug 379616 switches SVG frames' use of mRect to app units.
Comment 59 Jesse Ruderman 2011-07-25 13:22:14 PDT
*** Bug 183986 has been marked as a duplicate of this bug. ***
Comment 60 Mats Palmgren (:mats) 2012-04-24 13:23:40 PDT
*** Bug 748257 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.