Note: There are a few cases of duplicates in user autocompletion which are being worked on.

Need to deal with integer overflows in reflow (consider making nscoords be floats)

NEW
Unassigned

Status

()

Core
Layout
13 years ago
11 months ago

People

(Reporter: roc, Unassigned)

Tracking

(Blocks: 5 bugs)

Trunk
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(6 attachments, 1 obsolete attachment)

I basically haven't worried integer overflows in reflow, on the grounds that
it's hard to do anything right, but it looks like we might have to. (I thought
we already had a bug on this but I can't find it.)

One idea is to make nscoord a class with methods and operators, and have it do
internal overflow checking. I don't like this because it would eat performance
and codesize. It also introduces potential failures everywhere that we don't
know how to deal with, so it doesn't really solve the problem.

A better idea, I think, is to clamp incoming nscoord values from the style
system, and have reflow *in some places* use methods to add and subtract
coordinates while clamping them to some range, say +/- 2^28 twips.

Comment 1

13 years ago
what I like about this is that we could in debug mode catch NS_UNCONSTRAINEDSIZE
arithmetics
That's a good point.

Maybe the class is the way to go, with default + and - operations out-of-line
functions that do clamping and are defined appropriately on NS_MAXSIZE. Then we
try to reduce the size/speed performance hit by replacing selected calls to + or
- with 'unsafe' functions that are inlined adds/subtracts. Or we could factor
out common idioms (e.g., discovering the offset between a frame and its
ancestor) into specialized optimized functions.
Does anyone recall why we don't just use floating point coordinates in layout? I
thought I knew a good reason, but I can't remember it right now. It would fix
the overflow problem rather cleanly.

Comment 4

13 years ago
Weren't there issues with round-off?  Not that we don't have plenty of those
with integer coordinates...
Assuming IEEE 'float' (a safe bet, I think), we get a 23 bit mantissa (not
including sign), which means we can represent +/- 2^23 twips precisely, so we'd
be precise up to about 500,000 pixels. I'm assuming we'd make sure our units
code continues to round everything to the nearest twip (so we *don't* say 1cm =
3210.33191371113302 twips or whatever, just 3210).

There may be residual issues with intermediate computations and so forth, but I
suspect fixing them is probably a lot easier than dealing with the possibility
of overflow everywhere.
In other words, if we do it right, we should be confident we're not changing
behaviour for pages with coordinates limited to the range of +/- 500K pixels.
There might be performance issues, but it could cut both ways:
-- integer operations are faster than floating point on old chips, but
-- not necessarily so on newer chips, plus
-- floating point uses different functional units, potentially freeing up
integer units to do other work in parallel
-- floating point values go in floating point registers, potentially relieving
register pressure in the general purpose registers
-- we might add floating point/integer conversions, which can be expensive, but
NSIntPixelsToTwips and related functions already do int->float->int conversion,
so we might reduce conversions as well
-- I have no idea about code size. On x86, FP instructions are slightly longer,
but we might gain a little from reducing spill code.

Comment 8

13 years ago
Have to be careful when dividing and multiplying.  Right now those naturally
round to the nearest twip...

In any case, the only other counter-argument I've heard when I've asked this
question is that floating-point math is slower.  I'm not sure how true that is
on current processors and how the cost would compare to that of doing overflow
checks.

(Note that I'm just repeating what I recall being told when I asked this very
question.... I don't know enough about this sort of issue myself.)
> Have to be careful when dividing and multiplying.  Right now those naturally
> round to the nearest twip...

Good point ... that might motivate wrapping the float in a class.

One advantage of using floats is that we could make NS_MAXSIZE be IEEE Positive
Infinity, which has exactly the properties we want (e.g., Infinity - N == Infinity).
(In reply to comment #6)
> In other words, if we do it right, we should be confident we're not changing
> behaviour for pages with coordinates limited to the range of +/- 500K pixels.

when using double (PRFloat64?), that range would increase by a lot; although
that may not matter... I don't know the perf issues with that...
That would make us use a *lot* more memory.

One issue we'd run into with floats is what to do when you have to convert an
nscoord back into a float, which happens in many places. I guess this is where
overflow checking has to happen. A lot better than doing it everywhere in
layout, I think.
Blocks: 265027
Created attachment 162805 [details] [diff] [review]
prototype nscoord implementation with test code
Here are some issues I ran into, with gcc 3.3 -O3:

-- it's not good at following optimizations through int-to-float conversion. For
example I originally had something like
PRBool operator<(nscoord a, PRInt32 b) { return a < nscoord(b); }
then I have test code that does "nscoord x; ... (x < 0)". This was being turned
into a runtime int-to-float conversion, which is baaaad. I found I could get
around this (in gcc 3.3 at least) by throwing out all my PRInt32 methods and
letting gcc use the float methods instead, doing the int-float conversion
implicitly for me. That made the conversion happen at compile time.
Unfortunately it created ambiguities while I still had an implicit
convert-to-PRInt32 operator in nscoord --- operator<(nscoord, float) and
operator<(PRInt32, int) are ambiguous when applied to (nscoord, int),
apparently. So I had to take out the implicit convert-to-PRInt32 operator.

-- I originally had the construct-nscoord-from-float code do truncf on the
incoming value to ensure it was never fractional. But gcc 3.3 was not doing
constant propagation through truncf so we ended up with a lot of runtime truncfs
on constants. So I made it an assertion.

-- Adding -funsafe-math-optimizations made a big difference, shortening
instruction paths by a few instructions. I don't know if we can safely turn this
on, either just in layout or globally. According to the gcc manual it may cause
gcc to link with something special that sets the state of the floating point
units (e.g., changing exception behaviour). I don't know what it actually does
on the platforms that matter to us.

-- nscoords can be passed by value (good) but return is always via an indirect
pointer (bad), But not too bad since we return points and rects a lot more than
single coords AFAIK, and they're already going indirectly.

-- No constant folding over nscoords is being performed. That probably doesn't
matter to us.

-- I tried tree-ssa. In gcc 3.3 the code it produces is entirely incorrect :-)

Here's some annotated code:
nscoord do_const1(nscoord x) {
  nscoord z(10);
  return z;
}
becomes
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %eax
        movl    $0x41200000, %ecx
        movl    %ecx, (%eax)
        leave
        ret     $4

void foobar1(nscoord x) {
  if (x > 0) {
    printf("ZOUNDS!\n");
  }
}
becomes
        pushl   %ebp
        movl    %esp, %ebp
        flds    8(%ebp)
        fcomps  .LC14
        fnstsw  %ax
        testb   $69, %ah
        je      .L41
        leave
        ret
.L41:
        movl    $.LC15, 8(%ebp)
        leave
        jmp     puts
.LC14:
        .long   0

nscoord fooadder(nscoord x, nscoord y, nscoord z) {
  return x + y + z;
}
becomes
        pushl   %ebp
        movl    %esp, %ebp
        flds    16(%ebp)
        fadds   12(%ebp)
        movl    8(%ebp), %eax
        fadds   20(%ebp)
        fstps   (%eax)
        leave
        ret     $4

nscoord total_blah(int count, nscoord* elems, PRBool* extras, nscoord delta) {
  nscoord sum;
  for (int i = 0; i < count; ++i) {
    sum += elems[i];
    if (extras[i]) {
      sum += delta;
    }
  }
  return sum;
}
becomes
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %esi
        pushl   %ebx
        xorl    %ecx, %ecx
        movl    12(%ebp), %ebx
        cmpl    %ebx, %ecx
        movl    8(%ebp), %eax
        movl    20(%ebp), %esi
        flds    24(%ebp)
        jge     .L66
        movl    16(%ebp), %edx
.L63:
        flds    (%edx)
        fadds   (%eax)
        cmpl    $0, (%esi,%ecx,4)
        fsts    (%eax)
        je      .L67
        fadd    %st(1), %st
        fstps   (%eax)
.L58:
        incl    %ecx
        addl    $4, %edx
        cmpl    %ebx, %ecx
        jl      .L63
.L66:
        fstp    %st(0)
        popl    %ebx
        popl    %esi
        leave
        ret     $4
.L67:
        fstp    %st(0)
        jmp     .L58

I will try gcc 3.4 soon.

In summary I think things are not *too* bad with these settings. It would be
very important to retry this experiment with VC++ of course. There's still the
question of how much work it would be to convert all the code over, especially
without an implicit coord-to-PRInt32 conversion. 
There is another problem in the last function: 'sum' is not kept in a register.
I wonder if gcc thinks that the function result pointer could be aliased with
'elems', so it insists on storing sum every time.
gcc 3.4.2:

-- -funsafe-math-optimizations is less needed because they use a different code
sequence for conditionals on floating point values, but it still helps
-- keeps 'sum' in a register in total_blah() (it still stores to the memory
location every time through the loop, but that hardly matters)

The other comments still hold, except
> -- No constant folding over nscoords is being performed.
I was wrong, even gcc 3.3 is doing this. Maybe not as aggressively as for
integers, it's hard to tell.
Created attachment 162852 [details] [diff] [review]
revised testcase

This thing actually runs. In gcc 3.4 the code in main() actually turns out
pretty well (it inlines total_blah). Accessing positive infinity in foobar3
looks okay, but I suspect could be better.

	pushl	%ebp
	movl	$2139095040, %edx
	movl	%esp, %ebp
	subl	$8, %esp
	movl	%edx, -4(%ebp)
	flds	8(%ebp)
	flds	-4(%ebp)
	fcompp
	fnstsw	%ax
	sahf
	je	.L64
	leave
	ret
.L64:
	movl	$.LC17, (%esp)
	call	puts
	leave
	ret
-funsafe-math-optimizations is known to break the JS engine, so we shouldn't be
turning it on globally, and possibly not even in layout unless we're careful to
reset the FPU mode before entering JS code.

Comment 19

13 years ago
There are some cases that multiply nscoords so you might consider putting in a
non-truncating version to multiply nscoords with each other (rather than with a
float whose decimal portion you keep but you want to truncate the result).
Does it actually change the FPU mode, on x86, or just the generated code? Where
would this be documented?
Sorry, I'm thinking of -ffast-math (see bug 261618 comment 4), which may not
reset the mode, indeed.
-ffast-math does several things include turn on -funsafe-math-optimizations. Hrm.
But it looks like in that bug, the problem was -ffinite-math-only, which we
definitely don't want here. I'll make a build with -funsafe-math-optimizations
and see how it goes.
It seems that at our current optimization levels, not all calls through the
overloaded nscoord operators get inlined. We could twiddle the code and
optimization levels to try to get closer to what we want, but that's very
fragile. Currently I think the nscoord wrapper class is not an option. I'm going
to have a look at just typedefing nscoord to float and dealing with the fallout.
Created attachment 163907 [details] [diff] [review]
starting point

I'd like to try doing this incrementally. Reserving nscoords for twips
coordinates and using PRInt32 for pixel counts is good cleanup anyway, IMHO.
I'd like to get nscoord separated enough from PRInt32 that eventually flipping
the switch between PRInt32 and float for nscoord is easy.

This patch is the first step. It introduces the switch and makes
nsUnitConversion and other facilities depend on the switch. It also introduces
nsIntRect, nsIntPoint, nsIntSize and nsIntMargin structures to be used for
rects, points, sizes and margins that are measured in pixels not twips.

After this I will submit patches converting various modules and interfaces to
use the right type for the right units. It's easy to see that as long as the
switch is commented off, changing between nsFoo and nsIntFoo, or PRInt32 and
nscoord, doesn't change any behavior.
Attachment #163907 - Flags: superreview?(bzbarsky)
Attachment #163907 - Flags: review?(bzbarsky)

Updated

13 years ago
Blocks: 266360

Updated

13 years ago
Blocks: 266445

Comment 26

13 years ago
Comment on attachment 163907 [details] [diff] [review]
starting point

>Index: gfx/public/nsCoord.h

>+inline nscoord NSCoordMultiply(nscoord aCoord, float aVal) {
>+#ifdef NS_COORD_IS_FLOAT

Want to VERIFY_COORD here?  Likewise for the other methods that take nscoord
that you introduced?

>+inline PRInt32 NSCoordToInt(nscoord aCoord) {
>+  if (aCoord < -2147483647.5f) {
...
>+  } else if (aCoord > 2147483646.5f) {

Some documentation of what these magic values are would be good (can we get
them by casting PR_INT32_MIN & company to floats?).

Also, I didn't check that the numbers you have there have low enough precision
to be represented as a float without rounding.	I assume you checked that.

>Index: xpcom/ds/nsUnitConversion.h

> inline nscoord NSToCoordRound(float aValue)
> {
>+#ifdef NS_COORD_IS_FLOAT
>+  return floorf(aValue + ROUND_CONST_FLOAT);
>+#else

I guess we're assuming we can't use rint() or nearbyintf() or any thing like
that?

> inline PRInt32 NSToIntFloor(float aValue)
> {
>+#ifdef NS_COORD_IS_FLOAT
>+  return floorf(aValue);

Why does this depend on what nscoord is?  There are no nscoords in sight here.

> inline PRInt32 NSToIntCeil(float aValue)

Same.

> inline nscoord NSFloatPixelsToTwips(float aPixels, float aTwipsPerPixel)
> {
>+#ifdef NS_COORD_IS_FLOAT
>+  nscoord r = aPixels * aTwipsPerPixel;
>+#else
>+  nscoord r = NSToCoordRound(aPixels * aTwipsPerPixel);
>+#endif
>+  VERIFY_COORD(r);

So... if aPixels is some random float (which it could be, here), and
aTwipsPerPixel is some integer-valued float (which we enforce), you get some
random float in r.  There's no reason it'll be a valid coord (integer-valued
float).  I think you need NSToCoordRound() in both cases... or floorf, or
something.

r+sr=bzbarsky with these issues fixed.
Attachment #163907 - Flags: superreview?(bzbarsky)
Attachment #163907 - Flags: superreview+
Attachment #163907 - Flags: review?(bzbarsky)
Attachment #163907 - Flags: review+
(In reply to comment #26)
> (From update of attachment 163907 [details] [diff] [review])
> >Index: gfx/public/nsCoord.h
> 
> >+inline nscoord NSCoordMultiply(nscoord aCoord, float aVal) {
> >+#ifdef NS_COORD_IS_FLOAT
> 
> Want to VERIFY_COORD here?  Likewise for the other methods that take nscoord
> that you introduced?

Sure.

> >+inline PRInt32 NSCoordToInt(nscoord aCoord) {
> >+  if (aCoord < -2147483647.5f) {
> ...
> >+  } else if (aCoord > 2147483646.5f) {
> 
> Some documentation of what these magic values are would be good (can we get
> them by casting PR_INT32_MIN & company to floats?).

Yeah, (float)PR_INT32_MAX + 0.5f and (float)PR_INT32_MIN - 0.5f

> Also, I didn't check that the numbers you have there have low enough precision
> to be represented as a float without rounding.	I assume you checked
> that.

They cannot be represented exactly as floats. So yeah, this could blow up in
some edges cases. Hmm. Turns out the largest 32-bit signed integer representable
as a float is 2147483520, and the smallest is -2147483648. So I'll turn it into
> and < tests for those two values.

> >Index: xpcom/ds/nsUnitConversion.h
> 
> > inline nscoord NSToCoordRound(float aValue)
> > {
> >+#ifdef NS_COORD_IS_FLOAT
> >+  return floorf(aValue + ROUND_CONST_FLOAT);
> >+#else
> 
> I guess we're assuming we can't use rint() or nearbyintf() or any thing like
> that?

I looked at a copy of VC++'s math.h, it's rather impoverished. They don't have
many ...f functions.

> > inline PRInt32 NSToIntFloor(float aValue)
> > {
> >+#ifdef NS_COORD_IS_FLOAT
> >+  return floorf(aValue);
> 
> Why does this depend on what nscoord is?  There are no nscoords in sight here.
> 
> > inline PRInt32 NSToIntCeil(float aValue)
> 
> Same.

Mistake. I'll remove it.

> > inline nscoord NSFloatPixelsToTwips(float aPixels, float aTwipsPerPixel)
> > {
> >+#ifdef NS_COORD_IS_FLOAT
> >+  nscoord r = aPixels * aTwipsPerPixel;
> >+#else
> >+  nscoord r = NSToCoordRound(aPixels * aTwipsPerPixel);
> >+#endif
> >+  VERIFY_COORD(r);
> 
> So... if aPixels is some random float (which it could be, here), and
> aTwipsPerPixel is some integer-valued float (which we enforce), you get some
> random float in r.  There's no reason it'll be a valid coord (integer-valued
> float).  I think you need NSToCoordRound() in both cases... or floorf, or
> something.

For some reason I assumed that aPixels was an integral number of pixels. I'll
NSToCoordRound in both cases.

Comment 28

13 years ago
> So I'll turn it into > and < tests for those two values.

Sounds good, with a comment about why those values were chosen.

Comment 29

13 years ago
Is there a reason to allow coords to go out to 2^32ish given that roundoff
errors start happening past 2^24?
We certainly want to allow nscoords to be any size. That's the main point of
moving to floats. We might get underflow errors in some situations of the form A
+ B == A, but probably the only damage they will cause is possible infinite
loops in some algorithms, which we can detect and prevent quite easily ... much
more easily than doing integer overflow checking everywhere.

The only time you should need to convert an nscoord to an integer is when you're
converting to a device pixel value. So actually we should make the
twips-to-ints/points conversions use NSCoordToInt and there should be few or no
other callers to NSCoordToInt. In almost all cases gigantic pixel offsets will
either be clipped to some known bound or run into some implementation-specific
limit on the size of widgets or whatever.
checked in.
Created attachment 164342 [details] [diff] [review]
fix NSTwipsToIntPixels

NSTwipsToIntPixels needs to go through NSCoordToInt to make sure out-of-bounds
values become the correct integer values. This doesn't change the behaviour for
PRInt32 nscoords.
Comment on attachment 164342 [details] [diff] [review]
fix NSTwipsToIntPixels

minor fixup in nsUnitConversion
Attachment #164342 - Flags: superreview?(bzbarsky)
Attachment #164342 - Flags: review?(bzbarsky)

Comment 34

13 years ago
Robert will
http://lxr.mozilla.org/seamonkey/source/layout/html/table/src/nsTableFrame.cpp#314
use your magic code?

Comment 35

13 years ago
Comment on attachment 164342 [details] [diff] [review]
fix NSTwipsToIntPixels

r+sr=bzbarsky
Attachment #164342 - Flags: superreview?(bzbarsky)
Attachment #164342 - Flags: superreview+
Attachment #164342 - Flags: review?(bzbarsky)
Attachment #164342 - Flags: review+
bernd: that should continue to just work
(What's the relationship between this and the pixel bug? bug 177805.)
They're orthogonal, although the cleanup following here may make that bug easier
to fix.
Created attachment 165092 [details] [diff] [review]
remove nscoord from imglib interfaces and implementation

libpr0n code always deals with raw pixel counts, because that's what images are
about. Make that explicit by converting nscoord and nscoord-based structures to
PRInt32.
Attachment #165092 - Flags: superreview?(tor)
Attachment #165092 - Flags: review?(tor)

Comment 40

13 years ago
Created attachment 165232 [details] [diff] [review]
quiet compiler

the current NSCoordToFloat causes this:
r:\mozilla\rel-i586-pc-msvc\dist\include\gfx\nsCoord.h(153) : warning C4244:
'return' : conversion from 'nscoord' to 'float', possible loss of data
for each and every file that includes nsCoord.h. This is very annoying.

Updated

13 years ago
Attachment #165232 - Flags: superreview?(roc)
Attachment #165232 - Flags: review?(roc)

Updated

13 years ago
Attachment #165232 - Flags: superreview?(roc)
Attachment #165232 - Flags: superreview+
Attachment #165232 - Flags: review?(roc)
Attachment #165232 - Flags: review+

Comment 41

13 years ago
roc, I assume you'll fix the non-imagelib implementations of imgIDecoderObserver
and imgIContainerObserver to use nsIntRect in a separate patch?
yeah

Comment 43

13 years ago
Comment on attachment 165092 [details] [diff] [review]
remove nscoord from imglib interfaces and implementation

Since you're not making binary compatiblity changes, the UUIDs for
the interfaces can remain the same.  Once nscoord is changed, though,
everything using that type or derived types will need to change id.

It appears that gfxImageFrame.{h,cpp} were missed in this sweep.
Attachment #165092 - Flags: superreview?(tor)
Attachment #165092 - Flags: superreview+
Attachment #165092 - Flags: review?(tor)
Attachment #165092 - Flags: review+

Updated

13 years ago
Blocks: 53995
No longer blocks: 53995

Updated

13 years ago
Blocks: 53995
No longer blocks: 53995
Checked in patch 165092

Comment 45

13 years ago
Sorry, for lameness, but here is the question:
can we get some additional benefit from those changes, if platform uses float
natively?
(In BeOS, for example, all drawing happens if floating coordinate system, all
native bounds are reported in float format too. So in the port permanent
coordinate transformation from float to int and back is happening at each step)
Yes. Cairo uses floats natively too. I think we'll get some benefit by avoiding
integer-float conversions (which are relatively expensive on modern processors).
Blocks: 293940

Comment 47

12 years ago
Comment on attachment 165232 [details] [diff] [review]
quiet compiler

2004-11-08 20:57:
mozilla/gfx/public/nsCoord.h	1.7
Attachment #165232 - Attachment is obsolete: true

Updated

11 years ago
Blocks: 323495

Comment 48

11 years ago
Created attachment 259974 [details]
A stab at what might be needed

I've been watching my computer compute a lot in the last while, so I thought I would take a stab at looking into this bug...

The attachment is a rather large patch which is what is needed to get the TRUNK building with a struct instead of a typedef (sorry I had to gzip it).  It's a bit larger than it needs to be since I made the constructor explicit, so it needs lots of "nscoord(0)".  Anyway - these are the issues and things which I found:

1. You can't put a struct (with a constructor) in a union.  There are a few places where this is being done to save memory.  There's also one place where the nscoord is being squeezed into 29bits, which wont work for a float.

2. There are a number of different constants being used for invalid coords.  It would be best to define PR_INT32_MIN as NSCOORD_INVALID and use that instead of -1, 0xdeadbeef, NSCOORD_MIN and a few others.  There are also a few other magic values, including one in an enum, which doesn't work as a struct.

3. The patch has lots of rounding problems.  Particularly, I replaced all Coord/2 calls with NSCoordDivide(Coord,2), which rounds...  But this does indictate a number of places where there are problems and wrong assumptions.  For floats the divisions are a problem if you want to keep nscoords as integers.  I made a few mistakes also when I was starting out.  The next thing I plan on doing is going over the patch and fixing all of the rounding.

4. A few more helper functions could really be useful.  Particularly ones which convert a AppUnit nsRect into a Pixel nsIntRect, and back.  Also for nsSize, nsMargin, nsPoint, etc.  Other ones are a float ratio, a multiply/divide scale fuction, and functions to snap to pixel boundaries.  I'll work on those while looking at the rounding.

5. There are lots of functions which take 4 separate args when they could take a rect, or two for a point/size.

6. I found a few bugs in coord transforms.  I'll open separate bugs on those since the fixes are fairly obvious.

7. The mCursor bounds seems to be used as both AppUnits and pixels?  I've not finished digging into this.  But this is a great place to hide bugs...

8. The Thebes region code should be rewritten to use cairo/pixman regions?

9. A seperate type is needed to for integer device pixels.  This is so conversions from CSS pixels to device pixels are ensured.  This also means gfxIntRect, etc.

roc, I hope you don't mind the bug spam.
  -Jeremy
WOW!!! This is awesome!

I haven't looked at it all, but here's a couple of questions:

-    nsRect   windowRect(aX, aY, aWidth, aHeight);
+    nsRect   windowRect = nsRect(nscoord(aX), nscoord(aY), nscoord(aWidth), nscoord
(aHeight));

Why have you made changes like this --- converting direct construction to copy construction?

The big question is this: have you verified that with reasonable optimization levels, say -O2 on gcc, the code generated by your nscoord is at least as good as what we currently have? In particular if nscoord values aren't placed in registers we're dead in the water. See the comments above in this bug. We would also need to test MSVC++, although I can do that for you if necessary.

If that works out, I'm gung-ho about this patch. A patch of this size will rot fast so I'll try to make this process go as fast as I can.

> The attachment is a rather large patch which is what is needed to get the TRUNK
> building with a struct instead of a typedef (sorry I had to gzip it).  It's a
> bit larger than it needs to be since I made the constructor explicit, so it
> needs lots of "nscoord(0)".

You *might* be able to hack something up such as

  nscoord& operator=(struct __something_we_never_define* aDummy) {
     _v = 0;
     return *this;
  }

Hopefully that means you can write "nscoord x = 0;" but not "nscoord x = 1;".

I think it was a good call to not have a general implicit conversion.

> Anyway - these are the issues and things which I found:
> 
> 1. You can't put a struct (with a constructor) in a union.  There are a few
> places where this is being done to save memory.  There's also one place where
> the nscoord is being squeezed into 29bits, which wont work for a float.

OK, we should probably just make those to PRUint32 and convert backwards and forwards.

> 2. There are a number of different constants being used for invalid coords.  It
> would be best to define PR_INT32_MIN as NSCOORD_INVALID and use that instead of
> -1, 0xdeadbeef, NSCOORD_MIN and a few others.  There are also a few other magic
> values, including one in an enum, which doesn't work as a struct.

OK, that sounds fixable.

> 3. The patch has lots of rounding problems.  Particularly, I replaced all
> Coord/2 calls with NSCoordDivide(Coord,2), which rounds...

We probably want a truncating division instead, to preserve existing behaviour.

> But this does
> indictate a number of places where there are problems and wrong assumptions. 
> For floats the divisions are a problem if you want to keep nscoords as
> integers.  I made a few mistakes also when I was starting out.  The next thing
> I plan on doing is going over the patch and fixing all of the rounding.

It'd be best to not change rounding behaviour in this patch --- it's already big enough :-). Just preserve it and if you think the behaviour might be wrong, add an XXX comment --- say "XXX ROUNDING" --- that we can search on later.

> 4. A few more helper functions could really be useful.  Particularly ones which
> convert a AppUnit nsRect into a Pixel nsIntRect, and back.  Also for nsSize,
> nsMargin, nsPoint, etc.  Other ones are a float ratio, a multiply/divide scale
> fuction, and functions to snap to pixel boundaries.  I'll work on those while
> looking at the rounding.

Yeah, that sounds great if it shrinks the size of this patch. If you find that it isn't shrinking the size of this patch, don't do it now.

> 5. There are lots of functions which take 4 separate args when they could take
> a rect, or two for a point/size.

Ditto. Also, if there are changes of that sort that we can make right away, that would simplify this patch, you could split them off and do them now. Also, this patch changes a bunch of things from nsSize/Rect/Point to nsIntSize/Rect/Point --- we could split those out and land them early to reduce the size of the patch.

> 6. I found a few bugs in coord transforms.  I'll open separate bugs on those
> since the fixes are fairly obvious.

Great

> 7. The mCursor bounds seems to be used as both AppUnits and pixels?  I've not
> finished digging into this.  But this is a great place to hide bugs...

Yeah, which is why I'd like to have this patch.

> 8. The Thebes region code should be rewritten to use cairo/pixman regions?

I don't think so --- pixman regions are limited to 16-bit coordinates, which is really lame. Anyway, nsIRegion and its implementations are very little used in the current codebase, and with a bit of effort we could make them go away completely, so we should not spend unnecessary effort on them.

> 9. A seperate type is needed to for integer device pixels.  This is so
> conversions from CSS pixels to device pixels are ensured.  This also means
> gfxIntRect, etc.
> 
> roc, I hope you don't mind the bug spam.

I'd like to receive this kind of "bug spam" every day :-).

Even if we don't switch to floating-point layout, having this struct would be very useful for catching/debugging integer overflows, which is becoming more important.

Comment 50

11 years ago
(In reply to comment #49)
> > The attachment is a rather large patch which is what is needed to get the TRUNK
> > building with a struct instead of a typedef (sorry I had to gzip it).  It's a
> > bit larger than it needs to be since I made the constructor explicit, so it
> > needs lots of "nscoord(0)".
> 
> You *might* be able to hack something up such as
> 
>   nscoord& operator=(struct __something_we_never_define* aDummy) {
>      _v = 0;
>      return *this;
>   }
> 
> Hopefully that means you can write "nscoord x = 0;" but not "nscoord x = 1;".

No, that's an explicit constructor.

Without reading the patch I'm guessing that Jeremy's referring to passing a number as a parameter to a method that expects an nscoord. If you write a constructor along the lines of your assignment operator then you would be able to write foo(0) instead of foo(nscoord(0)).

Comment 51

11 years ago
One more thing I forgot to mention before.  This patch would bitrot a lot of the debug code, because you can't pass a struct to printf...  I've converted all of the current debug code, but I'm not going to try enable every possible debug option and fix everyone's code...

(In reply to comment #49)
> Why have you made changes like this --- converting direct construction to copy
> construction?

Mostly because I fixed the first compile error, then found that the function signature was wrong, then fixed it, then didn't fix the first fix right... ;-)

> The big question is this: have you verified that with reasonable optimization
> levels, say -O2 on gcc, the code generated by your nscoord is at least as good
> as what we currently have? In particular if nscoord values aren't placed in
> registers we're dead in the water. See the comments above in this bug. We would
> also need to test MSVC++, although I can do that for you if necessary.

I haven't tested this at all...  My first priority has been to get it to compile, and to try to work out the various ins and outs of whats being done where, and to look at some of the rounding issues.

However, if you really wanted to be evil, then the simple solution here is to only use the struct in the debug case, and put in a bunch of asserts about not overflowing, or using unconstrained sizes and those sorts of things.  But use a typedef for the optimized builds.  Then throw lots of big coords and other junk at a debug build via the test suit, and put some real checks in at the places where coords can actually overflow, since these are probably limited.  I suspect the list consists of CSS->appunit conversions, long lines of text, and some places where inner boxes are added up to get the top width (tables and frames).  The various paths should show up quickly in assert backtraces, if a quick look through bugzilla doesn't show them already...

> If that works out, I'm gung-ho about this patch. A patch of this size will rot
> fast so I'll try to make this process go as fast as I can.

I'm happy to keep this up to date.  It's a build that I'm not using and it's on a machine that is mostly idle...  And I'll have some time to kill while I wait for some big finite element models to run...

>   nscoord& operator=(struct __something_we_never_define* aDummy) {
>      _v = 0;
>      return *this;
>   }

This works great!  Thanks for the tip.

> > 1. You can't put a struct (with a constructor) in a union.  There are a few
> > places where this is being done to save memory.  There's also one place where
> > the nscoord is being squeezed into 29bits, which wont work for a float.
> 
> OK, we should probably just make those to PRUint32 and convert backwards and
> forwards.

This would be another advantage to using the evil trick above.  You could use the typedef in a union, or have it be a struct for debug builds, where memory isn't an issue.  There are four of these.  Two in the style system, where they appear to be memory savers (nsROCSSPrimativeValue and nsStyleStruct), one in the table code (BCVerticalSeg), although that's not really saving much, and one introduced recently in the the table computations which appears to just be saving stack space.

> Yeah, that sounds great if it shrinks the size of this patch. If you find that
> it isn't shrinking the size of this patch, don't do it now.

These will shrink it considerably, and make it easier to read...

> Ditto. Also, if there are changes of that sort that we can make right away,
> that would simplify this patch, you could split them off and do them now. Also,
> this patch changes a bunch of things from nsSize/Rect/Point to
> nsIntSize/Rect/Point --- we could split those out and land them early to reduce
> the size of the patch.

I think that this probably the best place to start with landing this.  I'll go through my big patch and make up some smaller patches which just have these changes, which at the moment are noops, and then they can be landed before any playing with the coordinate stuff.

Regards,
  -Jeremy
> However, if you really wanted to be evil, then the simple solution here is to
> only use the struct in the debug case

I suppose that's an option. It would be nice to avoid though, because there's a risk of making debug build behaviour diverge from opt build behaviour in subtle ways. (That risk is already present for other reasons, of course, so this is not a deal-breaker.)

Comment 53

10 years ago
Sorry I've been so quite on this...  I've been working on a big issue I uncovered by driving a few more of the forced conversions down.  The problem is that all  of the SVG code uses the nsRect mRect which is a member of nsIFrame, and the nsIFrame methods - but it uses pixels in all of its code.  This is really unhealthy.  I'm trying to figure out what to do about this...  Should the SVG code convert all of these to nscoords or should the SVG code be made to use a nsIntRect and leave the nsFrame interfaces alone?  I'm not sure.  The first means lots of conversions.  The second means that the nsIFrame interfaces wont be correct (well less correct).

I've tracked down the one issue with cursor coordinates.  I'll try to work on a separate patch for that soon.  The coordinates are being placed into an nsRect in pixels and fixed up later in the event handler.

Regards,
  -Jeremy
An ugly but maybe good-enough solution might be to define a method in SVG frames like

    nsIntRect& GetSVGRect() {
        return *NS_REINTERPRET_CAST(nsIntRect*, &mRect);
    }

and use that everywhere in SVG
tor, see last couple of comments.

Comment 56

10 years ago
I'm not sure I see the problem - SVG doesn't currently participate in the normal reflow mechanism, so we reused the mRect for our purposes on SVG leaves (and svg:fO) instead of having our own rect and wasting memory.  With your patch it appears that nsRect can still store the data we need, and the number of conversions needed in the SVG code appear small at first glance.

Comment 57

10 years ago
(In reply to comment #56)
> I'm not sure I see the problem - SVG doesn't currently participate in the
> normal reflow mechanism, so we reused the mRect for our purposes on SVG leaves
> (and svg:fO) instead of having our own rect and wasting memory.  With your
> patch it appears that nsRect can still store the data we need, and the number
> of conversions needed in the SVG code appear small at first glance.
> 

If (when?) nscoord is changed to a float, then you might encounter problems with float<->int conversion (besides wasting CPU cycles).  Looking through the uses, the mRect is mostly passed to InvalidateRect, which converts it to app units and passes it to Invalidate.  So you can just do the conversion up front, and not abuse the mRect...

Comment 58

10 years ago
Bug 379616 switches SVG frames' use of mRect to app units.

Updated

10 years ago
Summary: Need to deal with integer overflows in reflow → Need to deal with integer overflows in reflow (consider making nscoords be floats)

Updated

10 years ago
No longer blocks: 323495

Updated

10 years ago
Blocks: 323381

Updated

10 years ago
Blocks: 411319

Updated

8 years ago
Blocks: 391879
Assignee: roc → nobody

Updated

7 years ago
Blocks: 552412

Updated

7 years ago
Blocks: 575011

Updated

6 years ago
Duplicate of this bug: 183986
Duplicate of this bug: 748257
See Also: → bug 1296577
You need to log in before you can comment on or make changes to this bug.