Closed
Bug 909528
Opened 12 years ago
Closed 1 year ago
The "decimal" characteristic of ranges needs documentation
Categories
(Core :: JavaScript Engine, enhancement, P5)
Core
JavaScript Engine
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: Waldo, Unassigned)
References
(Blocks 1 open bug)
Details
Attachments
(1 file, 1 obsolete file)
nbp explained what this was supposed to mean to me, once. But I've since forgotten exactly what the explanation meant. And in any case, that's hardly helpful to anyone else trying to understand this by reading the code. :-)
Comment 1•12 years ago
|
||
Attachment #796800 -
Flags: review?(jwalden+bmo)
| Reporter | ||
Comment 2•12 years ago
|
||
Comment on attachment 796800 [details] [diff] [review]
Improve comment of range analysis, and rename decimal to fractional part.
Review of attachment 796800 [details] [diff] [review]:
-----------------------------------------------------------------
I'm going to be kind of ["kind of"? -ed.] picky about all this, and how it's formatted/structured. I think it's probably a good idea for us to take the small parts and run with them, then keep iterating on the rest in subsequent patches, til we reach a fully-good-overall state. (If I don't stop on this now, I'm never gonna. :-) )
Documentation is *hard*, yo.
::: js/src/jit/RangeAnalysis.cpp
@@ +862,5 @@
> ++exp;
> }
> setRange(new Range(l, h, (rest != 0), exp));
> } else {
> + // This double has a precision loss. This also mean that it is too large
means
::: js/src/jit/RangeAnalysis.h
@@ +103,2 @@
> // operations. Double has 52 bits of mantissa, so 2^52+1 cannot be
> // represented without loss.
This should be:
Double has 52 bits of mantissa (plus an implicit leading 1 bit), so 2**53 + 1 can't be represented exactly.
Also the line before should be "precision", not "precission".
@@ +108,5 @@
> static const uint16_t MaxDoubleExponent = mozilla::DoubleExponentBias;
>
> private:
> + // A range is a continuous set of numbers included between a lower bound and
> + // an upper bound. A range is used as a super-set which includes all output
superset
@@ +109,5 @@
>
> private:
> + // A range is a continuous set of numbers included between a lower bound and
> + // an upper bound. A range is used as a super-set which includes all output
> + // which can be returned by a MDefinition.
Reading straightly in-order I got confused by a few things here. I'm going to try to rewrite this, somewhat, so that it doesn't set flags when I read it. I'll try to list the things I'm trying to fix as I go.
"""
A range is the continuous set of numbers included between a lower bound and an upper bound. It represents all possible numbers which might be the value of some mathematical expression. Ranges are capable of holding any IEEE-754 number, including infinities and NaN. (The goal in practice, of course, is to compute the smallest range possible for an expression.)
"""
Range is conceptually separate from MDefinition -- there's no reason to drag it into the definition of, fundamentally, a mathematical data structure. So this shouldn't mention anything more than numbers, sets, ranges, etc.
I think this comment should move to above Range, outside the class definition. This sort of class-overview comment, as long as it's not referring to internal fields, best goes there. Field-by-field stuff, of course, like below, should stick here.
@@ +114,2 @@
> //
> + // * Int32:
I'm not sure I agree with the strategy of discussing how various kinds of data are encoded in a Range. What matters as far as this internal documentation goes, is how a Range with fields with particular values should be interpreted. For example: if |lower_| has value |x|, how does that affect the conceptual range being represented?
What we really want here, as documentation accompanying the fields, is a structurally-directed description of the set of values represented by any assignment of values to a Range's fields. (We can do the "how do I get a range that looks like so" part elsewhere -- in method and constructor documentation, I think.) So, then, following after the paragraph in the previous comment, how does this (following that tack) sound to you?
"""
The most fundamental components of a range are its int32_t lower_ and upper_. It is an invariant that lower_ <= upper_. These fields usually determine the lower and upper bounds of the represented range, *except* when |lower_ == INT32_MIN| or |upper_ = INT32_MAX|. Those values may represent those numeric bounds, but they may be overloaded to indicate that the range is "open" in that direction, extending beyond INT32_MIN or INT32_MAX. An open <dir>_ bound is indicated by |<dir>_infinite_ == true|. (It is an invariant that |lower_infinite_ == true| implies |lower_ == INT32_MIN|, and that |upper_infinite_ == true| implies |upper_ == INT32_MAX|.)
An open bound's extent is determined by max_exponent_, a conservative measure of the magnitude of all numbers in the range. All numbers in the range (whether closed, half-open, or fully open -- although in the first two cases, the actual bound is usually a better measure, and never worse) must have magnitude less than 2**(max_exponent_ + 1). (Why + 1? This is the exponent field from IEEE-754 representation, where an exponent encodes numbers up to the next greater power of two.) The sole exception is if max_exponent_ is so large that 2**(max_exponent_ + 1) exceeds IEEE-754 double's finite range -- when |max_exponent_ >= 1023|. This indicates that the bound extends to infinity in the relevant direction, and it adds NaN to the set of represented values.
The final component assigning mathematical meaning to a range is decimal_. If |decimal_ == false|, the set of numbers consists of integral values in the already-determined range. (Important: if 0 is in the range, -0 is included in the ultimate set! You must be careful to account for -0 as well as +0 in such cases.) If it is true, the range consists of all IEEE-754 numbers in that range -- including fractional values.
The remaining two fields, symbolicLower_ and symbolicUpper_, play no part in determining the mathematical value of a range. (They should perhaps move out of Range, actually, into the set of MIR nodes that contain Range*, at some point.)
"""
It may also be worth including examples of what various field-value assignments mean in terms of ultimate represented set, for extra clarity beyond the complex prose:
"""
Here are some examples of what various field assignments mean, for (lower_, lower_infinite_, upper_, upper_infinite_, hasFractionalPart_, max_exponent_):
(1, false, 4, false, false, 2)
{1, 2, 3, 4}
(1, false, 2, false, true, 1)
{1, 1 + 2**-52, 1 + 2**-51, 1 + 2**-51 + 2**-52, ..., 2 - 2**-52, 2}
(0, false, 5, false, true, 3)
all floating-point values from -0 to 5
(INT32_MIN, false, INT32_MAX, false, false, 32)
all int32_t values, and -0
(INT32_MIN, true, INT32_MAX, false, false, 34)
all integers greater than -2**(34 + 1) and less than or equal to INT32_MAX, and -0
(0, false, INT32_MAX, true, false, 48)
non-negative integers less than 2**49, and -0
(-17, false, INT32_MAX, true, false, 1023)
all integers from -17 to the largest integrally-precise double value; plus every larger, finite double value; plus +Infinity; plus -0
"""
@@ +128,5 @@
> + //
> + // 51: Range(51, 51)
> + // N[-2, 5]: Range(-2, 5)
> + //
> + // * UIn32:
These special-case kinds of ranges, I think, should be described by the methods that generate or act upon them. This location is fine for describing the meanings of the fields internally, but people should be looking to public method documentation to understand stuff.
@@ +142,5 @@
> + // set.
> + //
> + // N[0, 5]: Range(0, 5)
> + // N[0, 5]: Range(0, 5)
> + // N[0, 2^32]: Range(0, 2^32, false, 32) { and use extendUInt32ToInt32Min() }
I think you probably meant 2**32 - 1 in two places here.
@@ +149,5 @@
> + //
> + // A truncatable int is a double value which does not contain any fractional
> + // part and which exponent is small enough to fit on the mantissa of a
> + // double without any precision lost. The predicate "hasRoundingErrors"
> + // will return false if this number can be represented as truncatable int.
This bit should go next to hasRoundingErrors, roughly. Although I think it should be rephrased somewhat in terms of what a range that "has rounding errors" looks like: that it may contain non-integral numbers or may include the results of computations too large to be precisely represented as doubles.
@@ +187,5 @@
> int32_t lower_;
> bool lower_infinite_;
>
> int32_t upper_;
> bool upper_infinite_;
The *_infinite_ names are wrong. They indicate whether the relevant bound is open or not, but they *don't* indicate that infinity is a possible value. Could you rename these (in a followup if desired) to lower_open_ and so on, maybe?
Also, all these fields are not well packed. We should reorder them so they pack better.
@@ +189,5 @@
>
> int32_t upper_;
> bool upper_infinite_;
>
> + bool hasFractionalPart_;
What would you think to naming this integral_ and inverting its meaning? (Fine to do in a separate patch, of course.) That makes clear that only integers are part of the range in that case. "hasFractionalPart_" seems to claim somewhat too strongly that there's a fractional bit, to me, partly because of the "has" there. "mayHaveFractionalPart_" would be an improvement, too, I think.
@@ +305,1 @@
> inline bool isInt32() const {
All these methods need comments on them, I think, to explain exactly what they promise. More followup.
Attachment #796800 -
Flags: review?(jwalden+bmo)
Comment 3•12 years ago
|
||
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #2)
> @@ +187,5 @@
> > int32_t lower_;
> > bool lower_infinite_;
> >
> > int32_t upper_;
> > bool upper_infinite_;
>
> The *_infinite_ names are wrong. They indicate whether the relevant bound
> is open or not, but they *don't* indicate that infinity is a possible value.
> Could you rename these (in a followup if desired) to lower_open_ and so on,
> maybe?
Could I suggest lower_unbounded_? "open" has a different meaning in the context of numeric ranges.
| Reporter | ||
Comment 4•12 years ago
|
||
That's true, isn't it? :-) That works for me.
Comment 5•12 years ago
|
||
Attached is a patch for your consideration. It doesn't address the documentation issue, but it does clean up several of the Range API's concepts, which I hope will help.
- "Int32" concepts no longer include fractional values.
- All the Range class invariants are collected into one place and enforced consistently.
- It's clear when and where max_exponent_ should be set (\o/).
- The decimal->fractional rename and infinite->unbounded renames.
- It introduces new interfaces for dealing with unbounded ranges.
- And various minor bug fixes.
If this is interesting, I can break up the patch into multiple patches to simplify reviewing, if desired.
Comment 6•12 years ago
|
||
Comment on attachment 800523 [details] [diff] [review]
a bunch of changes
Review of attachment 800523 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit/RangeAnalysis.h
@@ +369,5 @@
>
> + // Return the lower bound, or NegativeUnbounded if it's unbounded.
> + int64_t lowerOrUnbounded() const {
> + if (isLowerUnbounded())
> + return NegativeUnbounded;
We need to be extremely careful if we extend the range to int64, as we might risk to overflow while calculating the ranges.
Comment 7•12 years ago
|
||
Comment on attachment 800523 [details] [diff] [review]
a bunch of changes
This patch was significantly revised and submitted under bug 915846.
Attachment #800523 -
Attachment is obsolete: true
Comment 8•12 years ago
|
||
A quick update on the status here: lower_infinite_ and friends are renamed to hasInt32LowerBound_ and similar, and decimal_ and friends are renamed to canHaveFractionalPart_ and similar. The main work to be done here now is to finish the comment changes discussed in Comment 2 here.
| Assignee | ||
Updated•11 years ago
|
Assignee: general → nobody
Updated•3 years ago
|
Severity: minor → S4
Comment 9•2 years ago
|
||
It looks like good chunks of the patch on this bug have landed; I suspect we can just resolve this as fixed?
Flags: needinfo?(nicolas.b.pierron)
Comment 10•1 year ago
|
||
Well this bug is about adding a big fat comment to Range Analysis internal implementation of ranges, which might be difficult as ranges are precise on small integer and become approximations on double like values.
However, nobody complain in a while for the lack of documentation in this area of the code … possibly because of the large number of property checks added since, as well as the many named shortcuts for creating ranges of different kinds.
Also, this is no longer the point of interest anymore, and I do not see many new Range Analysis patches these days.
Thus, I will mark it as works-for-me in the mean time.
Status: NEW → RESOLVED
Closed: 1 year ago
Flags: needinfo?(nicolas.b.pierron)
Resolution: --- → WORKSFORME
Updated•1 year ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•