Closed Bug 1295941 Opened 8 years ago Closed 6 years ago

Add helpers to mozilla::Tokenizer to read negative integers

Categories

(Core :: XPCOM, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla59
Tracking Status
firefox59 --- fixed

People

(Reporter: mayhemer, Assigned: mayhemer)

Details

Attachments

(1 file, 2 obsolete files)

Now when the tokenizer hits an input like "-123" it's parsed as char('-') and integer(123).  But it might be better to support reading negative numbers too.

But I'm also thinking of reading range inputs like "100-200", which might rather read as integer(100), char('-'), integer(200) and not integer(100) and integer(-200).

We have a method ReadInteger that doesn't search for a '-' sign (only reads non-negative integers).  Hence, the option may be to introduce ReadSignedInteger method as a 'higher' method that simply consists of two operations: check for a '-' sign and an immediate integer after it.  If all of it fails, rollback and return false.

When here, we may also introduce ReadSignedFloat to also add a higher method for reading numbers in the -123.456 notation.
Priority: -- → P3
Status: NEW → ASSIGNED
Summary: Add helpers to mozilla::Tokenizer to read negative integers and float numbers → Add helpers to mozilla::Tokenizer to read negative integers
Attached patch v1 (obsolete) — Splinter Review
Should be self-explanatory.  I was trying to have just a single method (ReadInteger) with two implementations for signed and unsigned types using EnableIf but I'm not an expert to templates.  If you have an idea, please let me know.  I honestly don't even know if that is possible.  Thanks.
Attachment #8941586 - Flags: review?(nfroyd)
Comment on attachment 8941586 [details] [diff] [review]
v1

Review of attachment 8941586 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/ds/Tokenizer.h
@@ +393,5 @@
> +      mCursor = cursor;
> +      mHasFailed = true;
> +    });
> +
> +    bool minus = CheckChar('-');

I *think* this doesn't work if `mAdditionalWordChars` contains '-'. I'm guessing that's why I used the function version of `CheckChar` in the NSPR log module parsing code [1].

It would be nice if I could get rid of that oddity, but I understand if you don't want to support it. Either way it might be nice to add an explicit test for interaction with `mAddtionalChars`.

[1] https://searchfox.org/mozilla-central/rev/88439dc6c5b02e167032374071cdc697a056efa1/xpcom/base/NSPRLogModulesParser.cpp#33-37
(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #2)
> Comment on attachment 8941586 [details] [diff] [review]
> v1
> 
> Review of attachment 8941586 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: xpcom/ds/Tokenizer.h
> @@ +393,5 @@
> > +      mCursor = cursor;
> > +      mHasFailed = true;
> > +    });
> > +
> > +    bool minus = CheckChar('-');
> 
> I *think* this doesn't work if `mAdditionalWordChars` contains '-'. I'm
> guessing that's why I used the function version of `CheckChar` in the NSPR
> log module parsing code [1].

Oh, it's a very good point, thanks!

> 
> It would be nice if I could get rid of that oddity, but I understand if you
> don't want to support it. 

I want to keep the tokenizer simple and clear.  Maybe we could have a CheckRawChar method or something, but it already brings some non-clarity why that should be used.

Let's leave it for now as it is, with the oddity we have instead of adding potentially just more oddities.

> Either way it might be nice to add an explicit
> test for interaction with `mAddtionalChars`.

Will do!

> 
> [1]
> https://searchfox.org/mozilla-central/rev/
> 88439dc6c5b02e167032374071cdc697a056efa1/xpcom/base/NSPRLogModulesParser.
> cpp#33-37
Attached patch v1.1 (obsolete) — Splinter Review
Fixed according comment 2 and updated the nspr log parser with the new API.   I also made this to build on linux and loosen the condition for the result integer be strictly unsigned when using the ReadInteger method, there is a lot of places passing in a signed int, but we are save thanks the CheckedInt usage!

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a5ab37fccd96f9295a97d9883b3080e65e4dc24c (no tests, just check it builds)
Attachment #8941586 - Attachment is obsolete: true
Attachment #8941586 - Flags: review?(nfroyd)
Attachment #8941853 - Flags: review?(nfroyd)
Comment on attachment 8941853 [details] [diff] [review]
v1.1

Review of attachment 8941853 [details] [diff] [review]:
-----------------------------------------------------------------

Eric, can you review this since you've already looked at the code a little bit?
Attachment #8941853 - Flags: review?(nfroyd) → review?(erahm)
Comment on attachment 8941853 [details] [diff] [review]
v1.1

Review of attachment 8941853 [details] [diff] [review]:
-----------------------------------------------------------------

r=me

::: xpcom/ds/Tokenizer.h
@@ +376,5 @@
>    /**
> +   * Same as above, but accepts an integer with an optional minus sign.
> +   */
> +  template <typename T,
> +            typename U = typename RemovePointer<T>::Type,

Seems like you shouldn't need `U`.
Attachment #8941853 - Flags: review?(erahm) → review+
Attached patch v1.2Splinter Review
Removed the U param, builds on both linux and win, tests pass.
Attachment #8941853 - Attachment is obsolete: true
Attachment #8943255 - Flags: review+
Keywords: checkin-needed
Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/mozilla-inbound/rev/7048261aef79
Let mozilla::Tokenizer read signed integers from the input, r=erahm
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/7048261aef79
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
I'm late to the party here, but: I think it's better to treat the '-' as a separate token and make the parser deal with it. 

- It works for cases like "- 1" where there is whitespace between the '-' and the number.

- It works for cases like "100-200" as mentioned in comment 0.

Otherwise, either we end up not handling one or both of those cases, or both the tokenizer and parser have to worry about negating integers.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: