Closed
Bug 672641
Opened 14 years ago
Closed 14 years ago
Implement CharTokenizer which takes arbitrary chars as delimiters
Categories
(Core :: DOM: Core & HTML, defect)
Core
DOM: Core & HTML
Tracking
()
RESOLVED
FIXED
mozilla9
People
(Reporter: hippopotamus.gong, Assigned: hippopotamus.gong)
References
Details
Attachments
(1 file, 3 obsolete files)
|
4.41 KB,
patch
|
hippopotamus.gong
:
review+
|
Details | Diff | Splinter Review |
Currently we only have nsCharSeperatedTokenizer and nsWhitespaceTokenizer, which have their specific use cases.
The nsCharTokenizer can take any char as a delimiter and tokenize the string based on that.
| Assignee | ||
Comment 1•14 years ago
|
||
Tested using the MozURL and tests passed.
Attachment #546903 -
Flags: review?(jonas)
Comment 2•14 years ago
|
||
Could you make that mozilla::{,C}CharTokenizer? (I guess a template wouldn't work here, because of the fake templating we do for our strings.)
| Assignee | ||
Comment 3•14 years ago
|
||
Attachment #546903 -
Attachment is obsolete: true
Attachment #546921 -
Flags: review?(jonas)
Attachment #546903 -
Flags: review?(jonas)
| Assignee | ||
Updated•14 years ago
|
Summary: nsCharTokenizer which takes arbitrary chars as delimiters → Implement CharTokenizer which takes arbitrary chars as delimiters
Comment on attachment 546921 [details] [diff] [review]
Changed to mozilla:CharTokenizer
Review of attachment 546921 [details] [diff] [review]:
-----------------------------------------------------------------
::: xpcom/ds/CharTokenizer.h
@@ +45,5 @@
> +
> +class CharTokenizer
> +{
> +public:
> + CharTokenizer(const nsSubstring& aSource, PRUnichar delimiter)
Make the delimiter a template argument rather than a constructor argument.
@@ +52,5 @@
> + aSource.BeginReading(mIter);
> + aSource.EndReading(mEnd);
> +
> + while (mIter != mEnd && isDelimiter(*mIter)) {
> + ++mIter;
This is not what you want. For a string like "&foo&bar" it'll just create two tokens, "foo" and "bar", whereas you want "", "foo" and "bar".
@@ +75,5 @@
> + ++mIter;
> + }
> + nsSubstring::const_char_iterator end = mIter;
> + while (mIter != mEnd && isDelimiter(*mIter)) {
> + ++mIter;
Again, this is not what you want. For a string like "foo&&bar" it'll create the two tokens "foo" and "bar", whereas you want "foo", "" and "bar".
Attachment #546921 -
Flags: review?(jonas) → review-
| Assignee | ||
Comment 5•14 years ago
|
||
Attachment #546921 -
Attachment is obsolete: true
Attachment #546960 -
Flags: review?(jonas)
Comment on attachment 546960 [details] [diff] [review]
Corrected the problems
Review of attachment 546960 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good
Attachment #546960 -
Flags: review?(jonas) → review+
Comment 7•14 years ago
|
||
Can we have some tests?
| Assignee | ||
Comment 8•14 years ago
|
||
(In reply to comment #7)
> Can we have some tests?
Definitely. I will put tests soon.
| Assignee | ||
Comment 9•14 years ago
|
||
(In reply to comment #7)
> Can we have some tests?
Hey Mounir, it turned out that creating new C++ unit tests with nsString is troublesome, so I will test the tokenizer's results in JavaScript in the related bug 668680.
Comment 10•14 years ago
|
||
Why is it troublesome? I mean, we should probably open a bug if we can't test that kind of stuff: we should be able to!
Comment 11•14 years ago
|
||
Comment on attachment 546960 [details] [diff] [review]
Corrected the problems
>--- /dev/null
>+++ b/xpcom/ds/CharTokenizer.h
>+#ifndef __CharTokenizer_h
>+#define __CharTokenizer_h
Please make that mozilla_CharTokenizer_h
>+#endif /* __CharTokenizer_h */
>+
>+
And get rid of these empty lines.
>diff --git a/xpcom/ds/Makefile.in b/xpcom/ds/Makefile.in
>--- a/xpcom/ds/Makefile.in
>+++ b/xpcom/ds/Makefile.in
>@@ -108,16 +108,17 @@ EXPORTS = \
>+ CharTokenizer.h \
Please add it to EXPORTS_mozilla instead
| Assignee | ||
Comment 12•14 years ago
|
||
Attachment #546960 -
Attachment is obsolete: true
| Assignee | ||
Comment 13•14 years ago
|
||
(In reply to comment #10)
> Why is it troublesome? I mean, we should probably open a bug if we can't
> test that kind of stuff: we should be able to!
Will look into it later
Updated•14 years ago
|
Assignee: nobody → shawn
Blocks: 668680
OS: Mac OS X → All
Hardware: x86 → All
Version: unspecified → Trunk
| Assignee | ||
Comment 14•14 years ago
|
||
Comment on attachment 548290 [details] [diff] [review]
Fixes according to Ms2ger's comments
Review+ by sicking
Attachment #548290 -
Flags: review+
Checked in to inbound:
http://hg.mozilla.org/integration/mozilla-inbound/rev/edb32fa29bca
Yay!
Comment 16•14 years ago
|
||
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla9
Comment 17•13 years ago
|
||
(In reply to Shawn Gong from comment #0)
> Currently we only have nsCharSeperatedTokenizer and nsWhitespaceTokenizer,
> which have their specific use cases.
>
> The nsCharTokenizer can take any char as a delimiter and tokenize the string
> based on that.
That sounds exactly like what nsCharSeperatedTokenizer does, aside from that nsCharSeperatedTokenizer does whitespace-skipping (which is configurable, since you can pass in a custom "is whitespace" method, and you could pass in a method that unconditionally returns false to prevent whitespace-skipping).
Aside from that, are there any other things that distinguish nsCharTokenizer from nsCharSeperatedTokenizer?
Comment 18•13 years ago
|
||
Actually, it looks like CharTokenizer isn't used at all, aside from a chunk of recently-added code in SVGFragmentIdentifier.cpp (as well as in the awaiting-review patch in bug 668680).
I suspect both of those could easily use a variant of nsCharSeperatedTokenizer, and this class CharTokenizer class could go away... I'm just worried that having two nearly-identically-named classes with different impls for the same use-case will cause confusion & increased surface for bugs to crop up. Happy to file a new bug on that; I'm just commenting here to check if I'm misunderstanding the use-cases / purpose of CharTokenizer, before doing that.
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•