Closed Bug 672641 Opened 14 years ago Closed 14 years ago

Implement CharTokenizer which takes arbitrary chars as delimiters

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla9

People

(Reporter: hippopotamus.gong, Assigned: hippopotamus.gong)

References

Details

Attachments

(1 file, 3 obsolete files)

Referenced from nsWhitespaceTokenizer.h 14 years ago Shawn Gong 4.90 KB, patch		Details \| Diff \| Splinter Review
Changed to mozilla:CharTokenizer 14 years ago Shawn Gong 4.93 KB, patch	sicking : review-	Details \| Diff \| Splinter Review
Corrected the problems 14 years ago Shawn Gong 4.46 KB, patch	sicking : review+	Details \| Diff \| Splinter Review
Fixes according to Ms2ger's comments 14 years ago Shawn Gong 4.41 KB, patch	hippopotamus.gong : review+	Details \| Diff \| Splinter Review

Shawn Gong

Assignee

Description

•

14 years ago

Currently we only have nsCharSeperatedTokenizer and nsWhitespaceTokenizer, which have their specific use cases. The nsCharTokenizer can take any char as a delimiter and tokenize the string based on that.

Shawn Gong

Assignee

Comment 1

•

14 years ago

Attached patch Referenced from nsWhitespaceTokenizer.h (obsolete) — Details — Splinter Review

Tested using the MozURL and tests passed.

Attachment #546903 - Flags: review?(jonas)

:Ms2ger (he/him; ⌚ UTC+1/+2)

Comment 2

•

14 years ago

Could you make that mozilla::{,C}CharTokenizer? (I guess a template wouldn't work here, because of the fake templating we do for our strings.)

Shawn Gong

Assignee

Comment 3

•

14 years ago

Attached patch Changed to mozilla:CharTokenizer (obsolete) — Details — Splinter Review

Attachment #546903 - Attachment is obsolete: true

Attachment #546921 - Flags: review?(jonas)

Attachment #546903 - Flags: review?(jonas)

Shawn Gong

Assignee

Updated

•

14 years ago

Summary: nsCharTokenizer which takes arbitrary chars as delimiters → Implement CharTokenizer which takes arbitrary chars as delimiters

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 4

•

14 years ago

Comment on attachment 546921 [details] [diff] [review] Changed to mozilla:CharTokenizer Review of attachment 546921 [details] [diff] [review]: ----------------------------------------------------------------- ::: xpcom/ds/CharTokenizer.h @@ +45,5 @@ > + > +class CharTokenizer > +{ > +public: > + CharTokenizer(const nsSubstring& aSource, PRUnichar delimiter) Make the delimiter a template argument rather than a constructor argument. @@ +52,5 @@ > + aSource.BeginReading(mIter); > + aSource.EndReading(mEnd); > + > + while (mIter != mEnd && isDelimiter(*mIter)) { > + ++mIter; This is not what you want. For a string like "&foo&bar" it'll just create two tokens, "foo" and "bar", whereas you want "", "foo" and "bar". @@ +75,5 @@ > + ++mIter; > + } > + nsSubstring::const_char_iterator end = mIter; > + while (mIter != mEnd && isDelimiter(*mIter)) { > + ++mIter; Again, this is not what you want. For a string like "foo&&bar" it'll create the two tokens "foo" and "bar", whereas you want "foo", "" and "bar".

Attachment #546921 - Flags: review?(jonas) → review-

Shawn Gong

Assignee

Comment 5

•

14 years ago

Attached patch Corrected the problems (obsolete) — Details — Splinter Review

Attachment #546921 - Attachment is obsolete: true

Attachment #546960 - Flags: review?(jonas)

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 6

•

14 years ago

Comment on attachment 546960 [details] [diff] [review] Corrected the problems Review of attachment 546960 [details] [diff] [review]: ----------------------------------------------------------------- Looks good

Attachment #546960 - Flags: review?(jonas) → review+

Mounir Lamouri (:mounir)

Comment 7

•

14 years ago

Can we have some tests?

Shawn Gong

Assignee

Comment 8

•

14 years ago

(In reply to comment #7) > Can we have some tests? Definitely. I will put tests soon.

Shawn Gong

Assignee

Comment 9

•

14 years ago

(In reply to comment #7) > Can we have some tests? Hey Mounir, it turned out that creating new C++ unit tests with nsString is troublesome, so I will test the tokenizer's results in JavaScript in the related bug 668680.

Mounir Lamouri (:mounir)

Comment 10

•

14 years ago

Why is it troublesome? I mean, we should probably open a bug if we can't test that kind of stuff: we should be able to!

:Ms2ger (he/him; ⌚ UTC+1/+2)

Comment 11

•

14 years ago

Comment on attachment 546960 [details] [diff] [review] Corrected the problems >--- /dev/null >+++ b/xpcom/ds/CharTokenizer.h >+#ifndef __CharTokenizer_h >+#define __CharTokenizer_h Please make that mozilla_CharTokenizer_h >+#endif /* __CharTokenizer_h */ >+ >+ And get rid of these empty lines. >diff --git a/xpcom/ds/Makefile.in b/xpcom/ds/Makefile.in >--- a/xpcom/ds/Makefile.in >+++ b/xpcom/ds/Makefile.in >@@ -108,16 +108,17 @@ EXPORTS = \ >+ CharTokenizer.h \ Please add it to EXPORTS_mozilla instead

Shawn Gong

Assignee

Comment 12

•

14 years ago

Attached patch Fixes according to Ms2ger's comments — Details — Splinter Review

Attachment #546960 - Attachment is obsolete: true

Shawn Gong

Assignee

Comment 13

•

14 years ago

(In reply to comment #10) > Why is it troublesome? I mean, we should probably open a bug if we can't > test that kind of stuff: we should be able to! Will look into it later

:Ms2ger (he/him; ⌚ UTC+1/+2)

Updated

•

14 years ago

Assignee: nobody → shawn

Blocks: 668680

OS: Mac OS X → All

Hardware: x86 → All

Version: unspecified → Trunk

Shawn Gong

Assignee

Comment 14

•

14 years ago

Comment on attachment 548290 [details] [diff] [review] Fixes according to Ms2ger's comments Review+ by sicking

Attachment #548290 - Flags: review+

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 15

•

14 years ago

Checked in to inbound: http://hg.mozilla.org/integration/mozilla-inbound/rev/edb32fa29bca Yay!

(no longer active)

Comment 16

•

14 years ago

http://hg.mozilla.org/mozilla-central/rev/edb32fa29bca

Status: NEW → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla9

Daniel Holbert [:dholbert]

Comment 17

•

13 years ago

(In reply to Shawn Gong from comment #0) > Currently we only have nsCharSeperatedTokenizer and nsWhitespaceTokenizer, > which have their specific use cases. > > The nsCharTokenizer can take any char as a delimiter and tokenize the string > based on that. That sounds exactly like what nsCharSeperatedTokenizer does, aside from that nsCharSeperatedTokenizer does whitespace-skipping (which is configurable, since you can pass in a custom "is whitespace" method, and you could pass in a method that unconditionally returns false to prevent whitespace-skipping). Aside from that, are there any other things that distinguish nsCharTokenizer from nsCharSeperatedTokenizer?

Daniel Holbert [:dholbert]

Comment 18

•

13 years ago

Actually, it looks like CharTokenizer isn't used at all, aside from a chunk of recently-added code in SVGFragmentIdentifier.cpp (as well as in the awaiting-review patch in bug 668680). I suspect both of those could easily use a variant of nsCharSeperatedTokenizer, and this class CharTokenizer class could go away... I'm just worried that having two nearly-identically-named classes with different impls for the same use-case will cause confusion & increased surface for bugs to crop up. Happy to file a new bug on that; I'm just commenting here to check if I'm misunderstanding the use-cases / purpose of CharTokenizer, before doing that.

Nobody; OK to take it and work on it

Updated

•

6 years ago

Component: DOM → DOM: Core & HTML

You need to log in before you can comment on or make changes to this bug.