Open Bug 565031 Opened 14 years ago Updated 2 years ago

Make is_space() in nsImageMap.cpp HTML5-compliant

Categories

(Core :: DOM: Core & HTML, defect)

defect

Tracking

()

People

(Reporter: hsivonen, Unassigned)

References

()

Details

(Keywords: html5, student-project)

Attachments

(1 file, 1 obsolete file)

is_space() in nsImageMap.cpp accepts characters other than ' ' but shouldn't per HTML5.
http://mxr-test.konigsberg.mozilla.org/mozilla-central/source/layout/generic/nsImageMap.cpp#110
http://www.whatwg.org/specs/web-apps/current-work/#rules-for-parsing-a-list-of-integers

Either the code should change to match HTML5 or the HTML5 spec should change and the code then be changed to match the changed spec.
This code should probably be changed to use nsCharSeparatedTokenizerTemplate<> (with nsContentUtils::IsHTMLWhitespace)
I'm going to do this as a lab with my students this term, so I'll assign to me until one of them has a patch ready.
Assignee: nobody → david.humphrey
Status: NEW → ASSIGNED
Attached patch Fix + test (obsolete) — Splinter Review
This patch changes the coords parser so that it only accepts spaces, commas, and semicolons as per the spec.

Also while fixing this bug, we noticed that letters should be allowed and ignored, which we currently dont do.  This patch doesn't address that, im not sure if you wanted another bug for that.
Attachment #525432 - Flags: review?(hsivonen)
Moving from David to David :)
Assignee: david.humphrey → david.c.seifried
Comment on attachment 525432 [details] [diff] [review]
Fix + test

> inline PRBool
> is_space(char c)
> {
>-  return (c == ' ' ||
>-          c == '\f' ||
>-          c == '\n' ||
>-          c == '\r' ||
>-          c == '\t' ||
>-          c == '\v');
>+  return c == ' ';
> }

I think it isn't useful to keep around the is_space function when it has degenerated to return c == ' ';. Instead, the invocation sites could check == ' ' directly or the function could be changed to inline PRBool isSeparator(char c) {return c == ' ' || c == ',' || c == ';'; }.

>         if (*tptr == ',')
>         {
>           if (!has_comma)
>           {
>             has_comma = PR_TRUE;
>           }

Why is this special treatment of the comma left in? From http://www.whatwg.org/specs/web-apps/current-work/#rules-for-parsing-a-list-of-integers it seems to me that the space, comma and semicolon characters are always treated equivalently with each other in the algorithm.

The test case looks good to me.

Note that I'm not a module peer for this code, so by the rules, getting review from me doesn't allow you to land. I suggest requesting further review from :bz or :smaug.
Attachment #525432 - Flags: review?(hsivonen) → review-
Attached patch Fix + test 2Splinter Review
Changed it so is_space accepts spaces, commas, and semicolons now, also changed the name of the function to is_seperator.  Also changed it so that commas, spaces, and semicolons should be treated equally.  Requested further review from :bz as suggested.
Attachment #526033 - Flags: review?(bzbarsky)
Henri, you're as well qualified to review this as I am; possibly more so.  If you feel comfortable doing that, I'll gladly delegate the review to you.
Attachment #525432 - Attachment is obsolete: true
Comment on attachment 526033 [details] [diff] [review]
Fix + test 2

(In reply to comment #8)
> Henri, you're as well qualified to review this as I am; possibly more so.  If
> you feel comfortable doing that, I'll gladly delegate the review to you.

OK. I'll handle this and subsequent review rounds with the power delegated to me by bz.

First of all, sorry that I didn't initially review the ParseCoords method as a whole, which leads to more work on your part. Although the bug summary talks about is_space() only, I think it doesn't make sense to leave the code that calls the helper function HTML5-incompliant, so I'll review the entire ParseCoords method as patched, since it's difficult to review just the diffs. I realize that I will end up complaining about pre-existing design decisions, but this is a good chance to fix those.

>void Area::ParseCoords(const nsAString& aSpec)
>{
>  char* cp = ToNewCString(aSpec);

Converting from a UTF-16 string into a C string seems unfortunate. It seems to me that the purpose of the conversion is to be able to call atoi(). I note that nsAString has a method called ToInteger. However, it seems to me that it would be even easier to just implement the algorithm from the spec step-by-step, including doing the integer parsing as part of the algorithm without using either ToInteger() or atoi().

>  if (cp) {

Having the entire method inside an if like this sucks.
if (!cp) {
  return;
}
would be more appropriate but won't be necessary if you work with aSpec directly.

>    if (*cp == '\0')
>    {
>      return;
>    }

This and other early returns seem te leak the C string pointed to by cp, but this won't be a problem if you get rid of cp and work with aSpec directly.

>    /*
>     * Skip beginning whitespace, all whitespace is empty list.
>     */
>    n_str = cp;
>    while (*n_str == ' ')
>    {
>      n_str++;
>    }

This step is not correct per the algorithm in the spec:
http://www.whatwg.org/specs/web-apps/current-work/#rules-for-parsing-a-list-of-integers

Here you should be performing step 4 of the algorithm, which advances on any of space, comma or semicolon.

>    /*
>     * Make a pass where any two numbers separated by just whitespace
>     * are given a comma separator.  Count entries while passing.
>     */

I recommend getting rid of the separator writing appoarch. I believe it's easier to rewrite this whole method by following the spec step-by-step (except using loop structures instead of goto, of course, even though the spec uses goto).

>    /*
>     * Allocate space for the coordinate array.
>     */
>    value_list = new nscoord[cnt];
>    if (!value_list)
>    {
>      return;
>    }

Instead of having one loop for counting the coordinates for allocation purposes and another for parsing, I suggest having one instance of the implementation of the spec's algorithm where you omit assignments to value_list[i] if value_list hasn't been allocated yet and at the end exit if value_list has been allocated or allocate it go back to the start of the algorithm otherwise.
Attachment #526033 - Flags: review?(bzbarsky) → review-
(In reply to comment #9)
> Comment on attachment 526033 [details] [diff] [review]
> >    /*
> >     * Allocate space for the coordinate array.
> >     */
> >    value_list = new nscoord[cnt];
> >    if (!value_list)
> >    {
> >      return;
> >    }

And while you're here, please drop the |if (!value_list)| block, |new| can't fail.
This allocation point should probably use a fallible new call, fwiw.
Thanks for the feedback.  From what I understand, you are suggesting the function be re-written according to the spec?
(In reply to comment #12)
> Thanks for the feedback.  From what I understand, you are suggesting the
> function be re-written according to the spec?

Exactly.
Alright sounds good, ill get on it.
Component: DOM → DOM: Core & HTML

The bug assignee is inactive on Bugzilla, so the assignee is being reset.

Assignee: david.c.seifried → nobody
Status: ASSIGNED → NEW
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: