Open Bug 794554 Opened 12 years ago Updated 2 years ago

Invalid characters in MathML 'scriptlevel' attribute values are ignored.

Categories

(Core :: MathML, defect)

x86_64
Linux
defect

Tracking

()

People

(Reporter: rfw2nd, Unassigned)

Details

Attachments

(1 file)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:15.0) Gecko/20100101 Firefox/15.0.1
Build ID: 20120907231657

Steps to reproduce:

Created some bad MathML markup in an HTML file, and opened it with firefox.
  <!-- Variant 6:  Bad scriptlevel. -->
  <math scriptlevel="2f2">
    <mi>6</mi>
  </math><BR/>
  <!-- Relative Scriptlevel -->
   <math scriptlevel="+1">
    <mi>R</mi>    
   </math><BR/>
  <!--Variant 7: Really bad scriptlevel. -->
   <math scriptlevel="+ghijklmnopqrstwxyz3">
    <mi>7</mi>
   </math><BR/>
   <!--Variant 8: Bad relative -->
   <math scriptlevel="+-*#$&#(*&$*(3">
    <mi>8</mi>
   </math>



Actual results:

The MathML was rendered as if the invalid characters had never existed.

Variant 6 is the only one handled correctly.  Letters after f and special characters are ignored.  Happened in trunk & firefox 15.


Expected results:

The scriptlevel attribute should have been ignored, as in Variant 6, because the values are filled with many invalid characters.
Component: Untriaged → MathML
Product: Firefox → Core
scriptlevel is parsed here:

http://mxr.mozilla.org/mozilla-central/source/content/mathml/content/src/nsMathMLElement.cpp#443

I'm wondering how ToInteger handles the values you mention.
Looks like the code path goes through to PR_sscanf:

and this also modifies it:
  NS_ConvertUTF16toUTF8 narrow(*this);

PR_sscanf(narrow.get(), fmt, &result)  which seems to be the source of the problem


Maybe we should make the code use ParseNumericValue?  Rather than modify nsString::toInteger which could break things somewhere else?
I think we will have similar issues with scriptsizemultiplier and (on the layout side) maction@selection, mtable@align, mtable@rowspan, mtable@colspan. I haven't checked in details if there are more. 

ParseNumericValue parses length attribute and can be configured a bit with some flags PARSE_ALLOW_UNITLESS and PARSE_ALLOW_NEGATIVE. So if we want to use it to parse integers or floats without allowing for example namedspace or units, that will need some adaptation. We can also add separate parsing functions.

Here are the syntaxes mentioned in the spec:
http://www.w3.org/TR/MathML/chapter2.html#id.2.1.5.1
Status: UNCONFIRMED → NEW
Ever confirmed: true
ParseNumericValue scans the whole string, before sending it off to nsString::ToFloat() for parsing.  Curiously, it seems to check for the condition of more than one dot in the numeric value, something which technically should actually be within the scope of nsString::ToFloat().

We could do the same thing here, just check for invalid characters before parsing. Performance-wise it amounts to extra O(n) operations, but I don't see an issue since we're already dealing with that every time we call ParseNumericValue.

Since it's a problem spread across multiple attributes, we could just define a CheckNumericValue function that will return false when it hits a character that shouldn't be allowed.  Maybe a couple of flags CHECK_INT, CHECK_FLOAT to specify whether to allow things like dots.
(In reply to rfw2nd from comment #4)
> Since it's a problem spread across multiple attributes, we could just define
> a CheckNumericValue function that will return false when it hits a character
> that shouldn't be allowed.  Maybe a couple of flags CHECK_INT, CHECK_FLOAT
> to specify whether to allow things like dots.

(ParseNumericValue should really be called ParseLength now that it matches the MathML's "length" definition...) What we want here is a helper function to parse "unsigned-integer", "positive-integer", "integer", "unsigned-number" and "number" (we need to check precisely what our code base is using). That could be a continuation of bug 553917 i.e. we could add ReportParseError in this function.
Could this be lumped in with some other changes too?  I was going to see about doing a refactor of nsMathMLElement to extract some parsing methods from MapMathMLAttributesInto:

ParseColorValue (which delegates to nsAttrValue)
ParseFontSizeValue ("large"|"small"|etc...)

Perhaps that refactor can integrate these changes as well:

ParseNumericValue --> ParseLengthValue
and the creation of the new ParseNumberValue & ParseIntValue to handle the parsing of numbers.  

Since we have so many subtypes, for Unsigned Number:  PARSE_ALLOW_NEGATIVE can be used.
I noticed that the scriptlevel parsing block has some "cheesy" (in the words of the comment, not mine) code that sets the value to a float or an integer depending upon whether it is a relative value (has +/- in the beginning) or a specific value.  If this can be made into convention, we may have one additional method or flag for those situations.
(In reply to rfw2nd from comment #6)
> Could this be lumped in with some other changes too?

Sometimes we do that, but see here for the general rule:

https://developer.mozilla.org/en-US/docs/What_to_do_and_what_not_to_do_in_Bugzilla#Changing_the_bug_information_fields

It seems that you are proposing a lot of changes different to the initial bug report, so maybe you should open a new bug with a clear description of the changes. Perhaps it's best to wait for bug 553917 to be fixed before doing that. Also, note the work in bug 411227.

BTW, the page mentioned above describes how to get Bugzilla's editbug privilege if you need them:

https://developer.mozilla.org/en-US/docs/What_to_do_and_what_not_to_do_in_Bugzilla#Getting.2FUpgrading_Bugzilla_privileges
Trying to track this bug down is proving to be tough!  I started from ToInteger-- just looking it over; it seems that everything gets delegated to PR_sscanf, which in turn gets delegated to Convert.. then to GetInt... 

The exact behavior is that it ignores all of the invalid input, and converts the first number, then everything else is ignored.  

I noticed that PR_sscanf isn't tested, so I'll make some unit tests.
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: