Closed Bug 236476 Opened 20 years ago Closed 20 years ago

Change nsIHTMLContent::GetHTMLAttribute to nsIContent::GetParsedAttr

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: sicking, Assigned: sicking)

References

()

Details

Attachments

(1 file)

First off we should stop using nsHTMLValue (in favour of nsAttrValue) which
means that we have to change GetHTMLAttribute.
Second GetHTMLAttribute lives in nsIHTMLContent which I would like to kill to
avoid all the extra QI-ing it causes.

So I suggest a signature like:
PRBool GetParsedAttr(PRInt32 aNamespaceID, nsIAtom* aName, nsAttrValue& aResult);
or
nsAttrValue* GetParsedAttr(PRInt32 aNamespaceID, nsIAtom* aName);

While the latter should be faster, it also forces all elements to store their
attribute-values as nsAttrValues.

While I can't think of any elements that couldn't use nsAttrValues, it shouldn't
be that much slower to clone the nsAttrValue. At least once nsAttrValue uses
sharable stringbuffers.
Who are the users of this, really?  Cloning an nsIFoo* or nsCOMArray<nsIFoo>*
is not all that cheap... (refcounting, etc).
There are three groups of users:

* nsFormSubmission. This could/should probably be converted to using GetAttr
* The elements themselfs. This can be changed to using mAttrAndChildArray
  directly, or we could have an internal method in nsGenericElement.
* Layout. This is the tricky one, and probably the one that will force us to
  keep GetHTMLAttribute/GetParsedAttr around. Most likly mozilla/layout/svg could
  have use for it in the future too, so it's not only mozilla/layout/html.
  
OK.  As long as we have a fast path for the stuff the style system needs (eg
GetClasses)... ;)
Blocks: 54542
Attached patch Patch to fixSplinter Review
This patch does three things:

1. Changes the function GetHTMLAttribute to GetParsedAttr on
nsGenericHTMLElement. The new function returns an nsAttrValue* rather then
modifying a provided nsHTMLValue.

2. Makes us store enough information in the nsAttrValue so that we can
serialize enums. Along with the enum value it also stores an index to the
EnumTable that the value came from. That way we have all information needed to
serialize. This means that the entire AttributeToString codepath can be removed
(se nsGenericHTMLElement::GetAttr).

3. Remove nsHTMLValue since it's no longer used after the two above changes are
done.


The only bad thing in this patch is that we no longer can use the hackish
solution for storing relative fontsizes (i.e. size="-4"). We used to store them
as integers flagged as enums so that we could have custom code to convert them
to strings in the AttributeToString codepath.
The patch makes us store them as 'real' enums by creating an enumtable that
contains relative values. The problem with this is that we can't parse just any
relative integer, like '-2719'. I just made the enum table go from -10 to 10.
I'm not sure if there are pages out there that uses bigger numbers then that.
The alternative is to add another type to nsAttrValue. is quite possible, but
seems a bit excessive.
Assignee: general → bugmail
Status: NEW → ASSIGNED
Attachment #171853 - Flags: superreview?(jst)
Attachment #171853 - Flags: review?(bzbarsky)
The HTML 4.01 spec says that the size can't be outside of +/-7:
<http://www.w3.org/TR/REC-html40/present/graphics.html#edef-FONT>
How about, in nsHTMLFontElement::ParseAttribute, we just adjust the input so
that it conforms to that?

I know pages use larger numbers than 10, though... Just as an example:
<http://google.com/search?q=font+size%3D%2220%22>
Comment on attachment 171853 [details] [diff] [review]
Patch to fix

I just noticed that we need to support +/-0 as relative fontsizes too. I added
those to the enumtable.
Yes, that is another option. Note though that most of those pages don't set
size=20 on a font-tag, but on an input-tag. But I don't think that search is
entierly fair since you're not searching the markup, but pages where the markup
actually appears as text.
Vidar, we're talking about size="+20", not size="20".  They're different beasts.

sicking, I'll try to get to this ASAP.  This is nice stuff.
Comment on attachment 171853 [details] [diff] [review]
Patch to fix

>Index: content/base/src/nsAttrValue.cpp

>+class nsCheapStringBufferUtils {

Note that some changes to this landed today (in particular to the HashCode()
function).  You'll probably need to move those over too to get this to compile.

>@@ -312,13 +425,23 @@ nsAttrValue::ToString(nsAString& aResult
>+      EnumTable* table = NS_STATIC_CAST(EnumTable*, sEnumTableArray->
>+          FastElementAt(GetIntInternal() & NS_ATTRVALUE_ENUMTABLEINDEX_MASK));

This screams out for an inline function (maybe called GetEnumTableIndex() or
something?).

>+      while (table->tag) {
>+        if (table->value == val) {
>+          CopyASCIItoUCS2(nsDependentCString(table->tag), aResult);

Is it worth it to AssignASCII here?  That lets you skip the dependent string...
 And if we're _really_ clever we could compute the string lengths at compile
time and store those in the tables and pass them in here.  Probably not worth
it, though, since all these strings are pretty short.

>+          
>+          return;
>+        }
>+        table++;
>+      }
> 
>       break;

Should we assert if we make it out of the loop over the table?	If not, we
should Truncate() before we break or something (but I think an assert here is
reasonable).

> nsAttrValue::ParseEnumValue(const nsAString& aValue,

>+  EnumTable* tableStart = NS_CONST_CAST(EnumTable*, aTable);

Comment that you only need this because nsVoidArray won't deal with const
pointers?

>+      PRInt32 value = (aTable->value << NS_ATTRVALUE_ENUMTABLEINDEX_BITS) +
>+                      index;

Again, it may be nice to do this in an inline function of some sort, but maybe
not...

>Index: content/base/src/nsAttrValue.h
> nsAttrValue::GetEnumValue() const
>+  // We don't need to worry about sign extension here since we're
>+  // retuning an PRInt16 which will cut away the top bits.

"returning".

I'd cast explicitly to PRInt16 in the return statement, actually.  Just to make
this perfectly clear.

>Index: content/html/content/src/nsGenericHTMLElement.cpp

> nsGenericHTMLElement::GetDir(nsAString& aDir)

So we'll return nonempty strings for dir="something_random" now.  Is that a
problem?  What does IE do?

Any reason not to just inline GetParsedAttr() in the header?

>Index: content/html/content/src/nsHTMLFontElement.cpp
>+static const nsAttrValue::EnumTable kRelFontSizeTable[] = {

I do think this is fine as it is (with -0 and +0 added).  Not really worth
adding an extra type just to deal with people who use super-bogus HTML....

>Index: layout/forms/nsFormControlHelper.cpp
> nsFormControlHelper::GetWrapProperty(nsIContent * aContent, 

Why not Truncate() only in the case when it's not an HTML element?

r=bzbarsky with the nits addressed.
Attachment #171853 - Flags: review?(bzbarsky) → review+
> This screams out for an inline function (maybe called GetEnumTableIndex() or
> something?).

Not sure if it's worth it. The code needs the tableindex in exactly one place so
whether that's implemented in ToString or some inline function is about the
same. Most of the "work" is done by the macro anyway.

> Is it worth it to AssignASCII here?  That lets you skip the dependent
> string...

Done

>  And if we're _really_ clever we could compute the string lengths at compile
> time and store those in the tables and pass them in here.  Probably not worth
> it, though, since all these strings are pretty short.

Yeah, we could use some macro magic here. Not sure if it's worth it, and in any
case i'd rather do it separtly since it'll uglify this patch quite a bit.
Ideally I want make some other changes to enum tables so that we won't have to
search for the table-index when parsing.

> Should we assert if we make it out of the loop over the table? If not, we
> should Truncate() before we break or something (but I think an assert here is
> reasonable).

I had actually already added a NS_NOTREACHED locally. I also added one in
ParseEnumValue that right after the call to SetIntValueAndType asserts that
GetEnumValue returns the right thing. 

> > nsGenericHTMLElement::GetDir(nsAString& aDir)
> 
> So we'll return nonempty strings for dir="something_random" now.  Is that a
> problem?  What does IE do?

That's what we do for all other enumerated attributes. And I just check with IE
and that's what it does too.

> Any reason not to just inline GetParsedAttr() in the header?

Yay! Excellent idea. MMmm.. removing functioncall overhead...

> I do think this is fine as it is (with -0 and +0 added).  Not really worth
> adding an extra type just to deal with people who use super-bogus HTML....

Ok. If people file bugs we can always fix it then.

Fixed everything else locally.
Oh, now I see, the question is if it's ok to return "random value" from GetDir
as opposed to GetAttribute("dir"). It would make sense not to I guess, for a lot
of GetFoo attributes we return the actual value rather then what's in the
markup. And IE returns emptystring too. I'll revert that change.
Comment on attachment 171853 [details] [diff] [review]
Patch to fix

sr=jst
Attachment #171853 - Flags: superreview?(jst) → superreview+
Check in, thanks for reviews. I'll file a separate bug on trying to make the
enumtables more efficient.
yeah, and mark fixed as well :)
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Depends on: 314343
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: