236476 - Change nsIHTMLContent::GetHTMLAttribute to nsIContent::GetParsedAttr

Jonas Sicking (:sicking) No longer reading bugmail consistently

Assignee

Description

•

22 years ago

First off we should stop using nsHTMLValue (in favour of nsAttrValue) which means that we have to change GetHTMLAttribute. Second GetHTMLAttribute lives in nsIHTMLContent which I would like to kill to avoid all the extra QI-ing it causes. So I suggest a signature like: PRBool GetParsedAttr(PRInt32 aNamespaceID, nsIAtom* aName, nsAttrValue& aResult); or nsAttrValue* GetParsedAttr(PRInt32 aNamespaceID, nsIAtom* aName); While the latter should be faster, it also forces all elements to store their attribute-values as nsAttrValues. While I can't think of any elements that couldn't use nsAttrValues, it shouldn't be that much slower to clone the nsAttrValue. At least once nsAttrValue uses sharable stringbuffers.

Boris Zbarsky [:bzbarsky]

Comment 1

•

22 years ago

Who are the users of this, really? Cloning an nsIFoo* or nsCOMArray<nsIFoo>* is not all that cheap... (refcounting, etc).

Jonas Sicking (:sicking) No longer reading bugmail consistently

Assignee

Comment 2

•

22 years ago

There are three groups of users: * nsFormSubmission. This could/should probably be converted to using GetAttr * The elements themselfs. This can be changed to using mAttrAndChildArray directly, or we could have an internal method in nsGenericElement. * Layout. This is the tricky one, and probably the one that will force us to keep GetHTMLAttribute/GetParsedAttr around. Most likly mozilla/layout/svg could have use for it in the future too, so it's not only mozilla/layout/html.

URL: http://lxr.mozilla.org/mozilla/ident?...

Jonas Sicking (:sicking) No longer reading bugmail consistently

Assignee

Updated

•

22 years ago

Blocks: 232706

Boris Zbarsky [:bzbarsky]

Comment 3

•

22 years ago

OK. As long as we have a fast path for the stuff the style system needs (eg GetClasses)... ;)

Boris Zbarsky [:bzbarsky]

Updated

•

21 years ago

Blocks: 54542

Jonas Sicking (:sicking) No longer reading bugmail consistently

Assignee

Comment 4

•

21 years ago

Attached patch Patch to fix — Details — Splinter Review

This patch does three things: 1. Changes the function GetHTMLAttribute to GetParsedAttr on nsGenericHTMLElement. The new function returns an nsAttrValue* rather then modifying a provided nsHTMLValue. 2. Makes us store enough information in the nsAttrValue so that we can serialize enums. Along with the enum value it also stores an index to the EnumTable that the value came from. That way we have all information needed to serialize. This means that the entire AttributeToString codepath can be removed (se nsGenericHTMLElement::GetAttr). 3. Remove nsHTMLValue since it's no longer used after the two above changes are done. The only bad thing in this patch is that we no longer can use the hackish solution for storing relative fontsizes (i.e. size="-4"). We used to store them as integers flagged as enums so that we could have custom code to convert them to strings in the AttributeToString codepath. The patch makes us store them as 'real' enums by creating an enumtable that contains relative values. The problem with this is that we can't parse just any relative integer, like '-2719'. I just made the enum table go from -10 to 10. I'm not sure if there are pages out there that uses bigger numbers then that. The alternative is to add another type to nsAttrValue. is quite possible, but seems a bit excessive.

Jonas Sicking (:sicking) No longer reading bugmail consistently

Assignee

Updated

•

21 years ago

Assignee: general → bugmail

Status: NEW → ASSIGNED

Jonas Sicking (:sicking) No longer reading bugmail consistently

Assignee

Updated

•

21 years ago

Attachment #171853 - Flags: superreview?(jst)

Attachment #171853 - Flags: review?(bzbarsky)

Vidar Haarr (not reading bugmail)

Comment 5

•

21 years ago

The HTML 4.01 spec says that the size can't be outside of +/-7: <http://www.w3.org/TR/REC-html40/present/graphics.html#edef-FONT> How about, in nsHTMLFontElement::ParseAttribute, we just adjust the input so that it conforms to that? I know pages use larger numbers than 10, though... Just as an example: <http://google.com/search?q=font+size%3D%2220%22>

Jonas Sicking (:sicking) No longer reading bugmail consistently

Assignee

Comment 6

•

21 years ago

Comment on attachment 171853 [details] [diff] [review] Patch to fix I just noticed that we need to support +/-0 as relative fontsizes too. I added those to the enumtable.

Jonas Sicking (:sicking) No longer reading bugmail consistently

Assignee

Comment 7

•

21 years ago

Yes, that is another option. Note though that most of those pages don't set size=20 on a font-tag, but on an input-tag. But I don't think that search is entierly fair since you're not searching the markup, but pages where the markup actually appears as text.

Boris Zbarsky [:bzbarsky]

Comment 8

•

21 years ago

Vidar, we're talking about size="+20", not size="20". They're different beasts. sicking, I'll try to get to this ASAP. This is nice stuff.

Boris Zbarsky [:bzbarsky]

Comment 9

•

21 years ago

Comment on attachment 171853 [details] [diff] [review] Patch to fix >Index: content/base/src/nsAttrValue.cpp >+class nsCheapStringBufferUtils { Note that some changes to this landed today (in particular to the HashCode() function). You'll probably need to move those over too to get this to compile. >@@ -312,13 +425,23 @@ nsAttrValue::ToString(nsAString& aResult >+ EnumTable* table = NS_STATIC_CAST(EnumTable*, sEnumTableArray-> >+ FastElementAt(GetIntInternal() & NS_ATTRVALUE_ENUMTABLEINDEX_MASK)); This screams out for an inline function (maybe called GetEnumTableIndex() or something?). >+ while (table->tag) { >+ if (table->value == val) { >+ CopyASCIItoUCS2(nsDependentCString(table->tag), aResult); Is it worth it to AssignASCII here? That lets you skip the dependent string... And if we're _really_ clever we could compute the string lengths at compile time and store those in the tables and pass them in here. Probably not worth it, though, since all these strings are pretty short. >+ >+ return; >+ } >+ table++; >+ } > > break; Should we assert if we make it out of the loop over the table? If not, we should Truncate() before we break or something (but I think an assert here is reasonable). > nsAttrValue::ParseEnumValue(const nsAString& aValue, >+ EnumTable* tableStart = NS_CONST_CAST(EnumTable*, aTable); Comment that you only need this because nsVoidArray won't deal with const pointers? >+ PRInt32 value = (aTable->value << NS_ATTRVALUE_ENUMTABLEINDEX_BITS) + >+ index; Again, it may be nice to do this in an inline function of some sort, but maybe not... >Index: content/base/src/nsAttrValue.h > nsAttrValue::GetEnumValue() const >+ // We don't need to worry about sign extension here since we're >+ // retuning an PRInt16 which will cut away the top bits. "returning". I'd cast explicitly to PRInt16 in the return statement, actually. Just to make this perfectly clear. >Index: content/html/content/src/nsGenericHTMLElement.cpp > nsGenericHTMLElement::GetDir(nsAString& aDir) So we'll return nonempty strings for dir="something_random" now. Is that a problem? What does IE do? Any reason not to just inline GetParsedAttr() in the header? >Index: content/html/content/src/nsHTMLFontElement.cpp >+static const nsAttrValue::EnumTable kRelFontSizeTable[] = { I do think this is fine as it is (with -0 and +0 added). Not really worth adding an extra type just to deal with people who use super-bogus HTML.... >Index: layout/forms/nsFormControlHelper.cpp > nsFormControlHelper::GetWrapProperty(nsIContent * aContent, Why not Truncate() only in the case when it's not an HTML element? r=bzbarsky with the nits addressed.

Attachment #171853 - Flags: review?(bzbarsky) → review+

Jonas Sicking (:sicking) No longer reading bugmail consistently

Assignee

Comment 10

•

21 years ago

> This screams out for an inline function (maybe called GetEnumTableIndex() or > something?). Not sure if it's worth it. The code needs the tableindex in exactly one place so whether that's implemented in ToString or some inline function is about the same. Most of the "work" is done by the macro anyway. > Is it worth it to AssignASCII here? That lets you skip the dependent > string... Done > And if we're _really_ clever we could compute the string lengths at compile > time and store those in the tables and pass them in here. Probably not worth > it, though, since all these strings are pretty short. Yeah, we could use some macro magic here. Not sure if it's worth it, and in any case i'd rather do it separtly since it'll uglify this patch quite a bit. Ideally I want make some other changes to enum tables so that we won't have to search for the table-index when parsing. > Should we assert if we make it out of the loop over the table? If not, we > should Truncate() before we break or something (but I think an assert here is > reasonable). I had actually already added a NS_NOTREACHED locally. I also added one in ParseEnumValue that right after the call to SetIntValueAndType asserts that GetEnumValue returns the right thing. > > nsGenericHTMLElement::GetDir(nsAString& aDir) > > So we'll return nonempty strings for dir="something_random" now. Is that a > problem? What does IE do? That's what we do for all other enumerated attributes. And I just check with IE and that's what it does too. > Any reason not to just inline GetParsedAttr() in the header? Yay! Excellent idea. MMmm.. removing functioncall overhead... > I do think this is fine as it is (with -0 and +0 added). Not really worth > adding an extra type just to deal with people who use super-bogus HTML.... Ok. If people file bugs we can always fix it then. Fixed everything else locally.

Jonas Sicking (:sicking) No longer reading bugmail consistently

Assignee

Comment 11

•

21 years ago

Oh, now I see, the question is if it's ok to return "random value" from GetDir as opposed to GetAttribute("dir"). It would make sense not to I guess, for a lot of GetFoo attributes we return the actual value rather then what's in the markup. And IE returns emptystring too. I'll revert that change.

Johnny Stenback (:jst)

Comment 12

•

21 years ago

Comment on attachment 171853 [details] [diff] [review] Patch to fix sr=jst

Attachment #171853 - Flags: superreview?(jst) → superreview+

Jonas Sicking (:sicking) No longer reading bugmail consistently

Assignee

Comment 13

•

21 years ago

Check in, thanks for reviews. I'll file a separate bug on trying to make the enumtables more efficient.

Jonas Sicking (:sicking) No longer reading bugmail consistently

Assignee

Comment 14

•

21 years ago

yeah, and mark fixed as well :)

Status: ASSIGNED → RESOLVED

Closed: 21 years ago

Resolution: --- → FIXED

Martijn Wargers (dead)

Updated

•

20 years ago

Depends on: 314343

Nobody; OK to take it and work on it

Updated

•

7 years ago

Component: DOM → DOM: Core & HTML