Closed Bug 286300 Opened 19 years ago Closed 19 years ago

Clean up HTML tags enum and related code

Categories

(Core :: DOM: HTML Parser, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: peterv, Assigned: peterv)

References

Details

Attachments

(2 files, 1 obsolete file)

Spun of from bug 285166.

We probably should fix nsHTMLTags::LookupTag (and thus HTMLAtomTagToId) to not
return eHTMLTag_text.

LookupTag and CaseSensitiveLookupTag need to be consistent in what they return
for unknown tags.

There's a bunch of redundant methods related to the tags enum in nsIDTD that are
already available through the nsParserService.

Fix callers to not use Tag() to check for textnodes (switch to IsContentOfType).

Remove __moz_text and __moz_comment. Make nsXMLCDataSection and nsTextNode
return #cdata-section and #text from Tag().

DeCOMtaminate a bunch of methods in the nsParserService.
Assignee: parser → peterv
Status: NEW → ASSIGNED
Attached patch v1 (obsolete) — Splinter Review
Attachment #178508 - Flags: superreview?(bzbarsky)
Attachment #178508 - Flags: review?(mrbkap)
Attached patch CleanupSplinter Review
Since we're allowing the creation of elements with nodeName "#text", ...
checking the nodeName is not a reliable way to check for textnodes.
Attachment #178510 - Flags: superreview?(jst)
Attachment #178510 - Flags: review?(jst)
Comment on attachment 178510 [details] [diff] [review]
Cleanup

- In nsMsgCompose::TagConvertible():

+    PRInt16 nodeType;
+    rv = node->GetNodeType(&nodeType);
+    if (NS_FAILED(rv))
+      return rv;
+
+    if (nodeType == nsIDOMNode::TEXT_NODE)
+    {
+      *_retval = nsIMsgCompConvertible::Plain;
+      return NS_OK;
+    }
+
     nsAutoString element;
     rv = node->GetNodeName(element);
     if (NS_FAILED(rv))
@@ -4246,7 +4257,6 @@ nsresult nsMsgCompose::TagConvertible(ns

     nsCOMPtr<nsIDOMNode> pItem;
     if      (
-	       element.LowerCaseEqualsLiteral("#text") ||
	       element.LowerCaseEqualsLiteral("br") ||
...

In stead of another early return for the text node case, you could put the
nodetype check where the #text string compare was, to keep things grouped the
way they were...

Either way, r+sr=jst
Attachment #178510 - Flags: superreview?(jst)
Attachment #178510 - Flags: superreview+
Attachment #178510 - Flags: review?(jst)
Attachment #178510 - Flags: review+
Comment on attachment 178508 [details] [diff] [review]
v1

>Index: content/base/src/mozSanitizingSerializer.cpp
>@@ -766,31 +761,27 @@ mozSanitizingHTMLSerializer::ParseTagPre

>+  PRInt32 bracket = tagpref.FindChar('(');
>+  NS_ConvertUTF8toUTF16 tag(tagpref.get(), bracket);

If bracket == -1, this will do very very bad things (like try to allocate
PR_UINT32_MAX bytes of memory).  So this change is wrong.

Also, could you #ifdef DEBUG the printfs or file a followup bug on that or
something?

As for the rest....

>Index: parser/htmlparser/public/nsIParserService.h

>+  virtual PRInt32 HTMLAtomTagToId(nsIAtom* aAtom) const = 0;
>+
>+  virtual PRInt32 HTMLCaseSensitiveAtomTagToId(nsIAtom* aAtom) 
>+  virtual PRInt32 HTMLStringTagToId(const nsAString& aTag) const = 0;
>+  virtual const PRUnichar *HTMLIdToStringTag(PRInt32 aId) const = 0;
>+  virtual nsIAtom *HTMLIdToAtomTag(PRInt32 aId) const = 0;

Could you please add comments documenting the exact behavior of these methods? 
Once you do that, I can try to actually review this patch; as things stand I
can't tell whether the callers (or the implementations for that matter) are
correct or not.
Attachment #178508 - Flags: superreview?(bzbarsky) → superreview-
Attached patch v1.1Splinter Review
Attachment #178508 - Attachment is obsolete: true
Attachment #180952 - Flags: superreview?(bzbarsky)
Attachment #180952 - Flags: review?(mrbkap)
Attachment #178508 - Flags: review?(mrbkap)
Comment on attachment 180952 [details] [diff] [review]
v1.1

>   // Parsing tag
>-  PRInt32 bracket = tagpref.Find("(");
>-  nsCAutoString tag = tagpref;
>-  if (bracket != kNotFound)
>-    tag.Truncate(bracket);
>-  if (tag.Equals(""))
>+  PRInt32 bracket = tagpref.FindChar('(');
>+  if (bracket == 0)

Don't you need to check against kNotFound also?

Other than that, this looks good, so marking r=mrbkap
Attachment #180952 - Flags: review?(mrbkap) → review+
Comment on attachment 180952 [details] [diff] [review]
v1.1

>Index: content/base/src/mozSanitizingSerializer.cpp

>@@ -474,8 +471,7 @@ mozSanitizingHTMLSerializer::DoOpenConta

>+    const PRUnichar* tag_name = parserService->HTMLIdToStringTag(aTag);
>     NS_ENSURE_TRUE(tag_name, NS_ERROR_INVALID_POINTER);

Won't this make DoOpenContainer throw on userdefined tags?  Is that really
desired here?  I guess that's what the old code did too, eh?

>@@ -766,31 +761,29 @@ mozSanitizingHTMLSerializer::ParseTagPre

>+  nsAutoString tag;
>+  CopyUTF8toUTF16(StringHead(tagpref, bracket), tag);

This has exactly the same problem as the first change you made to this code. 
The second arg to the StringHead method is a PRUint32.	So this will try to
read 4GB of data from |tagpref|.

>-    printf(" unknown tag <%s>, won't add.\n", tag.get());
>+    printf(" unknown tag <%s>, won't add.\n",
>+           NS_ConvertUTF16toUTF8(tag).get());

I'm assuming you're filing a followup bug on nuking these in release builds,
right?

sr=bzbarsky with that bracket issue addressed.	You're just going to need to
explicitly test for kNotFound and do something reasonable there.
Attachment #180952 - Flags: superreview?(bzbarsky) → superreview+
nsTDependentSubstring::Rebind does |mLength = NS_MIN(length, strLength -
startPos);| (length being the argument and strLength being the actual length of
the string) and nsTDependentSubstring's constructor is defined as 
|nsTDependentSubstring_CharT( const substring_type& str, PRUint32 startPos,
PRUint32 length = size_type(-1) )|, so I think it should be ok now (StringHead
does |return nsTDependentSubstring_CharT(str, 0, count);|).
Oh, ok.  Looks good then.
(In reply to comment #7)
> Won't this make DoOpenContainer throw on userdefined tags?  Is that really
> desired here?  I guess that's what the old code did too, eh?

Yeah, it seems wrong but that's why it scares me to change it. Who knows what'll
fall out of the closet.

> >-    printf(" unknown tag <%s>, won't add.\n", tag.get());
> >+    printf(" unknown tag <%s>, won't add.\n",
> >+           NS_ConvertUTF16toUTF8(tag).get());
> 
> I'm assuming you're filing a followup bug on nuking these in release builds,
> right?

See
http://lxr.mozilla.org/seamonkey/source/content/base/src/mozSanitizingSerializer.cpp#71
:-(. Filed bug 291612.
Attachment #178510 - Flags: approval1.8b3?
Comment on attachment 180952 [details] [diff] [review]
v1.1

DeCOMtamination and cleanup of parser code. Should be fairly low-risk.
Attachment #180952 - Flags: approval1.8b3?
Attachment #178510 - Flags: approval1.8b3? → approval1.8b3+
Attachment #180952 - Flags: approval1.8b3? → approval1.8b3+
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Blocks: 299343
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: