If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

[HTML5] Make HTML5 parser internals not hold nsIContent or regular dynamic atoms

RESOLVED FIXED

Status

()

Core
HTML: Parser
RESOLVED FIXED
8 years ago
8 years ago

People

(Reporter: hsivonen, Assigned: hsivonen)

Tracking

(Blocks: 1 bug)

Trunk
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments, 6 obsolete attachments)

(Assignee)

Description

8 years ago
The tokenizer and tree builder must not hold main-thread-only objects like nsIContent and nsAtomImpl.
(Assignee)

Updated

8 years ago
Blocks: 515941
(Assignee)

Comment 1

8 years ago
Created attachment 400031 [details] [diff] [review]
Change the memory management regime for atoms, nodes and attributes
(Assignee)

Comment 2

8 years ago
Created attachment 400756 [details] [diff] [review]
Change the memory management regime for atoms, nodes and attributes, v2
Attachment #400031 - Attachment is obsolete: true
(Assignee)

Updated

8 years ago
Attachment #400756 - Flags: review?(bnewman)
(Assignee)

Comment 3

8 years ago
Comment on attachment 400756 [details] [diff] [review]
Change the memory management regime for atoms, nodes and attributes, v2

This is ready for review, except changes over in bug 514661 will affect how exactly this patch will call the scoped atom table stuff. However, those changes shouldn't affect the substance of this patch.
Does this patch depend on any others?  It's not applying cleanly for me, even on top of 514661.  Maybe I'm just confused.
Keywords: 4xp
(Assignee)

Comment 5

8 years ago
It depends on bug 499642, bug 514661 and bug 515142. Bug 515142 is a spec correctness change to the Java bits, so I'm assuming it can go in as rs=sicking.
Depends on: 499642, 515142
My comments:

diff --git a/parser/html/nsHtml5Portability.cpp b/parser/html/nsHtml5Portability.cpp
--- a/parser/html/nsHtml5Portability.cpp
+++ b/parser/html/nsHtml5Portability.cpp
@@ -115,20 +115,8 @@ nsHtml5Portability::releaseLocal(nsIAtom
 {
   NS_IF_RELEASE(local);
 }
 
-// XXX Useless code
-void
-nsHtml5Portability::retainElement(nsIContent** element)
-{
-}
-
-// XXX Useless code
-void
-nsHtml5Portability::releaseElement(nsIContent** element)
-{
-}
-
 PRBool
 nsHtml5Portability::localEqualsBuffer(nsIAtom* local, PRUnichar* buf, PRInt32 offset, PRInt32 length)
 {
   return local->Equals(nsDependentSubstring(buf + offset, buf + offset + length));
diff --git a/parser/html/nsHtml5Portability.h b/parser/html/nsHtml5Portability.h
--- a/parser/html/nsHtml5Portability.h
+++ b/parser/html/nsHtml5Portability.h
@@ -68,10 +68,11 @@ class nsHtml5Portability
     static jArray<PRUnichar,PRInt32> newCharArrayFromString(nsString* string);
     static void releaseString(nsString* str);
     static void retainLocal(nsIAtom* local);
     static void releaseLocal(nsIAtom* local);
-    static void retainElement(nsIContent** elt);
-    static void releaseElement(nsIContent** elt);
+    // Advertise the uselessness here?
+    static void retainElement(nsIContent** elt) {}
+    static void releaseElement(nsIContent** elt) {}
     static PRBool localEqualsBuffer(nsIAtom* local, PRUnichar* buf, PRInt32 offset, PRInt32 length);
     static PRBool lowerCaseLiteralIsPrefixOfIgnoreAsciiCaseString(const char* lowerCaseLiteral, nsString* string);
     static PRBool lowerCaseLiteralEqualsIgnoreAsciiCaseString(const char* lowerCaseLiteral, nsString* string);
     static PRBool literalEqualsString(const char* literal, nsString* string);
diff --git a/parser/html/nsHtml5TreeBuilder.cpp b/parser/html/nsHtml5TreeBuilder.cpp
--- a/parser/html/nsHtml5TreeBuilder.cpp
+++ b/parser/html/nsHtml5TreeBuilder.cpp
@@ -617,8 +617,9 @@ nsHtml5TreeBuilder::startTag(nsHtml5Elem
                   selfClosing = PR_FALSE;
                 } else {
                   appendToCurrentNodeAndPushElementMayFosterCamelCase(currNs, elementName, attributes);
                 }
+                // Why do these assignments not use nsHtml5HtmlAttributes::EMPTY_ATTRIBUTES?
                 attributes = nsnull;
                 goto starttagloop_end;
               } else {
                 attributes->adjustForMath();
diff --git a/parser/html/nsHtml5TreeOpExecutor.cpp b/parser/html/nsHtml5TreeOpExecutor.cpp
--- a/parser/html/nsHtml5TreeOpExecutor.cpp
+++ b/parser/html/nsHtml5TreeOpExecutor.cpp
@@ -75,8 +75,10 @@ NS_IMPL_RELEASE_INHERITED(nsHtml5TreeOpE
 
 NS_IMPL_CYCLE_COLLECTION_TRAVERSE_BEGIN_INHERITED(nsHtml5TreeOpExecutor, nsContentSink)
   NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mFlushTimer)
   NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mScriptElement)
+  // This is gonna traverse a lot of elements.  Maybe there's no other way, but
+  // I hope you're pretty sure this is necessary.
   NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMARRAY(mOwnedElements)
   NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMARRAY(mOwnedNonElements)
 NS_IMPL_CYCLE_COLLECTION_TRAVERSE_END
 
diff --git a/parser/html/nsHtml5TreeOpExecutor.h b/parser/html/nsHtml5TreeOpExecutor.h
--- a/parser/html/nsHtml5TreeOpExecutor.h
+++ b/parser/html/nsHtml5TreeOpExecutor.h
@@ -112,8 +112,12 @@ class nsHtml5TreeOpExecutor : public nsI
     nsTArray<nsHtml5TreeOperation>       mOpQueue;
     nsTArray<nsIContentPtr>              mElementsSeenInThisAppendBatch;
     nsTArray<nsHtml5PendingNotification> mPendingNotifications;
     nsHtml5StreamParser*                 mStreamParser;
+
+    // Any idea how often duplicate elements are inserted?
+    // From a brief glance at debugging spew, I didn't notice any obvious
+    // repetition, so the hash doesn't seem immediately necessary.
     nsCOMArray<nsIContent>               mOwnedElements; // maybe change to hash
     
     // This could be optimized away by introducing more tree ops so that 
     // non-elements wouldn't use the handle setup but the text node / comment
diff --git a/parser/html/nsHtml5TreeOperation.cpp b/parser/html/nsHtml5TreeOperation.cpp
--- a/parser/html/nsHtml5TreeOperation.cpp
+++ b/parser/html/nsHtml5TreeOperation.cpp
@@ -58,15 +58,15 @@
 #include "nsIStyleSheetLinkingElement.h"
 #include "nsIDOMDocumentType.h"
 
 #define TO_CONTENT(_field) \
-  (*((nsIContent**)_field))
+  (*(static_cast<nsIContent**>(_field)))
 
 #define TO_ATOM(_field) \
-  ((nsIAtom*)_field)
+  (static_cast<nsIAtom*>(_field))
 
 #define TO_ATTRIBUTES(_field) \
-  ((nsHtml5HtmlAttributes*)_field)
+  (static_cast<nsHtml5HtmlAttributes*>(_field))
 
 nsHtml5TreeOperation::nsHtml5TreeOperation()
  : mOpCode(eTreeOpAppend)
 {
@@ -188,9 +188,9 @@ nsHtml5TreeOperation::Perform(nsHtml5Tre
               }
             }
           }
           
-          stateMask = stateMask ^ PRUint32(node->IntrinsicState());
+          stateMask ^= PRUint32(node->IntrinsicState());
           if (stateMask && document) {
             MOZ_AUTO_DOC_UPDATE(document, UPDATE_CONTENT_STATE, PR_TRUE);
             document->ContentStatesChanged(node, nsnull, stateMask);
           }
@@ -365,9 +365,9 @@ nsHtml5TreeOperation::Perform(nsHtml5Tre
       aBuilder->StartLayout(); // this causes a flush anyway
       return rv;
     }
     case eTreeOpDocumentMode: {
-      aBuilder->DocumentMode(*((nsHtml5DocumentMode*)mPtrOne));
+      aBuilder->DocumentMode(*static_cast<nsHtml5DocumentMode*>(mPtrOne));
       return rv;
     }
     default: {
       NS_NOTREACHED("Bogus tree op");
diff --git a/parser/html/nsHtml5TreeOperation.h b/parser/html/nsHtml5TreeOperation.h
--- a/parser/html/nsHtml5TreeOperation.h
+++ b/parser/html/nsHtml5TreeOperation.h
@@ -99,8 +99,9 @@ class nsHtml5TreeOperation {
 
     ~nsHtml5TreeOperation();
 
     inline void Init(nsIContent** aNode, nsIContent** aParent) {
+      // Why abandon the old meaningful variable names (mNode, mParent)?
       mPtrOne = (void*)aNode;
       mPtrTwo = (void*)aParent;
     }
 
@@ -130,8 +131,9 @@ class nsHtml5TreeOperation {
     inline void Init(nsHtml5DocumentMode aMode) {
       mOpCode = eTreeOpDocumentMode;
       switch (aMode) {
         case QUIRKS_MODE:
+          // Oh, I see.
           mPtrOne = (void*) &sQuirks;
           return;
         case ALMOST_STANDARDS_MODE:
           mPtrOne = (void*) &sAlmostStandards;
@@ -201,8 +203,9 @@ class nsHtml5TreeOperation {
     // 3) Make the queue take items the size of pointer and make the op code
     //    decide how many operands it dequeues after it.
     eHtml5TreeOperation mOpCode;
     PRInt32 mInt;
+    // Can you at least make these union types?
     void* mPtrOne;
     void* mPtrTwo;
     void* mPtrThree;
 };
Created attachment 401586 [details] [diff] [review]
Example of how nsHtml5TreeOperation::mPtrOne could be turned into a type-safe(r) union.

This trick essentially gives the value stored in mPtrOne a bunch of typed aliases.  As each name corresponds to a specific type, the need for void* casting goes away.

I still think having multiple members (m1, m2, m3) is useful, although technically you could have union members like struct { nsIContent **node, **parent }.  Since the options provided by m1..3 will probably be distinct, the division ensures somewhat more type safety than having a single union type.
(Assignee)

Comment 8

8 years ago
Created attachment 401834 [details] [diff] [review]
Address review comments, match new patch for bug 514661

> -// XXX Useless code
> -void
> -nsHtml5Portability::retainElement(nsIContent** element)
> -{
> -}
> -
> -// XXX Useless code
> -void
> -nsHtml5Portability::releaseElement(nsIContent** element)
> -{
> -}

I've now moved the ugliness to the Java source and the translator. I didn't remove these from the Java source, yet, because it seemed premature to remove the possibility of element refcounting C++ targets for non-Gecko uses.

However, making the translator change take effect with this patch would be inconvenient, because by now, the translator also has other changes for the next iteration and those would break this patch.

OK if those useless methods go away with the next landing after this one?

> diff --git a/parser/html/nsHtml5TreeBuilder.cpp
> b/parser/html/nsHtml5TreeBuilder.cpp
> --- a/parser/html/nsHtml5TreeBuilder.cpp
> +++ b/parser/html/nsHtml5TreeBuilder.cpp
> @@ -617,8 +617,9 @@ nsHtml5TreeBuilder::startTag(nsHtml5Elem
>                    selfClosing = PR_FALSE;
>                  } else {
>                    appendToCurrentNodeAndPushElementMayFosterCamelCase(currNs,
> elementName, attributes);
>                  }
> +                // Why do these assignments not use
> nsHtml5HtmlAttributes::EMPTY_ATTRIBUTES?
>                  attributes = nsnull;

It felt more natural to use null as the marker that the the object's ownership has been adopted away. This is a bit awkward, but the attribute holder will cross the thread boundary and I wanted to avoid situation where an object participates in refcounting on two threads. Also, I wanted to avoid constructs that can't be expressed easily in the Java source (reference to pointer or pointer pointer that'd allow the method doing the adoption write into a stack variable of its invocator).

> diff --git a/parser/html/nsHtml5TreeOpExecutor.cpp
> b/parser/html/nsHtml5TreeOpExecutor.cpp
> --- a/parser/html/nsHtml5TreeOpExecutor.cpp
> +++ b/parser/html/nsHtml5TreeOpExecutor.cpp
> @@ -75,8 +75,10 @@ NS_IMPL_RELEASE_INHERITED(nsHtml5TreeOpE
> 
>  NS_IMPL_CYCLE_COLLECTION_TRAVERSE_BEGIN_INHERITED(nsHtml5TreeOpExecutor,
> nsContentSink)
>    NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mFlushTimer)
>    NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mScriptElement)
> +  // This is gonna traverse a lot of elements.  Maybe there's no other way,
> but
> +  // I hope you're pretty sure this is necessary.
>    NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMARRAY(mOwnedElements)
>    NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMARRAY(mOwnedNonElements)
>  NS_IMPL_CYCLE_COLLECTION_TRAVERSE_END

It's necessary if one wants the cycle collector to be able to correctly traverse the whole object graph. It's rather useless if it's useless to have the cycle collector fully traverse the object graph during the parse, though.

Therefore, I'm not at all sure that this is necessary. I suspect it's unnecessary, but I'm not 100% sure it's unnecessary. That is, I don't know if there are cases where the parser stops in mid-parse and the cycle collector is relied upon for detecting the situation and collecting the whole graph.

> diff --git a/parser/html/nsHtml5TreeOpExecutor.h
> b/parser/html/nsHtml5TreeOpExecutor.h
> --- a/parser/html/nsHtml5TreeOpExecutor.h
> +++ b/parser/html/nsHtml5TreeOpExecutor.h
> @@ -112,8 +112,12 @@ class nsHtml5TreeOpExecutor : public nsI
>      nsTArray<nsHtml5TreeOperation>       mOpQueue;
>      nsTArray<nsIContentPtr>              mElementsSeenInThisAppendBatch;
>      nsTArray<nsHtml5PendingNotification> mPendingNotifications;
>      nsHtml5StreamParser*                 mStreamParser;
> +
> +    // Any idea how often duplicate elements are inserted?
> +    // From a brief glance at debugging spew, I didn't notice any obvious
> +    // repetition, so the hash doesn't seem immediately necessary.
>      nsCOMArray<nsIContent>               mOwnedElements; // maybe change to
> hash

What I had in mind was trying to deref the nodes earlier in the future. In such a scenario, it would also have to be possible to addref the "form pointer" element. I've removed the comment.

> -  (*((nsIContent**)_field))
> +  (*(static_cast<nsIContent**>(_field)))

Casting replaced with use of unions.

> -          stateMask = stateMask ^ PRUint32(node->IntrinsicState());
> +          stateMask ^= PRUint32(node->IntrinsicState());

Fixed.

> +      // Why abandon the old meaningful variable names (mNode, mParent)?

Because many tree ops use the members for other purposes.

> +    // Can you at least make these union types?

Made them into unions.

> I still think having multiple members (m1, m2, m3) is useful, although
> technically you could have union members like struct { nsIContent **node,
> **parent }.  Since the options provided by m1..3 will probably be distinct, the
> division ensures somewhat more type safety than having a single union type.

Type safety is up to the programmer here anyway, so I went with a single union type for the three members.

Also changed the atom stuff to use the new patch for bug 514661.
Attachment #400756 - Attachment is obsolete: true
Attachment #401834 - Flags: review?(bnewman)
Attachment #400756 - Flags: review?(bnewman)
(Assignee)

Comment 9

8 years ago
Created attachment 401876 [details] [diff] [review]
Avoid leaks when tree ops are discarded without executing them
Attachment #401834 - Attachment is obsolete: true
Attachment #401876 - Flags: review?(bnewman)
Attachment #401834 - Flags: review?(bnewman)
(Assignee)

Comment 10

8 years ago
Actually, avoiding leaks by using the destructor is wrong, because nsTArray copies its contents upon append. The best fix would be having a way to append an nsTArray to another by moving so that array items would neither be copy-constructed nor destructed by the append.
(Assignee)

Comment 11

8 years ago
It seems the AppendElements() stuff wasn't in this patch but in the next one in my queue, so the copy-construction case doesn't arise yet in this patch.
(Assignee)

Updated

8 years ago
Blocks: 516406
(Assignee)

Comment 12

8 years ago
Created attachment 402824 [details] [diff] [review]
Clean up compiler warnings
Attachment #401876 - Attachment is obsolete: true
Attachment #402824 - Flags: review?(bnewman)
Attachment #401876 - Flags: review?(bnewman)
Created attachment 402962 [details] [diff] [review]
feedback for "Avoid leaks when tree ops are discarded without executing them
Comment on attachment 402962 [details] [diff] [review]
feedback for "Avoid leaks when tree ops are discarded without executing them

(In reply to comment #13)
> Created an attachment (id=402962) [details]
> feedback for "Avoid leaks when tree ops are discarded without executing them

Moving the union type into nsHtml5TreeOperation, using nsAutoPtr and nsAutoArrayPtr instead of the ~nsHtml5TreeOperation (seems cleaner to me), and const-ifying the nsHtml5Parser::mAtomTable member (since it never gets reassigned).
Attachment #402962 - Attachment is patch: true
Attachment #402962 - Attachment mime type: application/octet-stream → text/plain
(Assignee)

Comment 15

8 years ago
(In reply to comment #14)
> (From update of attachment 402962 [details] [diff] [review])
> (In reply to comment #13)
> > Created an attachment (id=402962) [details] [details]
> > feedback for "Avoid leaks when tree ops are discarded without executing them
> 
> Moving the union type into nsHtml5TreeOperation, using nsAutoPtr and
> nsAutoArrayPtr instead of the ~nsHtml5TreeOperation (seems cleaner to me), 

Won't this fail when we have speculation and tree ops in failed speculations get destroyed without ever calling Perform() them? (Even now forced Teminated() should lead to that situation.)

> and
> const-ifying the nsHtml5Parser::mAtomTable member (since it never gets
> reassigned).

Never reassiging is a bug, actually. It needs to get reassigned or otherwise emptied when the innerHTML parser is recycled.
(Assignee)

Comment 16

8 years ago
Created attachment 403779 [details] [diff] [review]
Address review comments

I made the other changes except I didn't move the tree op member deletions due to concerns stated above.
Attachment #402824 - Attachment is obsolete: true
Attachment #402824 - Flags: review?(bnewman)
(Assignee)

Comment 17

8 years ago
This step along the way introduced string leaks. Must be a failure to delete an attribute holder somewhere.
(Assignee)

Comment 18

8 years ago
Bug 518104 is now in the queue ahead of this one, since I had to put one or the other first.
Depends on: 518104
(Assignee)

Comment 19

8 years ago
Created attachment 404803 [details] [diff] [review]
Fix memory leaks
Attachment #403779 - Attachment is obsolete: true
Attachment #404803 - Flags: review?(bnewman)
Comment on attachment 404803 [details] [diff] [review]
Fix memory leaks

Looks good to me.
Attachment #404803 - Flags: review?(bnewman) → review+
(Assignee)

Updated

8 years ago
Blocks: 522512
(Assignee)

Updated

8 years ago
Status: NEW → ASSIGNED
(Assignee)

Updated

8 years ago
Blocks: 515278
(Assignee)

Comment 21

8 years ago
Thank you for the review.

Pushed as http://hg.mozilla.org/mozilla-central/rev/34bc9a8eb49e
Status: ASSIGNED → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.