Closed Bug 1176681 Opened 5 years ago Closed 5 years ago

Make character buffer allocations in the HTML5 tree builder fallible

Categories

(Core :: DOM: HTML Parser, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla43
Tracking Status
firefox43 --- fixed

People

(Reporter: hsivonen, Assigned: hsivonen)

References

Details

Attachments

(1 file, 3 obsolete files)

No description provided.
Blocks: 1176698
Attachment #8625285 - Attachment is obsolete: true
It's somewhat unfortunate that this patch continues to make tokenizeBuffer() even more special in the sense that you need to remember to check for brokenness after it returns, but the "mark as broken" pattern has already been used successfully in the parser and is way less invasive that trying to propagate errors along the call stack.
Comment on attachment 8625327 [details] [diff] [review]
Use fallible allocation for buffers copies, v2

See bug 1029671 comment 36 for the big picture of the whole queue and about landing.
Attachment #8625327 - Flags: review?(wchen)
Comment on attachment 8625327 [details] [diff] [review]
Use fallible allocation for buffers copies, v2

I don't think checking for brokenness after tokenizeBuffer() is enough. I think you also need to check after eof() as well, consider the following:

<html><body><!-- super long comment that may OOM and ends in EOF

This will call into appendComment via eof(), not tokenizeBuffer()

It is unfortunate that these methods will become more complicated to use due to the need for the check. I think adding an assertion in nsHtml5TreeBuilder::Flush to check for brokenness might help catch some bugs from incorrect usage.
Attachment #8625327 - Flags: review?(wchen) → review-
Attachment #8625327 - Attachment is obsolete: true
Attachment #8644272 - Flags: review?(wchen)
Comment on attachment 8644272 [details] [diff] [review]
Use fallible allocation for buffers copies, v3

Review of attachment 8644272 [details] [diff] [review]:
-----------------------------------------------------------------

I also noticed that in nsHtml5StringParser.cpp, there is a call to eof() that may end up triggering the assertion you added in this patch. In that file, it looks like if tokenizeBuffer fails, we stop calling tokenizeBuffer, but we still call eof() which may also generate tree ops into a broken builder. It seems to me it's good enough to just check if the tree builder is broken before calling eof().

::: parser/html/nsHtml5Parser.cpp
@@ +632,5 @@
>                         "This should only happen with script-created parser.");
>            if (NS_SUCCEEDED((rv = mExecutor->IsBroken()))) {
>              mTokenizer->eof();
> +            if (NS_FAILED((rv = mTreeBuilder->IsBroken()))) {
> +              return mExecutor->MarkAsBroken(rv);

The comment below (just before mTokenizer->end()) makes me think that we shouldn't return here. Before this patch, we would continue to run the code below if the executor is marked as broken, but now, if the eof() call fails, we mark the executor as broken and return. It's not clear to me why we change behavior for the broken case when it happens due to eof(). Perhaps we should just mark the executor broken without returning, and put the rest of the block in an else statement so the following code still gets run.

::: parser/html/nsHtml5StreamParser.cpp
@@ +1371,5 @@
>                mTokenizer->eof();
> +              nsresult rv;
> +              if (NS_FAILED((rv = mTreeBuilder->IsBroken()))) {
> +                MarkAsBroken(rv);
> +                return;

Similar to above, I'm not sure if we want to return early here because we end up not calling FlushTreeOpsAndDisarmTimer in the broken case, but before the patch we would call it even when the tree builder is broken. Maybe use the same solution as above and don't return and put the rest of the block in an else statement.

::: parser/html/nsHtml5TreeBuilderCppSupplement.h
@@ +1025,5 @@
> +      if (NS_FAILED(mBroken)) {
> +        MOZ_ASSERT(mOpQueue.Length() == 1,
> +          "Tree builder is broken with a non-empty op queue whose length isn't 1.");
> +        MOZ_ASSERT(mOpQueue[0].IsMarkAsBroken(),
> +          "Tree builde is broken but the op in queue is not mark as broken");

type-o in assertion message.

@@ +1031,3 @@
>        mOpSink->MoveOpsFrom(mOpQueue);
> +    } else {
> +      MOZ_ASSERT(NS_SUCCEEDED(mBroken),

I think this else statement may result in some bad assertions because we don't queue eTreeOpMarkAsBroken when the newly introduced fallible allocation fails, but it seems like we may still call flush on the tree builder even after being marked as broken (such as in nsHtml5Parser::ParseUntilBlocked in the mDocumentClosed case). I think the check above is enough to make sure that we aren't adding more things onto the op queue when it's broken.
Attachment #8644272 - Flags: review?(wchen) → review-
(In reply to William Chen [:wchen] from comment #7)
> Comment on attachment 8644272 [details] [diff] [review]
> Use fallible allocation for buffers copies, v3
> 
> Review of attachment 8644272 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I also noticed that in nsHtml5StringParser.cpp, there is a call to eof()
> that may end up triggering the assertion you added in this patch. In that
> file, it looks like if tokenizeBuffer fails, we stop calling tokenizeBuffer,
> but we still call eof() which may also generate tree ops into a broken
> builder. It seems to me it's good enough to just check if the tree builder
> is broken before calling eof().

The eof() call there is already wrapped in an NS_SUCCEEDED(rv) check and rv goes to failure if mBuilder gets broken.

> ::: parser/html/nsHtml5Parser.cpp
> @@ +632,5 @@
> >                         "This should only happen with script-created parser.");
> >            if (NS_SUCCEEDED((rv = mExecutor->IsBroken()))) {
> >              mTokenizer->eof();
> > +            if (NS_FAILED((rv = mTreeBuilder->IsBroken()))) {
> > +              return mExecutor->MarkAsBroken(rv);
> 
> The comment below (just before mTokenizer->end()) makes me think that we
> shouldn't return here. Before this patch, we would continue to run the code
> below if the executor is marked as broken, but now, if the eof() call fails,
> we mark the executor as broken and return. It's not clear to me why we
> change behavior for the broken case when it happens due to eof(). Perhaps we
> should just mark the executor broken without returning, and put the rest of
> the block in an else statement so the following code still gets run.

Done.

> ::: parser/html/nsHtml5StreamParser.cpp
> @@ +1371,5 @@
> >                mTokenizer->eof();
> > +              nsresult rv;
> > +              if (NS_FAILED((rv = mTreeBuilder->IsBroken()))) {
> > +                MarkAsBroken(rv);
> > +                return;
> 
> Similar to above, I'm not sure if we want to return early here because we
> end up not calling FlushTreeOpsAndDisarmTimer in the broken case, but before
> the patch we would call it even when the tree builder is broken. Maybe use
> the same solution as above and don't return and put the rest of the block in
> an else statement.

Done.

> ::: parser/html/nsHtml5TreeBuilderCppSupplement.h
> @@ +1025,5 @@
> > +      if (NS_FAILED(mBroken)) {
> > +        MOZ_ASSERT(mOpQueue.Length() == 1,
> > +          "Tree builder is broken with a non-empty op queue whose length isn't 1.");
> > +        MOZ_ASSERT(mOpQueue[0].IsMarkAsBroken(),
> > +          "Tree builde is broken but the op in queue is not mark as broken");
> 
> type-o in assertion message.

Fixed.

> @@ +1031,3 @@
> >        mOpSink->MoveOpsFrom(mOpQueue);
> > +    } else {
> > +      MOZ_ASSERT(NS_SUCCEEDED(mBroken),
> 
> I think this else statement may result in some bad assertions because we
> don't queue eTreeOpMarkAsBroken when the newly introduced fallible
> allocation fails, but it seems like we may still call flush on the tree
> builder even after being marked as broken (such as in
> nsHtml5Parser::ParseUntilBlocked in the mDocumentClosed case). I think the
> check above is enough to make sure that we aren't adding more things onto
> the op queue when it's broken.

Removed.

Thanks.
Attachment #8644272 - Attachment is obsolete: true
Attachment #8649742 - Flags: review?(wchen)
Attachment #8649742 - Flags: review?(wchen) → review+
https://hg.mozilla.org/mozilla-central/rev/c283d026937d
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
You need to log in before you can comment on or make changes to this bug.