Closed Bug 261974 Opened 20 years ago Closed 2 years ago

Infinite loop in SVG using JavaScript with redraw hangs browser

Categories

(Core :: SVG, defect)

x86
Windows 98
defect

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: mozilla, Unassigned)

References

(Depends on 1 open bug, )

Details

(Keywords: hang, perf)

Attachments

(2 files)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040803 Firefox/0.9.3
Build Identifier: Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.8a4) Gecko/20040925

A while(1){} statement within JavaScript in a SVG causes a dialog box to appear
asking if you want to terminate the script.  This is good.  What's bad is that
if the infinite loop contains calls to "document.createElementNS" and
"appendChild", then the infinite loop goes unchecked.  There's no way to
terminate the script without terminating the browser.

Reproducible: Always
Steps to Reproduce:
1. View the attached dom-crash.svg

Actual Results:  
Browser hangs and needs to be killed.

Expected Results:  
A dialog appears offering to stop executing the JavaScript.

The IE/Adobe conbination issues a redraw request after each modification to the
DOM, but an actual redraw doesn't occur until the current JavaScript thread
stops running.  By contrast Mozilla SVG appears to force a complete redraw after
every change.  This makes SVGs such as the one linked to in the URL render about
50 times slower.

I'm guessing that this call to rerender also resets whatever watchdog is in
place to catch runaway scripts.  So both these issues may have the same root cause.
hmm, why does this not get caught by the recursion check?
Maybe the DOMBranchCallback is not being called.  What JSContext is SVG using to
evaluate the script?

/be
nsSVGPathGeometryFrame::UpdateGraphic forces a sync paint for some reason (it's
called from nsSVGPathGeometryFrame::InitialUpdate).  Is there a good reason for
this?  It makes frame construction for SVG pretty miserably slow....

If I remove that part, I see about 30% of time being spent in
nsFrameList::LastChild (traversing the list of kids of the nsSVGDefsFrame to
append to it).

I suspect that all that's happening here is that we do enough layout crap that
it takes a long time to call the branch callback the requisite number of 
times... but I could be wrong about that.
The branch callback measures time now, thanks to nallen, so it shouldn't need to
be called a certain number of times -- just often enough and early enough.

/be
(In reply to comment #0)
> stops running.  By contrast Mozilla SVG appears to force a complete redraw after
> every change.  This makes SVGs such as the one linked to in the URL render about
> 50 times slower.

This behaviour is by design but should prpbably be changed (see below). You can
wrap your js code in between suspendRedraw/unsuspendRedraw calls to prevent
updates until the last nested unsuspendRedraw call.

(In reply to comment #4)
> nsSVGPathGeometryFrame::UpdateGraphic forces a sync paint for some reason (it's
> called from nsSVGPathGeometryFrame::InitialUpdate).  Is there a good reason for
> this?  

IIRC there was a good reason, but I can't remember the reason ;-) 
I think it had something to do with async painting not being fast or frequent
enough in mousemove event-handlers where svg content is being dragged around
(e.g. http://www.croczilla.com/svg/samples/events2/). Of course we could just
use forceRedraw in these cases...

> It makes frame construction for SVG pretty miserably slow....

How does this affect frame construction? We shouldn't issue sync redraws during
frame construction. Do we?
 
> If I remove that part, I see about 30% of time being spent in
> nsFrameList::LastChild (traversing the list of kids of the nsSVGDefsFrame to
> append to it).
> 
> I suspect that all that's happening here is that we do enough layout crap that
> it takes a long time to call the branch callback the requisite number of 
> times... but I could be wrong about that.

This patch will prevent sync refreshes unless forced. Unless code is wrapped in
suspendRedraw/unsuspendRedraw we still do much more work than needed (such as
rebuilding geometries each time to calculate the dirty region) but I guess
there's not much point in optimizing further, considering that it looks like at
some point in the future we'll move over to a rewritten immediate mode
rendering.
Attachment #160916 - Flags: review?(tor)
(In reply to comment #6)
> How does this affect frame construction? We shouldn't issue sync redraws
> during  frame construction. Do we?

Absolutely.  See 
nsSVGGenericContainerFrame::InsertFrames, nsSVGOuterSVGFrame::InsertFrames, etc.
 These are all called directly from frame construction when content nodes are
inserted into the document and call InitialUpdate() on the new frames, which
does a sync repaint.
(In reply to comment #8)
> (In reply to comment #6)
> > How does this affect frame construction? We shouldn't issue sync redraws
> > during  frame construction. Do we?
> 
> Absolutely.  See 
> nsSVGGenericContainerFrame::InsertFrames, nsSVGOuterSVGFrame::InsertFrames, etc.
>  These are all called directly from frame construction when content nodes are
> inserted into the document and call InitialUpdate() on the new frames, which
> does a sync repaint.

Ah, I see. I thought you meant during 'normal' frameconstruction (i.e.
SetInitialChildList).
The loop in InsertFrames should really be wrapped with
suspendRedraw/unsuspendRedraw calls, but I guess with the async patch that's not
necessary anymore.
Attachment #160916 - Flags: review?(tor) → review+
Checked in at afri's request:

Checking in content/svg/content/src/nsSVGSVGElement.cpp;
/cvsroot/mozilla/content/svg/content/src/nsSVGSVGElement.cpp,v  <-- 
nsSVGSVGElement.cpp
new revision: 1.35; previous revision: 1.34
done
Checking in layout/svg/base/src/nsSVGOuterSVGFrame.cpp;
/cvsroot/mozilla/layout/svg/base/src/nsSVGOuterSVGFrame.cpp,v  <-- 
nsSVGOuterSVGFrame.cpp
new revision: 1.26; previous revision: 1.25
done
(In reply to comment #5)
> The branch callback measures time now, thanks to nallen, so it shouldn't need to
> be called a certain number of times -- just often enough and early enough.
> 
> /be

So what needs to happen to fix this now?
Status: UNCONFIRMED → NEW
Ever confirmed: true
OK, I just looked at the calls to nsJSContext::DOMBranchCallback.  On that
testcase, on my machine (P3-733 processor) it takes about 7 seconds to make 1000
callbacks.

Since the code only checks for a loop once every 32000-some callbacks, it would
be about 64000 callbacks before it put up the dialog.  That translates to about
7 minutes of waiting.

I would expect that we get one callback per time through that loop in the
testcase, and a single iteration of that loop just takes too much time... (even
with the async patch that landed here).

If I change MAYBE_STOP_BRANCH_COUNT_MASK and MAYBE_STOP_BRANCH_COUNT_MASK to
0x0000007f (so two orders of magnitude less, basically), I get the "script is
running too long" dialog about 5 seconds after loading the testcase (as in, as
soon as we hit the dom.max_script_run_time mark.

I'll profile to see what's still taking up so much time, but maybe we can tweak
something on the other end too?
OK.  Profile is as follows:

Total hit count: 255529
Of these: 226145 under XPCWrappedNative::CallMethod
  Of these:  13723 under nsScriptSecurityManager::CanAccess
            26972 under XPCConvert::NativeInterface2JSObject
            170352 under XPTC_InvokeByIndex

The breakdown for XPTC_InvokeByIndex breakdown is:

  138331 under nsSVGGElement::AppendChild
   20646 under nsSVGRectElement::SetAttribute
   11046 nsSVGDocument::CreateElementNS

Looking from the other end, 17% of the total time is spent in (not under, in)
nsFrameList::LastChild (which is O(N) in number of kids, and makes the appending
each child O(N) in the number of kids already there, which makes the frequency
of branch callbacks fall as 1/N).

So one immediate problem here is that if the frequency of callbacks is falling
as 1/N in the number of callbacks then the time to get to a particular number of
callbacks is O(N^2) in that number....

Bug 233463 covers the LastChild thing, sorta.

I'll probably file separate bugs on other performance issues, if any, that I see
in the testcase.
Depends on: 233463
Depends on: 270251
Depends on: 270255
Depends on: 270257
Depends on: 270264
bz: Your recent comments seem at odds with what brendan said in comment 5,
unless I'm misunderstanding something. Surely, if this is all time-based, as
long as the callbacks are being fired at least once a second, it doesn't matter
exactly how often they are being called?
> bz: Your recent comments seem at odds with what brendan said in comment 5,

Note the "early enough and often enough" part of Brendan's comment.  In this
case, we are NOT calling "often enough".  See also the comment at
http://lxr.mozilla.org/seamonkey/source/dom/src/base/nsJSEnvironment.cpp#466
MAYBE_STOP_BRANCH_COUNT_MASK can probably be tweaked down a bit (but not too
much) to see if it can deal better.  There are a couple of constraints though:

Real world scripts (typically less than 1K callbacks) shouldn't have to pay
anything for this "infinite loop checking".

Pure JS in a tight loop can hit 10s of millions of callbacks per second.  A
callback has to do essentially zero work on average to avoid hurting performance.

If there was some way to communicate that a large amount of work has been done
outside of JS, it might be possible to react to that.
Note that the five dependencies of this bug should, if fixed, reduce the amount
of time spent in a callback on that testcase by an order of magnitude or so. 
That would help some -- instead of waiting 7 minutes we would be waiting 40
seconds...
jst already has a patch in progress -- see bug 237977 comment 26.  I'll let
someone else mark the dependency.

/be
Depends on: 237977
bz: aha, gotcha. thanks for the pointers.
User-Agent: Mozilla/5.0 (X11; U; Linux i686; es-ES; rv:1.7.10) Gecko/20050730
Firefox/1.0.6 (Debian package 1.0.6-2)

I have a similar problem when I load this:

<html>
<body>
<script language="javascript">
while (1) {
document.write('<img src="none.jpg"/>');
}
</script>
</body>
</html>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3

If i open the attachment the message box to terminate the script appears (after aproximatly 15 seconds). so this sems to be fixed in FF 2.0.0.3. Of course i do not know if all the other issues in this bug are fixed.

The code of Daniel Larraz does'nt produce an infite loop for me.

User Agent see above.
Lukas, it should be appearing after 10 seconds.  The fact that it's not is basically this bug.  Note that on a slower machine than yours the difference will be even more.
Assignee: general → nobody
QA Contact: ian → general
Even with bug 233463 and the watchdog thread this is _still_ a problem.  And I still see sync painting from frame construction stuff here.....
Er, I guess it's just sync invalidation, not sync painting.
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:2.0b1) Gecko/20100630 Firefox/4.0b1

Still happening :(
Sure; the bugs blocking this one are still open.  ;)
Keywords: perf
Keywords: hang

(In reply to Neil Fraser from comment #0)

A while(1){} statement within JavaScript in a SVG causes a dialog box to
appear
asking if you want to terminate the script. This is good. What's bad is
that
if the infinite loop contains calls to "document.createElementNS" and
"appendChild", then the infinite loop goes unchecked. There's no way to
terminate the script without terminating the browser.

Two updates from testing in 2022-era Firefox:

  • We do now show a "This page is slowing down Firefox" notification (if you try to interact with the page, at least)
  • ...though interestingly: if I click that notification's "Stop" button, the content process remains pegged at 100% CPU with memory-usage still ticking ever-upwards.
  • ...however, it looks like we did really stop the script. Memory usage and CPU is just thrashing because we're now left with an absurdly giant DOM which we attempt to render.
  • And then the content process ultimately crashes with an out-of-memory exception, on my machine at least (which turns just that one tab into an error page, in our modern e10s-enabled world, essentially; and the rest of the browser is fine). This is to-be-expected for a page of the size that we're left with at the point where I was able to interrupt the script, I think.

This produces a crash report like bp-f16f5f95-ee35-4c33-9e14-ce8b50220818 (from me just now).

So I think this is essentially WORKSFORME at this point; the hang is still there but it's not as problematic given that we do show the "Stop" button, and we do kill the content process if it's sufficiently misbehaving (without taking down the rest of the browser).

Severity: critical → S4
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: