Last Comment Bug 694165 - Regression: some SVG images hang browser when loaded via data URI
: Regression: some SVG images hang browser when loaded via data URI
Status: VERIFIED FIXED
[verified-aurora] [verified-beta]
: hang, regression, testcase
Product: Core
Classification: Components
Component: SVG (show other bugs)
: Trunk
: All All
: -- normal (vote)
: mozilla10
Assigned To: Daniel Holbert [:dholbert]
:
Mentors:
Depends on:
Blocks: 656244
  Show dependency treegraph
 
Reported: 2011-10-12 15:24 PDT by Brion Vibber
Modified: 2011-11-21 00:05 PST (History)
5 users (show)
dholbert: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
affected
fixed
fixed
fixed


Attachments
HTML file with an inline data: URL svg image -- hangs Firefox 7/8/9 (1.95 MB, text/html)
2011-10-12 15:24 PDT, Brion Vibber
no flags Details
HTML file loading the same image via HTTP (98 bytes, text/html)
2011-10-12 15:24 PDT, Brion Vibber
no flags Details
File-selection demo; crashes on some but not all files (1.68 KB, text/html)
2011-10-12 15:26 PDT, Brion Vibber
no flags Details
Sample SVG file that is used above, in case it gets lost from Wikimedia Commons (1.47 MB, image/svg+xml)
2011-10-12 15:28 PDT, Brion Vibber
no flags Details
fix: cancel pending events (1.99 KB, patch)
2011-10-13 14:29 PDT, Daniel Holbert [:dholbert]
no flags Details | Diff | Splinter Review
fix: cancel pending events (1.38 KB, patch)
2011-10-13 14:34 PDT, Daniel Holbert [:dholbert]
hsivonen: review+
christian: approval‑mozilla‑aurora+
christian: approval‑mozilla‑beta+
Details | Diff | Splinter Review
crashtest (20.33 KB, patch)
2011-10-13 18:36 PDT, Daniel Holbert [:dholbert]
hsivonen: review+
Details | Diff | Splinter Review

Description Brion Vibber 2011-10-12 15:24:17 PDT
Created attachment 566649 [details]
HTML file with an inline data: URL svg image -- hangs Firefox 7/8/9

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:7.0.1) Gecko/20100101 Firefox/7.0.1
Build ID: 20110928134238

Steps to reproduce:

Wikipedia has recently added a thumbnail preview before image file upload, using FileReader to load the image and scaling/rotating it via canvas after it's been selected by an <input type="file">.

We've found that some images, at least some SVG files, cause Firefox 7.0.1, 8beta, and 9/aurora to hang during this process, while Firefox 6.0.2 and Chrome 14 on the same machine display the thumbnail as expected.


Actual results:

I tracked the hang down to loading the image via a data: URI.

This happens both dynamically (via FileReader, creating an Image() and setting the src attribute) and statically by simply putting an <img src="data:...."> into an HTML page.

Not all SVG images seem to cause the hang, but the sample one does consistently for me -- taken from https://commons.wikimedia.org/wiki/File:Acadie_g%C3%A9n%C3%A9alogique.svg

The attached sample page will hang Firefox 7, 8, or 9.


Expected results:

The attached sample page should show a world map with some blue highlights, as it does in Firefox 6 and Chrome 14.
Comment 1 Brion Vibber 2011-10-12 15:24:59 PDT
Created attachment 566653 [details]
HTML file loading the same image via HTTP

This version does not hang.
Comment 2 Brion Vibber 2011-10-12 15:26:31 PDT
Created attachment 566655 [details]
File-selection demo; crashes on some but not all files

Exhibits the same crash when uploading the crashy image file; data URI is loaded in via new Image() & setting image.src.
Comment 3 Brion Vibber 2011-10-12 15:27:47 PDT
Comment on attachment 566653 [details]
HTML file loading the same image via HTTP

><img src="https://upload.wikimedia.org/wikipedia/commons/e/ec/Acadie_g%C3%A9n%C3%A9alogique.svg">
Comment 4 Brion Vibber 2011-10-12 15:28:44 PDT
Created attachment 566657 [details]
Sample SVG file that is used above, in case it gets lost from Wikimedia Commons

from https://commons.wikimedia.org/wiki/File:Acadie_généalogique.svg
Comment 5 Daniel Holbert [:dholbert] 2011-10-12 16:04:51 PDT
Confirmed using the first attachment in my debug trunk build on Linux x86_64.
Comment 6 Daniel Holbert [:dholbert] 2011-10-12 16:07:19 PDT
Appears to be a regression from bug 656244:
 http://hg.mozilla.org/mozilla-central/rev/bdd4d79f9836
At least -- the hang is from never exiting the "while" loop that was created in that bug's cset.
Comment 7 Daniel Holbert [:dholbert] 2011-10-12 16:36:56 PDT
So here's what's going on:
If the final chunk of SVG data takes too long to parse, the nsParser graciously calls PostContinueEvent(), which posts a runnable that will clear that flag and continue parsing at a later time.  While this runnable is posted, "nsParser->IsComplete()" returns false.

However, unbeknownst to the parser, SVGDocumentWrapper is going to force it to continue immediately.  So it actually gets to finish parsing without the runnable ever needing to fire.  BUT, the runnable prevents IsComplete() from returning true -- so we keep looping forever (and we never return to the event loop to let the runnable fire.

So -- instead of the evil while() loop from bug 656244, I think we *actually* want to call nsParser::SetInterruptible(PR_FALSE).  I *think* that'll make the parser synchronously finish its business on its own, without posting any runnables or returning early.

We can't quite do that, though, since SetInterruptible isn't on the nsIParser interface (and it's only called internally, despite being a public nsParser method).  There's probably a not-too-messy way to trigger a call to SetInterruptible(PR_FALSE) for this situation, though.... (might require a special-case in nsParser.cpp)
Comment 8 Henri Sivonen (:hsivonen) 2011-10-12 23:32:47 PDT
(In reply to Daniel Holbert [:dholbert] from comment #7)
> If the final chunk of SVG data takes too long to parse, the nsParser
> graciously calls PostContinueEvent(), which posts a runnable that will clear
> that flag and continue parsing at a later time.  While this runnable is
> posted, "nsParser->IsComplete()" returns false.

This is only about SVG in <img>, not about SVG from data:, right? And the reason why the bug has been filed about data: URLs is that it happens that the entire file is in the final chunk in the data: case?

> However, unbeknownst to the parser, SVGDocumentWrapper is going to force it
> to continue immediately.

I really wish the synchronicity assumption of SVGDocumentWrapper goes away before I put the XML parser off the main thread...

> We can't quite do that, though, since SetInterruptible isn't on the
> nsIParser interface

Please edit nsIParser as needed instead of trying to work around nsIParser. For all practical purposes, nsIParser is totally Gecko-internal code. It's so tightly coupled with Gecko internals that it's useless and highly inappropriate for extensions to try to use the interface.
Comment 9 Daniel Holbert [:dholbert] 2011-10-13 12:01:48 PDT
(In reply to Henri Sivonen (:hsivonen) from comment #8)
> This is only about SVG in <img>, not about SVG from data:, right?

Yes -- it's about SVG in <img>, and it's just particularly likely to be triggered for data: URLs because we get the data off of the stream pretty much all at once.

> I really wish the synchronicity assumption of SVGDocumentWrapper goes away
> before I put the XML parser off the main thread...

Me too :-/  Per bug 656244 comment 30, that still requires some nontrivial imagelib API refactoring.  I need to make sure there's a bug filed on doing that...

> Please edit nsIParser as needed instead of trying to work around nsIParser.
> For all practical purposes, nsIParser is totally Gecko-internal code.

I'm so glad you said that. :)
Comment 10 Daniel Holbert [:dholbert] 2011-10-13 14:29:25 PDT
Created attachment 566940 [details] [diff] [review]
fix: cancel pending events

Actually, while looking at nsIParser.h, I realized there's a simpler solution -- we can just call CancelParsingEvents() to clear the event & the flag that blocks us from being marked as completed. (since we're handling its work synchronously anyway)

This is slightly hacky, but I think it's better than modifying the nsIParser API / inner-workings, given that SVG image parsing will hopefully be more async-friendly (and no longer need any nsIParser modifications) in the future.
Comment 11 Daniel Holbert [:dholbert] 2011-10-13 14:30:14 PDT
Comment on attachment 566940 [details] [diff] [review]
fix: cancel pending events

(sorry, attached the wrong patch version.  Correct patch coming up.)
Comment 12 Daniel Holbert [:dholbert] 2011-10-13 14:34:25 PDT
Created attachment 566941 [details] [diff] [review]
fix: cancel pending events
Comment 13 Daniel Holbert [:dholbert] 2011-10-13 14:35:52 PDT
The other upside of this strategy is that it's minimal enough to be considerable for branch inclusion.  It'd be great if we could get this into Aurora / Beta...

Setting tracking flags to get this on the radar for those.
Comment 14 Daniel Holbert [:dholbert] 2011-10-13 18:36:20 PDT
Created attachment 566994 [details] [diff] [review]
crashtest

I was initially going to create a crashtest based on the first attachment here, but it's a ~2MB file, which is kind of a large addition to our source code.

So instead, I created a smaller test (20KB), with 300,000 <g/> elements in an data URI.  This is sufficient to trigger the hang on my machine. (I'm running it through TryServer with & without the fix, to be sure it triggers the hang reliably on other platforms, too.)

(Note: I reduced the test's file-size by defining an XML entity for 1000 <g/> elements, and then I quoted that entity 300 times inside the data URI.  This reduces the file-size, without reducing the amount of SVG data sent to the parser.)
Comment 15 Daniel Holbert [:dholbert] 2011-10-13 21:15:09 PDT
Comment on attachment 566994 [details] [diff] [review]
crashtest

(In reply to Daniel Holbert [:dholbert] from comment #14)
> (I'm running it through TryServer with & without the fix, to be sure it triggers
> the hang reliably on other platforms, too.)

Crashtest appears to successfully test for this on tryserver. Requesting review.
Without the fix, the crashtest is orange from a hang on all platforms so far:
https://tbpl.mozilla.org/?tree=Try&rev=0762a4443dc1

With the fix, crashtest is green on all platforms so far:
https://tbpl.mozilla.org/?tree=Try&rev=a78fde4f676b
Comment 16 Henri Sivonen (:hsivonen) 2011-10-14 05:16:17 PDT
Comment on attachment 566941 [details] [diff] [review]
fix: cancel pending events

>+      parser->CancelParsingEvents();
>       parser->ContinueInterruptedParsing();

Based on the description of the problem, I'd expect these lines to be in the opposite order.

If the test passes with these lines in the opposite order, please change the order of these lines, and r=me.

If swapping the order doesn't actually work, I trust you that this works and r=me anyway, but I'd like to know why the more intuitive order doesn't work.
Comment 17 Daniel Holbert [:dholbert] 2011-10-14 10:39:12 PDT
So -- the test doesn't hang if I reverse the lines, but I think my ordering is more correct, actually.

If I reverse the lines as you suggest, then the CancelParsingEvents() call will cancel any events posted by ContinueInterruptedParsing(), and we'll subsequently break out of the loop early (since IsComplete() won't see any pending events), even if we're not done. In fact, we'd only ever run through the loop once -- we'd never loop back across it.  This would potentially leave some parsing unfinished, which is bad.

In contrast, the logic behind my ordering is:
> checkpoint:
> if (we have stuff left to do) {
>    // cancel the pending event to do that stuff...
>    // ... and just do the stuff now instead. (If this doesn't finish, it'll post a new event.)
>    // goto checkpoint

So -- are you OK with me landing this as-is?
Comment 18 Daniel Holbert [:dholbert] 2011-10-14 11:02:41 PDT
SIDE NOTE: When writing the previous comment, I was wondering "Wait, how did the existing code ever work?", and it turns out it _doesn't_ work, except that there's a race condition that saves us.

If our initial call to nsParser::OnStopRequest() (the top line of the patch's contextual code) doesn't finish, it calls PostContinueEvent(), and from that point on we're screwed, because we have a pending event that we never cancel.  (so IsComplete() will be perma-false and will never let us out of the "while" loop).

*However*, in some cases (e.g. the testcase in bug 656244) we get lucky, and at some point while parsing, nsParser::Tokenize calls nsParser::Terminate, which mercifully calls CancelParsingEvents ("to avoid leaking the nsParser") and clears our clogged event.  I say "get lucky", though, because there's some race condition involved with triggering that escape-hatch -- it never happens when I step through the loop in GDB (we just loop forever) -- it only happens only when I let the loop run on its own.  (again, this was from me re-running the testcase from bug 656244, but under finer scrutiny in GDB)
Comment 19 Daniel Holbert [:dholbert] 2011-10-14 11:11:12 PDT
(In reply to Daniel Holbert [:dholbert] from comment #17)
> So -- are you OK with me landing this as-is?

Actually, since I think it's late and near-weekend in your timezone, I'm going to take the "r=me anyway" offered at the end of comment 16:
> If swapping the order doesn't actually work, I trust you that this works and
> r=me anyway, but I'd like to know why the more intuitive order doesn't work.

...and hopefully comment 17 serves as an answer to that last part. :)
Comment 21 Robert Longson 2011-10-14 11:34:22 PDT
do we want to consider this for beta?
Comment 22 Daniel Holbert [:dholbert] 2011-10-14 12:11:36 PDT
Comment on attachment 566941 [details] [diff] [review]
fix: cancel pending events

I think we should consider it for beta, yeah.  (I thought I'd already requested approval, but just realized I hadn't :))

Requesting aurora and beta approval.

Low-risk patch; fixes a regression that makes Firefox hang, and includes tests.
Comment 23 Brion Vibber 2011-10-14 14:41:49 PDT
You guys work fast, thanks! :)

Is there any chance of this going into a Firefox 7 point release, or should I assume it'll only land in 8 and above? It's easy enough to blacklist FF 7 from the thumbnailing that's triggering the bug for us on Wikipedia if I can't find a workaround (I've just disabled it outright for SVGs for the moment).

Thanks again -- it's very satisfying to watch the fix progress!
Comment 24 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2011-10-14 14:43:35 PDT
This will not be going into a Firefox 7 point release (because there won't be one).  Firefox 8.0.0 is the earliest release that it could be in.
Comment 25 Brion Vibber 2011-10-14 14:47:35 PDT
Thanks, that'll be just fine for us. :)
Comment 26 Daniel Holbert [:dholbert] 2011-10-14 14:49:54 PDT
Yup, what khuey said.

(In reply to Brion Vibber from comment #23)
> Thanks again -- it's very satisfying to watch the fix progress!

Thank you for the bug report (and sorry for the breakage!) :)
Comment 27 Ed Morley [:emorley] 2011-10-15 05:31:08 PDT
https://hg.mozilla.org/mozilla-central/rev/0c1cc83120a1
https://hg.mozilla.org/mozilla-central/rev/84adc3f8b050

Brion: This has now landed on mozilla-central and so will be present in the next nightly (2011-10-16 from http://nightly.mozilla.org). The aurora and beta landings will occur once/if the approval requests in comment 22 are granted. 

Thanks again for the bug report :-)
Comment 28 christian 2011-10-18 10:26:17 PDT
Comment on attachment 566941 [details] [diff] [review]
fix: cancel pending events

Approved for aurora and beta
Comment 30 Mihaela Velimiroviciu (:mihaelav) 2011-10-28 08:04:51 PDT
Mozilla/5.0 (Windows NT 6.1; rv:8.0) Gecko/20100101 Firefox/8.0 - beta 5
Mozilla/5.0 (Windows NT 6.1; rv:9.0a2) Gecko/20111025 Firefox/9.0a2 - build id: 20111027042020
Mozilla/5.0 (Windows NT 6.1; rv:10.0a1) Gecko/20111021 Firefox/10.0a1 - build id: 20111028031044

Mozilla/5.0 (Windows NT 5.1; rv:8.0) Gecko/20100101 Firefox/8.0
Mozilla/5.0 (Windows NT 5.1; rv:10.0a1) Gecko/20111027 Firefox/10.0a1
Mozilla/5.0 (Windows NT 5.1; rv:9.0a2) Gecko/20111027 Firefox/9.0a2

Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0a1) Gecko/20111027 Firefox/10.0a1
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:8.0) Gecko/20100101 Firefox/8.0
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:9.0a2) Gecko/20111020 Firefox/9.0a2

Mozilla/5.0 (X11; Linux x86_64; rv:8.0) Gecko/20100101 Firefox/8.0
Mozilla/5.0 (X11; Linux x86_64; rv:10.0a1) Gecko/20111027 Firefox/10.0a1
Mozilla/5.0 (X11; Linux x86_64; rv:9.0a2) Gecko/20111027 Firefox/9.0a2

Verified on Win7 x64 with the above builds.
While SVG image is loading, there is a temporary hang of browser, but it recovers in a few seconds (less than 5).
Comment 31 Daniel Holbert [:dholbert] 2011-11-21 00:05:09 PST
(In reply to Daniel Holbert [:dholbert] from comment #9)
> Me too :-/  Per bug 656244 comment 30, that still requires some nontrivial
> imagelib API refactoring.  I need to make sure there's a bug filed on doing
> that...

BTW, I just filed bug 704059 on doing this refactoring, to be sure it's being tracked somewhere. (Not sure if there was already a bug filed on that; if so, mine can be duped to the original.)

Note You need to log in before you can comment on or make changes to this bug.