Closed Bug 694165 Opened 13 years ago Closed 13 years ago

Regression: some SVG images hang browser when loaded via data URI

Categories

(Core :: SVG, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla10
Tracking Status
firefox7 --- affected
firefox8 --- fixed
firefox9 --- fixed
firefox10 --- fixed

People

(Reporter: brion, Assigned: dholbert)

References

Details

(Keywords: hang, regression, testcase, Whiteboard: [verified-aurora] [verified-beta])

Attachments

(6 files, 1 obsolete file)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:7.0.1) Gecko/20100101 Firefox/7.0.1
Build ID: 20110928134238

Steps to reproduce:

Wikipedia has recently added a thumbnail preview before image file upload, using FileReader to load the image and scaling/rotating it via canvas after it's been selected by an <input type="file">.

We've found that some images, at least some SVG files, cause Firefox 7.0.1, 8beta, and 9/aurora to hang during this process, while Firefox 6.0.2 and Chrome 14 on the same machine display the thumbnail as expected.


Actual results:

I tracked the hang down to loading the image via a data: URI.

This happens both dynamically (via FileReader, creating an Image() and setting the src attribute) and statically by simply putting an <img src="data:...."> into an HTML page.

Not all SVG images seem to cause the hang, but the sample one does consistently for me -- taken from https://commons.wikimedia.org/wiki/File:Acadie_g%C3%A9n%C3%A9alogique.svg

The attached sample page will hang Firefox 7, 8, or 9.


Expected results:

The attached sample page should show a world map with some blue highlights, as it does in Firefox 6 and Chrome 14.
This version does not hang.
Exhibits the same crash when uploading the crashy image file; data URI is loaded in via new Image() & setting image.src.
Attachment #566655 - Attachment mime type: text/plain → text/html
Attachment #566649 - Attachment mime type: text/plain → text/html
Comment on attachment 566653 [details]
HTML file loading the same image via HTTP

><img src="https://upload.wikimedia.org/wikipedia/commons/e/ec/Acadie_g%C3%A9n%C3%A9alogique.svg">
Attachment #566653 - Attachment mime type: text/plain → text/html
Component: General → SVG
Product: Firefox → Core
QA Contact: general → general
Confirmed using the first attachment in my debug trunk build on Linux x86_64.
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Mac OS X → All
Hardware: x86 → All
Version: 7 Branch → Trunk
Appears to be a regression from bug 656244:
 http://hg.mozilla.org/mozilla-central/rev/bdd4d79f9836
At least -- the hang is from never exiting the "while" loop that was created in that bug's cset.
Blocks: 656244
Assignee: nobody → dholbert
Status: NEW → ASSIGNED
So here's what's going on:
If the final chunk of SVG data takes too long to parse, the nsParser graciously calls PostContinueEvent(), which posts a runnable that will clear that flag and continue parsing at a later time.  While this runnable is posted, "nsParser->IsComplete()" returns false.

However, unbeknownst to the parser, SVGDocumentWrapper is going to force it to continue immediately.  So it actually gets to finish parsing without the runnable ever needing to fire.  BUT, the runnable prevents IsComplete() from returning true -- so we keep looping forever (and we never return to the event loop to let the runnable fire.

So -- instead of the evil while() loop from bug 656244, I think we *actually* want to call nsParser::SetInterruptible(PR_FALSE).  I *think* that'll make the parser synchronously finish its business on its own, without posting any runnables or returning early.

We can't quite do that, though, since SetInterruptible isn't on the nsIParser interface (and it's only called internally, despite being a public nsParser method).  There's probably a not-too-messy way to trigger a call to SetInterruptible(PR_FALSE) for this situation, though.... (might require a special-case in nsParser.cpp)
(In reply to Daniel Holbert [:dholbert] from comment #7)
> If the final chunk of SVG data takes too long to parse, the nsParser
> graciously calls PostContinueEvent(), which posts a runnable that will clear
> that flag and continue parsing at a later time.  While this runnable is
> posted, "nsParser->IsComplete()" returns false.

This is only about SVG in <img>, not about SVG from data:, right? And the reason why the bug has been filed about data: URLs is that it happens that the entire file is in the final chunk in the data: case?

> However, unbeknownst to the parser, SVGDocumentWrapper is going to force it
> to continue immediately.

I really wish the synchronicity assumption of SVGDocumentWrapper goes away before I put the XML parser off the main thread...

> We can't quite do that, though, since SetInterruptible isn't on the
> nsIParser interface

Please edit nsIParser as needed instead of trying to work around nsIParser. For all practical purposes, nsIParser is totally Gecko-internal code. It's so tightly coupled with Gecko internals that it's useless and highly inappropriate for extensions to try to use the interface.
Keywords: hang, regression
(In reply to Henri Sivonen (:hsivonen) from comment #8)
> This is only about SVG in <img>, not about SVG from data:, right?

Yes -- it's about SVG in <img>, and it's just particularly likely to be triggered for data: URLs because we get the data off of the stream pretty much all at once.

> I really wish the synchronicity assumption of SVGDocumentWrapper goes away
> before I put the XML parser off the main thread...

Me too :-/  Per bug 656244 comment 30, that still requires some nontrivial imagelib API refactoring.  I need to make sure there's a bug filed on doing that...

> Please edit nsIParser as needed instead of trying to work around nsIParser.
> For all practical purposes, nsIParser is totally Gecko-internal code.

I'm so glad you said that. :)
Attached patch fix: cancel pending events (obsolete) — Splinter Review
Actually, while looking at nsIParser.h, I realized there's a simpler solution -- we can just call CancelParsingEvents() to clear the event & the flag that blocks us from being marked as completed. (since we're handling its work synchronously anyway)

This is slightly hacky, but I think it's better than modifying the nsIParser API / inner-workings, given that SVG image parsing will hopefully be more async-friendly (and no longer need any nsIParser modifications) in the future.
Attachment #566940 - Flags: review?(hsivonen)
Comment on attachment 566940 [details] [diff] [review]
fix: cancel pending events

(sorry, attached the wrong patch version.  Correct patch coming up.)
Attachment #566940 - Attachment is obsolete: true
Attachment #566940 - Flags: review?(hsivonen)
Attachment #566941 - Flags: review?(hsivonen)
The other upside of this strategy is that it's minimal enough to be considerable for branch inclusion.  It'd be great if we could get this into Aurora / Beta...

Setting tracking flags to get this on the radar for those.
Keywords: testcase
Attached patch crashtestSplinter Review
I was initially going to create a crashtest based on the first attachment here, but it's a ~2MB file, which is kind of a large addition to our source code.

So instead, I created a smaller test (20KB), with 300,000 <g/> elements in an data URI.  This is sufficient to trigger the hang on my machine. (I'm running it through TryServer with & without the fix, to be sure it triggers the hang reliably on other platforms, too.)

(Note: I reduced the test's file-size by defining an XML entity for 1000 <g/> elements, and then I quoted that entity 300 times inside the data URI.  This reduces the file-size, without reducing the amount of SVG data sent to the parser.)
Comment on attachment 566994 [details] [diff] [review]
crashtest

(In reply to Daniel Holbert [:dholbert] from comment #14)
> (I'm running it through TryServer with & without the fix, to be sure it triggers
> the hang reliably on other platforms, too.)

Crashtest appears to successfully test for this on tryserver. Requesting review.
Without the fix, the crashtest is orange from a hang on all platforms so far:
https://tbpl.mozilla.org/?tree=Try&rev=0762a4443dc1

With the fix, crashtest is green on all platforms so far:
https://tbpl.mozilla.org/?tree=Try&rev=a78fde4f676b
Attachment #566994 - Flags: review?(hsivonen)
Attachment #566994 - Flags: review?(hsivonen) → review+
Comment on attachment 566941 [details] [diff] [review]
fix: cancel pending events

>+      parser->CancelParsingEvents();
>       parser->ContinueInterruptedParsing();

Based on the description of the problem, I'd expect these lines to be in the opposite order.

If the test passes with these lines in the opposite order, please change the order of these lines, and r=me.

If swapping the order doesn't actually work, I trust you that this works and r=me anyway, but I'd like to know why the more intuitive order doesn't work.
Attachment #566941 - Flags: review?(hsivonen) → review+
So -- the test doesn't hang if I reverse the lines, but I think my ordering is more correct, actually.

If I reverse the lines as you suggest, then the CancelParsingEvents() call will cancel any events posted by ContinueInterruptedParsing(), and we'll subsequently break out of the loop early (since IsComplete() won't see any pending events), even if we're not done. In fact, we'd only ever run through the loop once -- we'd never loop back across it.  This would potentially leave some parsing unfinished, which is bad.

In contrast, the logic behind my ordering is:
> checkpoint:
> if (we have stuff left to do) {
>    // cancel the pending event to do that stuff...
>    // ... and just do the stuff now instead. (If this doesn't finish, it'll post a new event.)
>    // goto checkpoint

So -- are you OK with me landing this as-is?
SIDE NOTE: When writing the previous comment, I was wondering "Wait, how did the existing code ever work?", and it turns out it _doesn't_ work, except that there's a race condition that saves us.

If our initial call to nsParser::OnStopRequest() (the top line of the patch's contextual code) doesn't finish, it calls PostContinueEvent(), and from that point on we're screwed, because we have a pending event that we never cancel.  (so IsComplete() will be perma-false and will never let us out of the "while" loop).

*However*, in some cases (e.g. the testcase in bug 656244) we get lucky, and at some point while parsing, nsParser::Tokenize calls nsParser::Terminate, which mercifully calls CancelParsingEvents ("to avoid leaking the nsParser") and clears our clogged event.  I say "get lucky", though, because there's some race condition involved with triggering that escape-hatch -- it never happens when I step through the loop in GDB (we just loop forever) -- it only happens only when I let the loop run on its own.  (again, this was from me re-running the testcase from bug 656244, but under finer scrutiny in GDB)
(In reply to Daniel Holbert [:dholbert] from comment #17)
> So -- are you OK with me landing this as-is?

Actually, since I think it's late and near-weekend in your timezone, I'm going to take the "r=me anyway" offered at the end of comment 16:
> If swapping the order doesn't actually work, I trust you that this works and
> r=me anyway, but I'd like to know why the more intuitive order doesn't work.

...and hopefully comment 17 serves as an answer to that last part. :)
Landed on inbound:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0c1cc83120a1
https://hg.mozilla.org/integration/mozilla-inbound/rev/84adc3f8b050
Flags: in-testsuite+
Whiteboard: [inbound]
Target Milestone: --- → mozilla10
do we want to consider this for beta?
Comment on attachment 566941 [details] [diff] [review]
fix: cancel pending events

I think we should consider it for beta, yeah.  (I thought I'd already requested approval, but just realized I hadn't :))

Requesting aurora and beta approval.

Low-risk patch; fixes a regression that makes Firefox hang, and includes tests.
Attachment #566941 - Flags: approval-mozilla-beta?
Attachment #566941 - Flags: approval-mozilla-aurora?
You guys work fast, thanks! :)

Is there any chance of this going into a Firefox 7 point release, or should I assume it'll only land in 8 and above? It's easy enough to blacklist FF 7 from the thumbnailing that's triggering the bug for us on Wikipedia if I can't find a workaround (I've just disabled it outright for SVGs for the moment).

Thanks again -- it's very satisfying to watch the fix progress!
This will not be going into a Firefox 7 point release (because there won't be one).  Firefox 8.0.0 is the earliest release that it could be in.
Thanks, that'll be just fine for us. :)
Yup, what khuey said.

(In reply to Brion Vibber from comment #23)
> Thanks again -- it's very satisfying to watch the fix progress!

Thank you for the bug report (and sorry for the breakage!) :)
https://hg.mozilla.org/mozilla-central/rev/0c1cc83120a1
https://hg.mozilla.org/mozilla-central/rev/84adc3f8b050

Brion: This has now landed on mozilla-central and so will be present in the next nightly (2011-10-16 from http://nightly.mozilla.org). The aurora and beta landings will occur once/if the approval requests in comment 22 are granted. 

Thanks again for the bug report :-)
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Comment on attachment 566941 [details] [diff] [review]
fix: cancel pending events

Approved for aurora and beta
Attachment #566941 - Flags: approval-mozilla-beta?
Attachment #566941 - Flags: approval-mozilla-beta+
Attachment #566941 - Flags: approval-mozilla-aurora?
Attachment #566941 - Flags: approval-mozilla-aurora+
Mozilla/5.0 (Windows NT 6.1; rv:8.0) Gecko/20100101 Firefox/8.0 - beta 5
Mozilla/5.0 (Windows NT 6.1; rv:9.0a2) Gecko/20111025 Firefox/9.0a2 - build id: 20111027042020
Mozilla/5.0 (Windows NT 6.1; rv:10.0a1) Gecko/20111021 Firefox/10.0a1 - build id: 20111028031044

Mozilla/5.0 (Windows NT 5.1; rv:8.0) Gecko/20100101 Firefox/8.0
Mozilla/5.0 (Windows NT 5.1; rv:10.0a1) Gecko/20111027 Firefox/10.0a1
Mozilla/5.0 (Windows NT 5.1; rv:9.0a2) Gecko/20111027 Firefox/9.0a2

Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0a1) Gecko/20111027 Firefox/10.0a1
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:8.0) Gecko/20100101 Firefox/8.0
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:9.0a2) Gecko/20111020 Firefox/9.0a2

Mozilla/5.0 (X11; Linux x86_64; rv:8.0) Gecko/20100101 Firefox/8.0
Mozilla/5.0 (X11; Linux x86_64; rv:10.0a1) Gecko/20111027 Firefox/10.0a1
Mozilla/5.0 (X11; Linux x86_64; rv:9.0a2) Gecko/20111027 Firefox/9.0a2

Verified on Win7 x64 with the above builds.
While SVG image is loading, there is a temporary hang of browser, but it recovers in a few seconds (less than 5).
Status: RESOLVED → VERIFIED
Whiteboard: [verified-aurora] [verified-beta]
(In reply to Daniel Holbert [:dholbert] from comment #9)
> Me too :-/  Per bug 656244 comment 30, that still requires some nontrivial
> imagelib API refactoring.  I need to make sure there's a bug filed on doing
> that...

BTW, I just filed bug 704059 on doing this refactoring, to be sure it's being tracked somewhere. (Not sure if there was already a bug filed on that; if so, mine can be duped to the original.)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: