Open Bug 1864374 Opened 1 year ago Updated 7 months ago

Opening an SVG with certain large numbers in attributes crashes/freezes the entire PC

Categories

(Core :: Graphics: WebRender, defect)

Firefox 119
defect

Tracking


REOPENED
122 Branch
Tracking Status
firefox122 --- affected

People

(Reporter: marcelo, Assigned: jfkthame, NeedInfo)

References

(Blocks 1 open bug)

Details

(Keywords: regressionwindow-wanted)

Attachments

(3 files)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/119.0

Steps to reproduce:

Open an SVG file like the one attached in Firefox.

Actual results:

Opening this file crashes/freezes my entire PC (even the cursor stops moving), forcing me to do a hard restart.

Expected results:

The file should open (or fail to render) without freezing the computer.

And consider that with an SVG crafted this way, a malicious website could crash the victim's computer (not just the browser). That shouldn't be possible.

I'll leave it to the experts to investigate, but Firefox is an application, and it cannot take down an entire PC without something going wrong at the OS or hardware level. I'd recommend starting by checking for video driver updates and Windows updates, and running a stress test on the hardware. I had a similar situation on my PC for a while: strange reboots and weird behavior for almost a year, until I realized it wasn't the apps or the OS, but my video card malfunctioning when used more than usual.

Group: firefox-core-security
Component: Untriaged → SVG
Product: Firefox → Core

I agree that it's a possibility and I'll investigate my PC, but it's really the only situation I've seen this happen to date.

In any case, the browser's CPU usage goes to 100% before the PC crashes, so the case is worth analyzing.

Thanks in advance.

We should check whether this is a regression, e.g. from WebRender.

The 2020-06-01 Nightly without WebRender seems to deal with it better.

The crash itself seems to be caused by an OOM:

[unhandlable oom] Failed to mmap, likely no more mappings available /home/dshin/mozilla-unified/memory/build/mozjemalloc.cpp : 1723
QA Whiteboard: [qa-regression-triage]

A massive blob image, perhaps? Nical, do you know why this SVG triggers such bad resource usage?

Flags: needinfo?(nical.bugzilla)

Given the complete crash, and that other browsers don't exhibit it, S2 seems suitable.
On the other hand (and perhaps an argument for S3), the SVG does contain some ridiculously huge numbers: the height is 9223372036854774784px, and the paths' y coordinates are also huge.

Severity: -- → S2
Status: UNCONFIRMED → NEW
Ever confirmed: true

Additional information:
Numbers like 9223372036854774784px in this SVG are very close to the upper limit of a 64-bit signed integer, which in languages such as C++ and PHP ranges from -9223372036854775808 to 9223372036854775807.
I discovered this bug by chance in a unit test.
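For what it's worth, 9223372036854774784 is not an arbitrary near-maximum value: it is 2^63 - 1024, the largest f64 that is still strictly below 2^63, i.e. plausibly the result of a maximal 64-bit value being clamped through double-precision parsing somewhere. A small Rust sketch of the arithmetic (purely illustrative, not Firefox code):

```rust
fn main() {
    // 2^63 - 1024: near 2^63, the spacing between consecutive f64 values
    // is 2^(63-52) = 1024, so this is the largest double below 2^63.
    let h: f64 = 9223372036854774784.0;
    assert_eq!(h as u64, (1u64 << 63) - 1024);

    // i64::MAX itself (2^63 - 1) is not representable as an f64; it
    // rounds up to 2^63 when converted.
    assert_eq!(i64::MAX as f64, (1u64 << 63) as f64);

    // So an attribute value of 9223372036854774784 is effectively
    // "the largest f64 that still fits in a signed 64-bit integer".
    println!("largest f64 below 2^63: {}", h as u64);
}
```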

FWIW, neither Chrome nor Safari appears to really deal with the huge height of the SVG; if I open it in Chrome and look at it in the inspector, it claims a height of 16777200px (approx 2^24), and in Safari it's 33554428px (approx 2^25).

When I try to load this SVG, we end up here with a requests vector of over 500000 jobs for the individual 256px tiles we're attempting to render. Clearly that's more than we can cope with.
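To illustrate the scale (hypothetical numbers, not WebRender's actual code): with fixed-size 256px tiles, even a blob of modest width whose rasterized height runs into the tens of millions of pixels needs hundreds of thousands of tiles:

```rust
/// Number of fixed-size tiles needed to cover a blob, rounding up on
/// each axis. Purely illustrative sketch; not WebRender's code.
fn tile_count(width_px: u64, height_px: u64, tile_px: u64) -> u64 {
    let tiles_x = (width_px + tile_px - 1) / tile_px; // ceil(width / tile)
    let tiles_y = (height_px + tile_px - 1) / tile_px; // ceil(height / tile)
    tiles_x * tiles_y
}

fn main() {
    // A 1024px-wide blob, ~33.5 million px tall (roughly what Safari
    // clamps the testcase's height to), with the default 256px tiles:
    let n = tile_count(1024, 33_554_432, 256);
    assert_eq!(n, 524_288); // over half a million tiles
}
```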

(Truncating the requests vector here by doing something like

         let requests: Vec<Job> = requests
             .iter()
+            .take(16384) // Don't try to handle too many tiles at once
             .map(|params| {

means we no longer hang/crash, though of course it means painting would be incomplete.)

It seems like we should either

a) give up at some much earlier point, so that we're not trying to render with WebRender at all (assuming there's a fallback path we can use),

or

b) scale down the request so that it requires fewer tiles, and then scale up the result. I guess we'd have a low-resolution output, but it would be complete.

Both of these seem like ideas worth exploring.
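Option (b) amounts to choosing a uniform scale factor that keeps the rasterized size within some budget, then letting the compositor stretch the low-resolution result back up. A rough sketch; the 16384px cap is an arbitrary assumption for illustration, not a real Gecko limit:

```rust
/// Pick a uniform downscale factor so that neither rasterized dimension
/// exceeds max_px. Illustrative only; the cap is an assumption.
fn downscale_factor(width_px: f64, height_px: f64, max_px: f64) -> f64 {
    let largest = width_px.max(height_px);
    if largest <= max_px {
        1.0 // already small enough: rasterize at full resolution
    } else {
        max_px / largest
    }
}

fn main() {
    let f = downscale_factor(1024.0, 33_554_432.0, 16_384.0);
    assert_eq!(f, 1.0 / 2048.0);
    // Rasterize at 0.5 x 16384 px, then scale up by 2048x when drawing:
    // the output is complete, but at very low resolution.
    assert_eq!(33_554_432.0 * f, 16_384.0);
}
```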

Another possible mitigation I wondered about would be to increase the tile size being used when the blob image is very large, so that the total number of tiles doesn't grow so huge. Experimentally, I found that if I allow the tiles to be increased to 2K pixels each (instead of the 256px being used by default), the testcase no longer completely kills my browser.
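This tile-size mitigation can be sketched as doubling the tile size until the tile count falls under a budget, up to some cap. The constants here are illustrative assumptions, not the values in the actual patch:

```rust
/// Grow the tile size, up to a cap, until the blob needs an acceptable
/// number of tiles. Illustrative sketch; constants are assumptions.
fn choose_tile_size(width_px: u64, height_px: u64) -> u64 {
    const DEFAULT_TILE: u64 = 256;
    const MAX_TILE: u64 = 2048;
    const TILE_BUDGET: u64 = 16_384;

    let mut tile = DEFAULT_TILE;
    loop {
        let tiles =
            ((width_px + tile - 1) / tile) * ((height_px + tile - 1) / tile);
        if tiles <= TILE_BUDGET || tile >= MAX_TILE {
            return tile;
        }
        tile *= 2;
    }
}

fn main() {
    assert_eq!(choose_tile_size(1024, 1024), 256); // small blob: default
    assert_eq!(choose_tile_size(1024, 33_554_432), 2048); // huge blob: cap
}
```

Note that once the cap is reached, the tile count can still blow past the budget for sufficiently vast dimensions, which matches the later observation that this mitigation only goes so far.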

Tentatively reclassifying to WebRender since it seems like the cleanest fix is over there, but feel free to further-reclassify if this is unfixable on that side for some reason.

Severity: S2 → --
Component: SVG → Graphics: WebRender

On my MBPro, at least, this makes us no longer completely hang/crash on the testcase;
it still hangs for a little while but recovers ok, and the tab can be closed or the
browser quit normally.

(The SVG still doesn't render successfully, presumably because of coordinate
value overflows all over the place, but that's the case in other browsers as well;
neither Chrome nor Safari seems to be able to render it.)

Here's my strawman patch that attempts to mitigate this by increasing tile sizes; in my local build, at least, this means the testcase no longer brings the browser completely to its knees (we do still have a significant hang, but recover and can cleanly close the tab, etc). I'll leave it for the gfx/wr folk to decide if something like this is worthwhile.

Doing something (per comment 10) higher up in the chain to prevent us ever attempting such a massive blob image may be a better solution, though.

(In reply to Jonathan Kew [:jfkthame] from comment #13)

> Created attachment 9364534 [details]
> Bug 1864374 - Try to mitigate cases where blob image rasterization results in a huge number of tiles, by increasing the tile size.
>
> On my MBPro, at least, this makes us no longer completely hang/crash on the testcase;
> it still hangs for a little while but recovers ok, and the tab can be closed or the
> browser quit normally.
>
> (The SVG still doesn't render successfully, presumably because of coordinate
> value overflows all over the place, but that's the case in other browsers as well;
> neither Chrome nor Safari seems to be able to render it.)

Don't worry about the rendering itself. With coordinates like these, nothing will appear anyway. The only issue is the crash.

Severity: -- → S2

Glenn, WDYT about doing something like attachment 9364534 [details] to at least somewhat mitigate this, pending any more fundamental fixes elsewhere?

Flags: needinfo?(gwatson)
Assignee: nobody → jfkthame
Attachment #9364534 - Attachment description: Bug 1864374 - Try to mitigate cases where blob image rasterization results in a huge number of tiles, by increasing the tile size. → Bug 1864374 - Try to mitigate cases where blob image rasterization results in a huge number of tiles, by increasing the tile size. r=#gfx-reviewers
Status: NEW → ASSIGNED

Ideally we should be clipping the SVG to the display port, but for some reason we fail to do that; or maybe that's something we only attempt when the SVG is inline. I think that clipping is the more fundamental fix, and that it should happen somewhere in WebRenderCommandsBuilder.cpp or the code that calls into it. We could also set a hard limit on the size of an SVG image and clip out content beyond that; as was said here, with this kind of content, rendering strictly correctly is not a necessity.

I think that Jonathan's mitigation is fine, although I suspect we'll still be likely to OOM somewhere else.
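The display-port clipping idea is essentially a rectangle intersection performed before any rasterization request is issued: whatever falls outside the visible region never generates tiles at all. A minimal sketch; the types and names are illustrative, not Gecko's:

```rust
/// An axis-aligned rectangle; illustrative stand-in for Gecko's rect types.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rect { x: i64, y: i64, w: i64, h: i64 }

/// Intersect the (possibly enormous) content bounds with the display port,
/// returning None when nothing is visible.
fn clip_to_display_port(content: Rect, display_port: Rect) -> Option<Rect> {
    let x0 = content.x.max(display_port.x);
    let y0 = content.y.max(display_port.y);
    let x1 = (content.x + content.w).min(display_port.x + display_port.w);
    let y1 = (content.y + content.h).min(display_port.y + display_port.h);
    if x1 > x0 && y1 > y0 {
        Some(Rect { x: x0, y: y0, w: x1 - x0, h: y1 - y0 })
    } else {
        None
    }
}

fn main() {
    // An SVG over a trillion pixels tall, but a ~4096px-tall display port:
    let svg = Rect { x: 0, y: 0, w: 1024, h: 1 << 40 };
    let port = Rect { x: 0, y: 0, w: 1920, h: 4096 };
    // Only the visible 1024 x 4096 slice would need rasterizing.
    assert_eq!(
        clip_to_display_port(svg, port),
        Some(Rect { x: 0, y: 0, w: 1024, h: 4096 })
    );
}
```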

Flags: needinfo?(nical.bugzilla)

Tim, would you have cycles to look into this?

Flags: needinfo?(tnikkel)
See Also: → 1812136

I'm fine with the mitigation you've proposed, it certainly seems better than OOM.

Having said that, I don't know a huge amount about the blob code in Gecko, and ideally we'd fix it before it gets to WR there - which is probably more of a Nical or JeffM question.

Flags: needinfo?(gwatson)

I agree, we should ideally fix this earlier so that we never make such a huge request to WR. This just aims to make things a bit more resilient in case an unreasonably large request does get through (whether from svg or any other source).

Pushed by jkew@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/b4a067f5b89d Try to mitigate cases where blob image rasterization results in a huge number of tiles, by increasing the tile size. r=gfx-reviewers,nical
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 122 Branch

(In reply to Nicolas Silva [:nical] from comment #18)

> Tim, would you have cycles to look into this?

Not at the moment, but hopefully in the future.

Bug 1849774 also had a very large SVG; I wanted to investigate why we weren't clipping what we were drawing to something closer to what is actually visible.

See Also: → 1849774

(In reply to Serban Stanca [:SerbanS] from comment #22)

> https://hg.mozilla.org/mozilla-central/rev/b4a067f5b89d

👏👏👏

Let's keep this open. The mitigation that landed makes it a bit less likely to crash but the bulk of the issue is still there.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

I was able to reproduce the issue on Win10 x64 using Firefox 119.0: opening the link from the description in a second tab, I get a white page and CPU usage goes to ~100%. In the latest Nightly 122.0a1 (2023-12-13) the issue still reproduces; instead of a white browser page I get an OS black screen, and in about:crashes I see a lot of unsubmitted logs (crash ID: d99cd305-4053-4be9-a0c6-e10c30231213).
I could not find a regression range using mozregression; the issue still reproduces back to 85.0a1.

Regressions: 1870417

As suggested, I updated the video card driver, but the PC (not just Firefox) keeps crashing when opening this SVG.

The stress test went normally as well.

It must be something in my Windows. I'll update it too and check the settings. Anyway, Firefox itself has crashed on every PC I've tested this on.

After the patch that landed here, is the remaining badness here still S2 level?

(In reply to Marcelo Otowicz from comment #27)

> It must be something in my Windows. I'll update it too and check the settings. Anyway, Firefox itself has crashed on every PC I've tested this on.

If you haven't already, would you mind testing Firefox Nightly to confirm that the issue is fixed there?

(In reply to Timothy Nikkel (:tnikkel) from comment #28)

> After the patch that landed here, is the remaining badness here still S2 level?

Probably not, if it mitigates the crash as far as we know. Though really, maybe we should just close this as an S2, and spin off a followup lower-severity bug to track the remaining work that we'd like to do. It's always a bit messy tracking-wise to have multiple behavior-impacting patches land on a single bug across multiple months/releases; hard to figure out which release is affected and to what extent.

If the dimensions are sufficiently vast, it's definitely still possible to completely swamp the renderer with thousands of tiles, even after enlarging the tile size; the example in bug 1870417 does that. (And trying to increase the tile size further doesn't really help; we're liable to just end up with an OOM crash.)

So while it doesn't seem like any real/useful content would run into the issue, it's still a potential DoS vector. Not sure whether that justifies an S2 rating, though.

Generally I don't think a DoS rises to the S2 level. If real content hit the problem (content legitimately trying to do something real and applicable on the web, not specifically aiming to cripple a browser), then it could rise to S2.

I'm going to downgrade this to S3 based on the last few comments. If we feel this is still S2 I'm happy for us to reconsider.

Severity: S2 → S3

I tested with Firefox Nightly today, as requested, and the problem continues to occur.

Well, I didn't understand the controversy about being S2 or S3 (I don't know what defines each level and how important this label is), but I will speak as a common user:

If it were a script or a giant HTML page crashing the browser (and, in some cases, the PC), I wouldn't consider it that serious, since most web applications prevent users from editing the HTML of a post directly or sending gigantic text. But this is a 23 KB SVG image that can easily be sent via chat or a helpdesk, embedded in a forum, an email, or a blog comment. It could even force technical support companies to abandon Firefox for another browser in order to assist customers, if they start receiving such SVGs as spam. Likewise, sites that allow public image uploads would need to start blocking SVGs with these characteristics to avoid losing Firefox users.

Whether it's S2 or S3 I can't say. But it is a serious bug.

Well, the PoC image in the attachment seems quite verbose; interestingly, on my machine, all it takes to "take down the browser tab" (or more precisely, freeze its process in an infinite loop of RAM allocation and freeing, rendering Firefox unresponsive for most of that time) in today's Nightly and stable is an MRE of under 120 bytes:

data:image/svg+xml,<svg xmlns="http://www.w3.org/2000/svg" viewBox="-1e1,-1e1,2e1,2e1"><circle r="1e1"/></svg>

or

data:text/html,<svg viewBox="-1e1 -1e1 2e1 2e1"><circle r="1e1"/></svg>

with all e1s changed to e10 in the SVG sample and e11 in the HTML sample.

I guess the exponent could be bumped even higher for more powerful machines. I tried this on a machine with 15 GiB of RAM, and all of it was allocated almost instantly upon first load. I am not a security expert, but I guess that having a JS-less way to exhaust RAM quickly could contribute to some nasty attack vector. When left unattended, it eventually led to a crash with the signature from bug 1849774, which is already linked here.


Tangentially related: I've searched the specs for some stance/advice regarding optimal recommended precision limits for implementers (browsers) and found nothing concrete. Recently I observed that the moment one leaves a narrow "sweet spot" of viewport size (surprisingly large: +/-1e4), each browser starts to fail in its own spectacularly individual way on some occasions. For those interested: dash arrays in humongous or super-tiny viewBoxes are often blatantly off, and often produce an even more surprising mismatch between what is rendered and what is interactive (testing playground: https://myfonj.github.io/tst/SVG-decimal-precision.html ; it should not crash your browser without warning).

Shouldn't the standards body give at least some baseline guidelines for this, and ideally set limits beyond which browsers would be "free to fail" or could refuse to even try?

This situation with disagreeing implementations (and even crashes) seems super unfortunate to me.

I don't know how much RAM is needed to display Michal Čaplygin's example without trouble, but 128GB isn't enough :)

