Closed Bug 1353762 Opened 8 years ago Closed 7 years ago

Too much sha-512 crashes firefox

Categories

(Core :: DOM: Security, defect, P1)

52 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox-esr45 --- wontfix
firefox52 --- wontfix
firefox-esr52 --- wontfix
firefox53 --- wontfix
firefox54 --- wontfix
firefox55 --- wontfix
firefox56 --- wontfix
firefox57 --- fixed

People

(Reporter: underskatten, Assigned: ttaubert)

References

Details

(Keywords: crash, Whiteboard: [domsecurity-active])

Crash Data

Attachments

(2 files)

User Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.41 Safari/537.36 Steps to reproduce: Have a tab open (tested with a website, about:blank, and the your-tab-crashed-page) Open dev console (F12) (it opens in a new window for me) Paste and run the following code: var limit = 1<<20 var hash_data = new Uint8Array([97, 98, 99]) for (var i = 0; i < limit; i++) { crypto.subtle.digest("sha-512", hash_data) } console.log("done") Actual results: Firefox crashes, sometimes with the your-tab-crashed-page, and sometimes the entire thing with a small window opening for crash-report-submitting. Firefox seems to cope with a higher limit if dev console is open (I haven't tested this thoroughly). Expected results: Firefox shouldn't crash.
Blocks: 1037892
Status: UNCONFIRMED → NEW
Crash Signature: [@ OOM | small ]
Component: Untriaged → DOM: Security
Ever confirmed: true
Flags: needinfo?(ttaubert)
Flags: needinfo?(rlb)
Severity: normal → critical
Keywords: crash
Assignee: nobody → ttaubert
Status: NEW → ASSIGNED
Flags: needinfo?(ttaubert)
Flags: needinfo?(rlb)
Priority: -- → P1
Whiteboard: [domsecurity-active]
Attachment #8863737 - Flags: review?(martin.thomson)
Attachment #8863737 - Flags: review?(dkeeler)
Comment on attachment 8863737 [details] [diff] [review] 0001-Bug-1353762-Check-for-OOM-when-creating-WebCryptoTas.patch Review of attachment 8863737 [details] [diff] [review]: ----------------------------------------------------------------- This won't stop Firefox from aborting when it runs out of memory, but it's probably good from a defensive coding standpoint. r=me with the additional aRv.Throw(...) calls. ::: dom/base/SubtleCrypto.cpp @@ +40,5 @@ > if (aRv.Failed()) { \ > return nullptr; \ > } \ > + RefPtr<WebCryptoTask> task = \ > + WebCryptoTask::Create ## Operation ## Task(__VA_ARGS__); \ This is probably good from a defensive coding standpoint, but from what I can tell, every WebCryptoTask::Create ## Operation ## Task uses operator new. Since we have infallible new, this won't stop the crashes (to be blunt, nothing will - if we oom, we oom). In fact, I don't think there's a bug here (except for not propagating the allocation failure in aRv - see below). The code snippet in comment 0 creates a million concurrent hash promises. If each one takes up 4k, that'll quickly use up the entire 32-bit memory space on x86. @@ +41,5 @@ > return nullptr; \ > } \ > + RefPtr<WebCryptoTask> task = \ > + WebCryptoTask::Create ## Operation ## Task(__VA_ARGS__); \ > + if (!task) { \ I'm pretty sure we need to call aRv.Throw(NS_ERROR_OUT_OF_MEMORY) here so that callers don't go on to dereference this pointer (e.g. here: https://dxr.mozilla.org/mozilla-central/source/obj-x86_64-pc-linux-gnu/dom/bindings/SubtleCryptoBinding.cpp#5357 ). ::: dom/media/webrtc/RTCCertificate.cpp @@ +279,5 @@ > } > RefPtr<WebCryptoTask> task = > new GenerateRTCCertificateTask(global, aGlobal.Context(), > aOptions, usages, expires); > + if (!task) { I imagine we need to aRv.Throw(NS_ERROR_OUT_OF_MEMORY) here too.
Attachment #8863737 - Flags: review?(dkeeler) → review+
Comment on attachment 8863737 [details] [diff] [review] 0001-Bug-1353762-Check-for-OOM-when-creating-WebCryptoTas.patch Review of attachment 8863737 [details] [diff] [review]: ----------------------------------------------------------------- If you throw as :keeler says, then this is fine.
Attachment #8863737 - Flags: review?(martin.thomson) → review+
(In reply to David Keeler [:keeler] (use needinfo?) from comment #3) > In fact, I don't think there's a bug here (except for not propagating the > allocation failure in aRv - see below). The code snippet in comment 0 > creates a million concurrent hash promises. If each one takes up 4k, that'll > quickly use up the entire 32-bit memory space on x86. It works fine in Chrome. Is that because Chrome is 64-bit, and Firefox is 32-bit?
(In reply to underskatten from comment #5) > It works fine in Chrome. Is that because Chrome is 64-bit, and Firefox is > 32-bit? also works fine in Chrome *32-bit*. no hang, no freeze.
Given that the promise can be safely garbage-collected immediately, do we even need to calculate the hashes? Also, we should probably just limit the number of outstanding web-crypto tasks. If we maintain a queue, can we limit the number of tasks we dispatch to it? I'll bet that Chrome is just failing tons of these attempts, but since the output is not measured, it doesn't matter. Even if the return values were stored, we would be justified in failing many of them. That many hashes is only really useful to someone who is farming bitcoin on other people's computers or something equally suspicious.
(In reply to Martin Thomson [:mt:] from comment #7) > I'll bet that Chrome is just failing tons of these attempts, but since the > output is not measured, it doesn't matter. That doesn't seem right. (function() { var limit = 1<<20 // vary last byte for different results var hash_data = new Uint8Array([0, 0, 0, 27]) var hash_pseudo = function(data, x) { crypto.subtle.digest("sha-512", data).then(function(bytes_weird) { var bytes = new Uint8Array(bytes_weird) // 16 zero bits required if (!bytes[0] && !bytes[1]) { console.log("HIT", x) } }) } for (var i = 0; i < limit; i++) { hash_data[0] = i & 255 hash_data[1] = (i >> 8) & 255 hash_data[2] = i >> 16 hash_pseudo(hash_data, i) } })() This more closely resembles what I was trying to do. Running this in Chrome is about 50% slower than the code from comment 0, and it works just fine. It still crashes Firefox, of course. The expected amount of HITs is around (1<<20)/(1<<16) = 16. Running it a few times (replacing the 27) gives results around 16, so Chrome can't be throwing away any significant amount.
Happy to be shown to be wrong. FWIW, that code runs happily in my copy of Firefox. I have a 64-bit machine and tons of RAM, so that's not dispositive of course. I also get a slow script warning, which is unsurprising. With Tim's change, this shouldn't crash the browser directly. However, exhausting memory will likely still happen. Though we make allocation here fallible, chewing up that much memory is almost inevitable. The state that the browser holds at the end of the loop is - at a minimum - the size of the inputs that were provided to it, plus overheads for each. We do crash if we fail to allocate memory at certain points, and when you run close to the limits of the machine, that can happen. I wouldn't be surprised to learn that Firefox has higher overheads here than Chrome. That might account for differences. Change the 20 to a larger number and eventually something has to give.
This is an assigned P1 bug without activity in two weeks. If you intend to continue working on this bug for the current release/iteration/sprint, remove the 'stale-bug' keyword. Otherwise we'll reset the priority of the bug back to '--' on Monday, August 28th.
Keywords: stale-bug
Pushed by ttaubert@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/4d60239251ae Check for OOM when creating WebCryptoTasks r=keeler,mt
Keywords: stale-bug
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Is this worth consideration for backport to 56 or can it ride the 57 train? Also, can we land a test for this?
I don't understand this patch. As comment 3 says, WebCryptoTask::CreateDigestTask never returns null, because all the return statements are of the form |return new something| and operator new will just crash on OOM. So as far as I can tell all this patch did was add some dead code that is never reached, without fixing the bug as originally filed...
Flags: needinfo?(dkeeler)
Like I said in comment 3, when Firefox runs out of memory, it will still crash. There's nothing we can do there in general, although there may be mitigations in this specific area of code like :mt mentioned in comment 7. We could explore those options, but since there will always be ways of making Firefox use more memory than physically available, I'm not convinced it's a high priority. In terms of the patch that was just landed, consider the two sections. For RTCCertificate::GenerateCertificate, since we clearly use infallible operator new, there's no obvious benefit to doing the null check, so that was probably incorrect to add. In the case of the SUBTLECRYPTO_METHOD_BODY macro, however, it's not as immediately clear. While we do currently always use operator new (I just checked), adding the null check will prevent against a future mistake where a `WebCryptoTask::Create ## Operation ## Task` implementation erroneously returns nullptr, for example, so I think that's useful.
Flags: needinfo?(dkeeler)
I agree that if we're really out of memory we will just crash. Do we have a followup bug for reducing the amount of allocation we do here (e.g. by queueing things up, assuming that doesn't need just as much allocation) to avoid the problem? The fact that this works in other UAs but not Firefox means that sites can start writing code like this and having it crash us only would be pretty bad. The code isn't _trying_ to crash anything, and doing something 1e20 times is not totally unreasonable: it's a large number but not completely ridiculous...
While this particular example can crash Firefox under memory-limited circumstances (for instance, it works (slowly) on my dev machine with a ton of ram, but if I `ulimit -v 4194304` or so (which should be 4GB, iiuc), it OOMs rather quickly), it's not the case that this doesn't also affect other browsers (as :mt posited in comment 9). With `ulimit -v 2097152` in Chrome, the code snippet in comment 8 crashes the tab. The issue appears to be that Firefox has much higher resource overheads. We may be able to fix this by queueing the promises created, but even if we do that here, we'll still have similar problems in pretty much any other promise-based API (unless they also rate-limit). I filed bug 1394990 to investigate this.
(In reply to Ryan VanderMeulen [:RyanVM] from comment #13) > Is this worth consideration for backport to 56 or can it ride the 57 train? We should just let it ride. > Also, can we land a test for this? I don't think we can test behavior in OOM situations. And as David said, we'd crash somewhere else and not here, so it would be hard to verify anyway. (In reply to David Keeler [:keeler] (use needinfo?) from comment #15) > In terms of the patch that was just landed, consider the two sections. For > RTCCertificate::GenerateCertificate, since we clearly use infallible > operator new, there's no obvious benefit to doing the null check, so that > was probably incorrect to add. Right... I wasn't actually aware that infallible new is the default. I'll put up a patch to revert that.
Flags: needinfo?(ttaubert)
Comment on attachment 8902654 [details] [diff] [review] 0001-Bug-1353762-Revert-RTCCertificate-changes-and-remove.patch Review of attachment 8902654 [details] [diff] [review]: ----------------------------------------------------------------- LGTM. It occurs to me we might also change the other null-check to set the error NS_ERROR_INVALID_POINTER or something that indicates it's not actually about OOM but a programming mistake if we encounter a nullptr there.
Attachment #8902654 - Flags: review?(dkeeler) → review+
(In reply to David Keeler [:keeler] (use needinfo?) from comment #20) > LGTM. It occurs to me we might also change the other null-check to set the > error NS_ERROR_INVALID_POINTER or something that indicates it's not actually > about OOM but a programming mistake if we encounter a nullptr there. Good idea, I'll use NS_ERROR_NULL_POINTER, we use it in similar places where encounter an argument that we don't expect to be null.
Pushed by ttaubert@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/87c41124f7c1 Revert RTCCertificate changes and remove unnecessary null-check r=keeler
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: