Open Bug 1702447 Opened 3 years ago Updated 15 days ago

Blob constructor is easy to use incorrectly and results in easy OOM

Categories

(Core :: DOM: File, defect, P3)

People

(Reporter: padenot, Unassigned)

Attachments

(1 file)

222 bytes, application/octet-stream
Attached file testcase.html

STR: Run this (also attached)

<script>
    // 500kB buffer
    var buffer = new ArrayBuffer(500 * 1024);
    console.time("a")
    var a = URL.createObjectURL(new Blob(new Uint8Array(buffer), ["audio/mpeg"]));
    console.timeEnd("a")
</script>

Expected:

  • Some memory is allocated, it's fast

Actual:

  • This allocates and maps about 300 MB of memory and takes about 1 s to run on my really fast Linux box. On a file/buffer larger than a few megabytes it OOMs, and this machine has 64 GB of memory.

Profile of the issue:

This is also problematic in Chrome fwiw.

Flags: needinfo?(bugs)

The bulk of the time spent seems to be in SerializeInputStreamAsPipeInternal.

This seems to work correctly in Safari.

Component: DOM: Core & HTML → DOM: File

Do you have a profile that includes the parent process too? IPC profiling might also be useful.

Flags: needinfo?(bugs) → needinfo?(amarchesini)
Flags: needinfo?(bugs)

I discussed this a bit on #dom on matrix, but this is actually being caused by a footgun. The Blob constructor takes an array of blob segments as a first argument, whereas this example passes in a Uint8Array. WebIDL converts each element of the array individually into a string to be added to the stream, so the behaviour actually ends up being:

["0", "0", "0", ...500k times, "0"]

The cost in this situation therefore comes from processing all 500k stream segments independently. WebKit appears to eagerly concatenate sequences of strings into the same blob part at construction time, so it ends up storing a single sequence of 500k "0"s, whereas we convert each single-byte string into a stream independently. The heavily inflated overhead then comes simply from each stream being relatively expensive compared to a single-byte string.
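To make the difference concrete, here is a minimal sketch contrasting the call from the test case with what was presumably intended. It uses a two-byte array with non-zero values (an assumption chosen for illustration, not taken from the test case) so that the stringification shows up in the resulting blob size.

```javascript
// Two bytes with the value 255, so stringified elements differ in size
// from the raw bytes.
const bytes = new Uint8Array([255, 255]);

// Incorrect: the typed array itself is read as the sequence of parts,
// so each element becomes a string part: ["255", "255"].
const wrong = new Blob(bytes);

// Presumably intended: wrap the typed array in an array so it is a single
// part, and pass the MIME type through the options dictionary rather than
// as an array.
const right = new Blob([bytes], { type: "audio/mpeg" });

console.log(wrong.size); // 6 (the characters "255255")
console.log(right.size); // 2 (the two raw bytes)
console.log(right.type); // "audio/mpeg"
```

With the 500 kB buffer from the test case, the incorrect form yields one single-character part per byte, which is the per-segment overhead described above.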

It might be worth producing a warning of some kind if a Uint8Array is passed directly to new Blob as the first argument, depending on how difficult that is to do. We could also consider concatenating adjacent string/ArrayBuffer segments together, as WebKit does, to reduce overhead for very large blob segment sequences.
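Both ideas can be approximated in page script as a rough sketch. The helpers `coalesceParts` and `makeBlob` below are hypothetical names, not platform APIs, and this is not how Gecko or WebKit implement anything. The sketch only merges adjacent strings and leaves buffer parts untouched.

```javascript
// Hypothetical helper: normalizes the parts argument before it reaches
// the Blob constructor.
function coalesceParts(parts) {
  // A bare ArrayBuffer or typed array is almost certainly meant to be a
  // single part, so warn and wrap it instead of iterating its elements.
  if (parts instanceof ArrayBuffer || ArrayBuffer.isView(parts)) {
    console.warn("Blob parts should be an array; wrapping the buffer as one part.");
    return [parts];
  }
  const merged = [];
  for (const part of parts) {
    const last = merged.length - 1;
    if (typeof part === "string" && typeof merged[last] === "string") {
      merged[last] += part; // extend the current run of strings
    } else {
      merged.push(part); // buffers, Blobs, or the first string of a run
    }
  }
  return merged;
}

// Hypothetical wrapper around the real Blob constructor.
function makeBlob(parts, options) {
  return new Blob(coalesceParts(parts), options);
}
```

For example, `makeBlob(new Uint8Array(buffer), { type: "audio/mpeg" })` would log a warning and produce a blob with one part holding the raw bytes, rather than one string part per byte.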

Summary: URL.createObjectURL is inefficient → URL.createObjectURL is easy to use incorrectly and results in easy OOM

Given comment 6, this bug is not related to URL.createObjectURL, only to the Blob constructor.

Severity: -- → S4
Priority: -- → P3
Summary: URL.createObjectURL is easy to use incorrectly and results in easy OOM → Blob.constructor is easy to use incorrectly and results in easy OOM
Summary: Blob.constructor is easy to use incorrectly and results in easy OOM → Blob constructor is easy to use incorrectly and results in easy OOM

The severity field for this bug is set to S4. However, the following bug duplicate has higher severity:

:hsingh, could you consider increasing the severity of this bug to S3?

For more information, please visit auto_nag documentation.

Flags: needinfo?(hsingh)

I think the real-world impact is still low enough.

Flags: needinfo?(hsingh)
Flags: needinfo?(smaug)
Flags: needinfo?(amarchesini)