Open Bug 887015 Opened 11 years ago Updated 2 years ago

Huge amounts of CPU time spent in memcpy when using XMLHttpRequest.responseText

Categories

(Core :: DOM: Core & HTML, defect)

24 Branch
x86_64
Windows 7
defect

People

(Reporter: kael, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [games:p?][diamond])

When using XMLHttpRequest to stream large files, repeated .responseText accesses seem to produce a very large number of reallocations and memcpy operations. In my SPS profiles, the memcpy traffic alone accounts for 10-13% of CPU time.

The reallocations also create a lot of GC pressure and pauses.
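
For illustration, the access pattern in question looks roughly like the sketch below. The names `makeIncrementalTextReader` and `onNewText` are hypothetical and not taken from the linked library; the helper accepts any object exposing `responseText`, so it can be exercised outside a browser. It reads the property once per update and forwards only the unseen suffix. That limits the number of accesses but does not remove the cost, since each read still yields a string covering the whole response so far.

```javascript
// Hypothetical helper (not from the linked library): forwards only the part
// of responseText that has not been seen yet.
function makeIncrementalTextReader(onNewText) {
  let consumed = 0; // number of UTF-16 code units already handed out

  return function handleUpdate(xhr) {
    // Read the property once per event and reuse the local reference.
    const text = xhr.responseText;
    if (text.length > consumed) {
      onNewText(text.substring(consumed));
      consumed = text.length;
    }
  };
}

// Browser usage (assumed wiring; parse() is a placeholder):
//   const xhr = new XMLHttpRequest();
//   const handle = makeIncrementalTextReader(chunk => parse(chunk));
//   xhr.onprogress = () => handle(xhr);
```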

Things that would help here:

Avoid reallocating the responseText string's buffer repeatedly (this has other benefits, in particular reducing the consequences of the responseText leak)

Being able to stream an arraybuffer response (last time I checked this doesn't work?)

Being able to request a string or array buffer representing a subset of the response (instead of getting back a 40MB string representing an entire 40MB file)

Being able to adjust the chunk size the browser uses when notifying you about updates via onreadystatechange.


For reference, this is the library I am using to stream TAR files via XHR:
https://github.com/kevingadd/JSIL/blob/tar/Libraries/multifile.js

Let me know if you need a full test case and I can upload one (it's big, though).
Whiteboard: games:p? → [games:p?]
There's a test case up that demonstrates the problem:
http://hildr.luminance.org/bugs/3/Lumberjack/Lumberjack.html?disableSound
Talk to ack -- he has some tar+IndexedDB stuff. Also, yeah, XHR will realloc, though so will arraybuffer. responseText is doubly awful, though, because it's UCS-2; why use responseText and not an arraybuffer? (You can do chunked arraybuffer.)

In the future we also plan to honor content-length and preallocate a buffer if possible.
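
As a rough user-land analogue of the preallocation idea (this is not Gecko's internal code, and `PreallocatedBody` is a hypothetical name), a buffer sized from a known Content-Length can be filled chunk by chunk without ever growing:

```javascript
// Hypothetical user-land analogue of "preallocate from Content-Length".
class PreallocatedBody {
  constructor(contentLength) {
    this.bytes = new Uint8Array(contentLength); // single up-front allocation
    this.length = 0;                            // bytes filled so far
  }

  // Copy one received chunk (a Uint8Array) into the next free region.
  append(chunk) {
    if (this.length + chunk.length > this.bytes.length) {
      throw new RangeError("body is longer than the declared Content-Length");
    }
    this.bytes.set(chunk, this.length);
    this.length += chunk.length;
  }

  // View of the bytes received so far; shares memory with the buffer.
  received() {
    return this.bytes.subarray(0, this.length);
  }
}
```

Each byte is copied once into its final position, instead of being moved again every time the buffer grows.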
> Avoid reallocating the responseText string's buffer repeatedly

Not really doable if JS asks for .responseText repeatedly, because the string handed to JS shares the underlying buffer, which therefore cannot be modified in place...

> Being able to stream an arraybuffer response (last time I checked this doesn't work?)

chunked-arraybuffer should work, though it's Gecko-only last I checked.
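
A minimal sketch of the chunked mode, assuming the Gecko-specific responseType string `"moz-chunked-arraybuffer"` (non-standard and, as far as I know, dropped from later Firefox releases). `streamArrayBufferChunks`, `onChunk` and `onDone` are hypothetical names; the XHR object is passed in so the function can be tried with a stand-in object:

```javascript
// Sketch only: "moz-chunked-arraybuffer" is a non-standard, Gecko-only
// responseType; other engines do not support it.
function streamArrayBufferChunks(xhr, url, onChunk, onDone) {
  xhr.open("GET", url);
  xhr.responseType = "moz-chunked-arraybuffer";

  xhr.onprogress = function () {
    // In this mode, response holds only the bytes received since the
    // previous progress event, and only while the event is being dispatched.
    onChunk(new Uint8Array(xhr.response));
  };
  xhr.onload = function () {
    onDone();
  };
  xhr.send();
}

// Browser usage (assumed; the URL and parser are placeholders):
//   streamArrayBufferChunks(new XMLHttpRequest(), "/data.tar",
//                           bytes => parser.feed(bytes), () => parser.end());
```

Because each event delivers only the new bytes, the page never holds one ever-growing string for the whole download.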

> Being able to request a string or array buffer representing a subset of the response

Does using HTTP Range headers not work? Note that this may not play nice with gzip encoding, though.
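
A hedged sketch of the Range approach, assuming the server supports byte ranges and answers with 206 Partial Content. `byteRange` and `requestByteRange` are hypothetical helper names, and the XHR object is passed in so the code can be exercised with a stand-in:

```javascript
// Build a Range header value for an inclusive byte interval.
function byteRange(first, last) {
  return "bytes=" + first + "-" + last;
}

// Ask for bytes first..last (inclusive) of url as a Uint8Array.
function requestByteRange(xhr, url, first, last, onBytes, onError) {
  xhr.open("GET", url);
  xhr.responseType = "arraybuffer";
  // Request headers must be set after open() and before send().
  xhr.setRequestHeader("Range", byteRange(first, last));
  xhr.onload = function () {
    if (xhr.status === 206) {
      onBytes(new Uint8Array(xhr.response));
    } else {
      // For example 200: the server ignored Range and sent the whole body.
      onError(xhr.status);
    }
  };
  xhr.send();
}
```

The caller then only ever holds the requested slice, rather than a string or buffer for the entire file.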

> In the future we also plan to honor content-length and preallocate a buffer if possible.

That won't help if the web page polls .responseText on progress, unless the JS GC is very expeditious about dropping those strings; see above.
Blocks: gecko-games
Whiteboard: [games:p?] → [games:p3]
Whiteboard: [games:p3] → [games:p3][diamond]
Whiteboard: [games:p3][diamond] → [games:p?][diamond]
Component: DOM → DOM: Core & HTML
Severity: normal → S3