Support synchronous reading at an offset in IOUtils, workers only
Categories
(Toolkit Graveyard :: OS.File, enhancement)
Tracking
(firefox96 fixed)
Tracking | Status | |
---|---|---|
firefox96 | --- | fixed |
People
(Reporter: mstange, Assigned: mstange)
References
Details
Attachments
(1 file)
When capturing profiles for local Firefox builds, we use symbolication-worker.js
to dump symbols from the build.
This worker currently uses OS.File for file access.
It needs to read parts of a file into an existing Uint8Array, synchronously.
This synchronous requirement cannot be changed; the library we use for symbolication only supports synchronous reads. This code is not running on the main thread, so there are no performance concerns.
File opening can be asynchronous, but after the asynchronous "setup" is done, all reads need to happen synchronously.
Furthermore, it really needs to do partial reads. We cannot just read the entire file upfront and then synchronously copy parts of it out, because one of the files we're dealing with here is libxul, which is frequently around 2GB big (depending on the platform).
The current OS.File solution is rather inefficient: https://share.firefox.dev/2W93lnB
- Starting the OS.File worker is slow because of the JS module imports, which ping-pong to the main thread. Switching to IOUtils would solve this.
- For every read, OS.File creates a new buffer. This causes two problems:
- It requires a copy from that buffer into the existing Uint8Array (here: wasm memory).
- It creates garbage which then needs to be collected.
- It uses ctypes. The profile above spends 8.7% of the
readBytesInto
call inWorkerPrivate::SetGCTimerMode
which gets called when entering and exiting ctypes calls.
An efficient solution would have the following properties:
- All reads should use the same file handle, so that the file doesn't need to be opened for every read.
- There should be minimal copies.
- The API should support reading into an existing destination buffer.
It would be nice to add the necessary APIs to IOUtils so that the OS.File dependency can be dropped from symbolication.
Assignee | ||
Comment 1•3 years ago
•
|
||
I did a brief experiment with mmap and it was very effective: I added a method IOUtils::ReadMmap
which mapped the file into memory (read-only), and then wrapped a Uint8Array
around the mmapped memory. Then symbolication-worker.js
could copy pieces of that Uint8Array
into wasm memory. This did the minimal number of copies, and it only read the parts of file for which the array elements were accessed, simply through the magic of mmap.
However, this solution had three drawbacks:
- Closing the file was tied to GC timing - the file was closed when the
Uint8Array
was GC'ed. - Having JavaScript array access cause synchronous IO is a bit weird.
- Writing to an array element caused an instant crash! The pages were mapped as read-only, so attempting to write to them is a segfault.
Assignee | ||
Comment 2•3 years ago
•
|
||
I'd like to propose the following API:
[Exposed=Worker]
partial namespace IOUtils {
Promise<SyncReadFile> openFileForSyncReading(DOMString path);
};
[ChromeOnly, Exposed=Worker]
interface SyncReadFile {
// The file size, in bytes.
readonly attribute unsigned long size;
// Synchronously read `dest.length` bytes at offset `offset` into `dest`.
// Throws if the file has been closed already or if the read would be out-of-bounds.
[Throws]
void readBytesInto(Uint8Array dest, unsigned long offset);
// Close the file. Subsequent calls to readBytesInto will throw.
void close();
};
(Edit: in the attached patch, I made openFileForSyncReading
synchronous, rather than returning a promise. And I changed "unsigned long" to "long long" because nsFileStream uses int64_t for the file size.)
Assignee | ||
Comment 3•3 years ago
|
||
How do people feel about the proposed API?
Assignee | ||
Updated•3 years ago
|
Comment 4•3 years ago
|
||
I'm going to let Emma chime in here. :-)
The only thing I did notice from a very brief look, shouldn't the IOUtils
partial interface thingummy also be ChromeOnly
?
Assignee | ||
Comment 5•3 years ago
|
||
(In reply to :Gijs (he/him) from comment #4)
The only thing I did notice from a very brief look, shouldn't the
IOUtils
partial interface thingummy also beChromeOnly
?
I'm not sure - I was modelling this after other Chrome-only partial interface
declarations which also don't have ChromeOnly
, such as this one for ChromeUtils
and this one for IOUtils
.
Comment 6•3 years ago
|
||
Seems reasonable to me, and nice to see it was effective.
I CC'ed Barret who was responsible for a lot of the IOUtils implementation
Assignee | ||
Comment 7•3 years ago
|
||
This exposes synchronous file reading to workers. It's intended to be used
by profiler symbolication.
The API only supports reading into an existing Uint8Array. This avoids
creating garbage, and minimizes copies.
It also keeps the file open so that it doesn't need to be reopened for
each read.
The implementation uses nsFileStream.
I first tried an implementation which used mmap + memcpy rather than read,
but it didn't work for files larger than 2GiB due to limitations in NSPR's
mmap support, and the profiler needs to read >2GiB files.
Specifically, the profiler sometimes needs to read
/System/Library/dyld/dyld_shared_cache_arm64e , which is 2.2GiB big on
macOS 11.5.
Updated•3 years ago
|
Assignee | ||
Comment 8•3 years ago
|
||
Emma, let me know if you'd like to review the patch, too. I requested review from kmag because of the new WebIDL interface. I cargo-culted the nsWrapperCache and cycle collection stuff from GamepadHapticActuator.
Pushed by mstange@themasta.com: https://hg.mozilla.org/integration/autoland/rev/2778256ad2bb Add IOUtils::OpenFileForSyncReading. r=dom-worker-reviewers,asuth
Comment 10•3 years ago
|
||
bugherder |
Updated•11 months ago
|
Description
•