Closed Bug 1529551 Opened 5 years ago Closed 5 years ago

omni.ja files are read using main thread I/O for every content process

Categories

(Core :: Networking: JAR, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla67
Performance Impact high
Tracking Status
firefox67 --- fixed

People

(Reporter: florian, Assigned: glandium)

References

Details

(Keywords: perf:pageload, Whiteboard: [fxperf][necko-triaged])

Attachments

(1 file)

https://perfht.ml/2DY0rVz shows that we read the omni.ja file (actually, 2 omni.ja files due to bug 1376994) for every content process (the profile shows it in the parent process, at the beginning of the track of "web content" process, and at the beginning of the track of the "privileged content" process).

If I understand correctly, the omni.ja file is memory mapped when we first open it, so that it can be easily accessed later. If we could share this memory read-only between processes, we would only need to open omni.ja in the parent process, which would help with content process startup time (as the omni.ja main thread I/O is a significant part of it) and reduce per-content process memory overhead. Eric, does this seem doable?

[fxperf] in the whiteboard because this delays showing about:home, and [qf] because slow content process startup is in the way of page load.

Flags: needinfo?(erahm)
Whiteboard: [fxperf][qf] → [fxperf][qf:p1:pageload]

The reads come from readahead, which will be addressed in bug 1529558. That won't solve the problem on Windows 7, but content processes don't need to readahead, since the parent process has already done that. So a simple fix is to readahead only in the parent process.

Assignee: nobody → mh+mozilla
Component: General → Networking: JAR
Flags: needinfo?(erahm)

We already share the mapping for omni.ja across all processes. I suppose we could try to share the indexes too. I'm not sure how much memory or computational overhead that would save, but given the number of files in those archives, probably a lot. But someone should do some measurements before anyone spends time working on it.

That said, I'm not sure what we're hoping to achieve with the ReadAhead call. We already map those files with aggressive readahead flags, so the kernel should already do this. Even then, the madvise() we do on unix seems harmless enough, but I can't imagine how the synchronous read operation on Windows is meant to help anything...

As a side note, it looks like most of the cost of this operation in the content processes is from the IO interposer, at about 20ms per ReadAhead call, which seems pretty extravagant. The IO itself should be more or less free as long as the headers are still in the page cache from the last time the parent process or another content process parsed them.

We're not readahead'ing the headers, we're readahead'ing zip content. Which there is enough of that other IO can interpose if you don't readahead, which on spinning disks make things slow.

(In reply to Mike Hommey [:glandium] from comment #5)

We're not readahead'ing the headers, we're readahead'ing zip content. Which there is enough of that other IO can interpose if you don't readahead, which on spinning disks make things slow.

On systems with spinning disks, I would be surprised if there isn't enough memory pressure that we'd wind up re-reading things, anyway.

But I'm still not convinced that a sync read-ahead on the main thread makes sense. Like I said, we already enable aggressive read-ahead at least for the ordered segment of omni.ja, and probably for the whole thing as far as I recall, so if our assumptions that we're going to read the whole ordered segment holds, the kernel should already handle this for us without blocking the main thread.

If that's not enough, I could see the argument for doing a read-ahead on a background thread where we can't just do an madvise, but doing a sync read-ahead on the main thread just seems like a bit of a dodgy gamble.

(All of this, of course, only applies to the parent process. The content process uses a completely different set of files during startup than the parent process, so our ordered read assumptions clearly do not hold there at all.)

(In reply to Kris Maglione [:kmag] from comment #6)

On systems with spinning disks, I would be surprised if there isn't enough memory pressure that we'd wind up re-reading things, anyway.

The interesting question, though is... how much do content processes actually use from omni.ja that is not going to be in the page cache. I'm not even sure they use enough from omni.ja for readahead to be useful, even if there's nothing in the page cache.

If that's not enough, I could see the argument for doing a read-ahead on a background thread where we can't just do an madvise, but doing a sync read-ahead on the main thread just seems like a bit of a dodgy gamble.

Which is why there's bug 1529558. That still leaves the main thread I/O on Windows 7, and the effect of doing it vs. not doing it vs. doing it in a background thread should be checked on actual slow Windows 7 machines (in another bug).

(In reply to Kris Maglione [:kmag] from comment #3)

That said, I'm not sure what we're hoping to achieve with the ReadAhead call. We already map those files with aggressive readahead flags, so the kernel should already do this. Even then, the madvise() we do on unix seems harmless enough, but I can't imagine how the synchronous read operation on Windows is meant to help anything...

Way back when this was implemented, there was data to back up the decision (this was before my time, but I became familiar with what was done here due to my work on omnijar perf and as the only libjar peer left standing). I expect any decision to reverse the use of ReadAhead to also be backed by data.

P1 given Mike is already working on this

Priority: -- → P1
Whiteboard: [fxperf][qf:p1:pageload] → [fxperf][qf:p1:pageload][necko-triaged]
Pushed by aklotz@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2e92db777acf
Only readahead jars in the parent process. r=aklotz
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla67
No longer blocks: memshrink-content
Depends on: 1530818
No longer depends on: 1530818
Blocks: 1529894
Performance Impact: --- → P1
Keywords: perf:pageload
Whiteboard: [fxperf][qf:p1:pageload][necko-triaged] → [fxperf][necko-triaged]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: