Open Bug 1555311 Opened 3 years ago Updated 2 years ago

JAR logging doesn't log anything for child processes

Categories

(Firefox Build System :: General, defect)

defect
Not set
normal

Tracking

(firefox69 affected)

Tracking Status
firefox69 --- affected

People

(Reporter: Gijs, Unassigned)

References

Details

(Keywords: in-triage)

Attachments

(1 file)

https://searchfox.org/mozilla-central/rev/7556a400affa9eb99e522d2d17c40689fa23a729/modules/libjar/nsZipArchive.cpp#338-342

conditions on XRE_IsParentProcess().

We currently only use this for PGO repacking (I think?) and PGO uses non-e10s (which I'm trying to change in bug 1196094). I noticed that child processes don't log anything, and I expect changing that might be tricky. I also don't know what the optimal state here is - should things be interleaved more or less in the order in which the various processes need them, or should the child processes have their own list which we pack separately (and in that case, what do we do with files that get used from both) ?

I'm not sure this is a blocker for switching PGO over, as the only regression I've seen flagged up from bug 1196094 and the unintentional fix to it a while back in bug 1548941 is in raptor-speedometer, which should be related to actual PGO data rather than jar log ordering.

Glandium, looks like you've worked on this recently, any ideas on the relative priority of this issue and/or pitfalls and/or what our goal should be?

Component: Networking: JAR → General
Flags: needinfo?(mh+mozilla)
Product: Core → Firefox Build System

I think this is probably what we want. At this point, I can't think of anything the child process should even need to load from a jar at startup. All of the scripts we load at startup come from the ScriptPreloader now (as long as they're loaded early enough, anyway). If there's anything else we wind up loading in the child process at startup, we should probably just stop doing that.

(In reply to Kris Maglione [:kmag] from comment #2)

I think this is probably what we want. At this point, I can't think of anything the child process should even need to load from a jar at startup. All of the scripts we load at startup come from the ScriptPreloader now (as long as they're loaded early enough, anyway).

I'm a little confused. I had a diff of what this file looks like between turning e10s on/off for our PGO profiling run (I can re-create it if you're interested), and things like framescripts and actor jsms disappear when turning e10s off, presumably because they get loaded in the parent process without e10s (so they appear in the jar log for ordering) but I'm 99% sure still need to be loaded? The scriptpreloader is constructed in the first run after loads from omni.ja, right? So this to me almost sounds like "we don't really care about jar ordering for scripts because after the first run of a profile it shouldn't matter anymore". Or am I missing something?

Flags: needinfo?(kmaglione+bmo)

If we do the jar logging on first run, then, yeah, I'd expect to see frame scripts and JSMs from the child. But doing it on first run seems like the Wrong Thing. That means we're optimizing the read-ahead data segment for the unusual case at the expense of the usual case.

In the usual case, we load most of our startup scripts and XUL files from the ScriptPreloader or the StartupCache, and don't touch the actual unparsed data in omni.ja. Which means we're doing an entirely unnecessary synchronous, main thread read-ahead for loads of data that we don't actually use.

Flags: needinfo?(kmaglione+bmo)

That said, it might make sense to have a separate ordered data segment for files that are accessed at startup only on first run, put it after the normal ordered data section, and pre-read it only when starting without populated caches...

See bug 1529551.

(In reply to Kris Maglione [:kmag] from comment #5)

That said, it might make sense to have a separate ordered data segment for files that are accessed at startup only on first run, put it after the normal ordered data section, and pre-read it only when starting without populated caches...

That's something I've suggested in one of the recent bugs related to jar readahead (possibly in some phabricator comment).

Flags: needinfo?(mh+mozilla)

(In reply to Kris Maglione [:kmag] from comment #4)

If we do the jar logging on first run, then, yeah, I'd expect to see frame scripts and JSMs from the child. But doing it on first run seems like the Wrong Thing. That means we're optimizing the read-ahead data segment for the unusual case at the expense of the usual case.

We do it on the second run, see https://searchfox.org/mozilla-central/rev/89235894292ac4ab4bd7dac44bc60c2e051905f4/build/pgo/profileserver.py#104-135 .

So the outcomes here are still a bit surprising to me. Should we be concerned that e.g. the *-content.js framescripts and some child actors have all disappeared because they only get loaded in the children (but, perhaps more, that all the activity stream stuff and what it entails (notably all the download jsms) has made an appearance, not that it gets logged but that it gets loaded at all?).

To get this diff, btw, I pulled an en-US.log file out of a profdata archive on a linux pgo 'run' job before/after the change to use e10s for pgo, diff'd them, then grep'd added/removed lines into 2 separate files, used 'sort' on both, and then diff'd those two files. So the diff is no reflection of ordering, only of what used to be there and now isn't (and vice versa).

Flags: needinfo?(kmaglione+bmo)
Flags: needinfo?(florian)

All of the *-content.js framescripts should be loaded from the preloader cache, and they definitely are for me:

1: chrome://global/content/process-content.js
Key: chrome://global/content/process-content.js
Offset: 0
Size: 9,252
Processes: Parent|Web|Extension|Privileged
...
13: chrome://global/content/browser-content.js
Key: chrome://global/content/browser-content.js
Offset: 451,386
Size: 8,467
Processes: Web|Extension|Privileged
...
16: chrome://global/content/browser-child.js
Key: chrome://global/content/browser-child.js
Offset: 484,867
Size: 6,010
Processes: Web|Extension|Privileged
...
18: chrome://browser/content/tab-content.js
Key: chrome://browser/content/tab-content.js
Offset: 499,277
Size: 8,697
Processes: Web|Privileged

19: chrome://browser/content/content.js
Key: chrome://browser/content/content.js
Offset: 507,974
Size: 7,527
Processes: Web|Privileged
...
24: chrome://global/content/content-HybridContentTelemetry.js
Key: chrome://global/content/content-HybridContentTelemetry.js
Offset: 550,095
Size: 14,940
Processes: Web|Privileged

25: chrome://browser/content/content-sessionStore.js
Key: chrome://browser/content/content-sessionStore.js
Offset: 565,035
Size: 1,243
Processes: Web|Privileged

I'm not sure why they would appear in the jar log at all with or without e10s enabled.

I also have no idea why the activity stream stuff would start showing up after switching it on. Maybe just because it winds up loading faster with e10s enabled?

Flags: needinfo?(kmaglione+bmo)

It does seem plausible that the first run doesn't stay open long enough for all of those things to wind up in the caches, though, which would mean we would still have to load them from source on the second run.

Oh, and the first run only loads 'data:text/html,<script>Quitter.quit()</script>', which of course means that the Activity Stream homepage won't be loaded, though with luck the pre-loaded new tab page may?

I'm not sure what you are asking me with the needinfo.

After reading the comments, I'm wondering if we should add a profiler marker whenever we load from omni.ja during startup a file that we think should be loaded from a cache instead. But I'm pretty sure that's not the answer you expected from me.

Flags: needinfo?(florian)
See Also: → 1559878
You need to log in before you can comment on or make changes to this bug.