Closed Bug 1655536 Opened 5 years ago Closed 5 years ago

Don't wait for memory reports from crashed processes

Categories

(Toolkit :: about:memory, defect, P2)

defect

Tracking

()

RESOLVED FIXED
81 Branch
Fission Milestone M6b
Tracking Status
firefox81 --- fixed

People

(Reporter: mccr8, Assigned: mccr8)

References

Details

Attachments

(4 files)

I've done some investigating of bug 1641328, by writing some local test cases, and my initial theory was sort of right, though there's a bit of nuance.

The basic scenario is that we start a memory report, and then one of the processes that we're waiting on for a report goes away. I wrote a test case for this that works by injecting a memory reporter into one content process that crashes immediately when it runs.

However, that alone is not enough to trigger a hang. In that basic scenario, the content process crashing causes the ContentParent to go away, which causes its MemoryReportRequestHost to go away, so the reporter manager doesn't wait for it to finish, and there is no hang.

However, it is trivial to write chrome JS in the parent process that holds alive the content parent. For instance: let contentParents = ChromeUtils.getAllDOMProcesses();. Adding this to my test case causes the hang. Nika pointed out that JSProcessActors can hold alive ContentParents for longer, so this could come up in real code.

The next patch converts the memory reporting architecture to use the "returns"
feature of IPDL, and mozilla::ipc::RejectCallback does not have a return
type, so this patch removes the return value.

FinishReportingCallback::Callback() needs to remain an XPCOM method
that returns NS_OK because it is called from JS during testing.

This patch uses IPDL's return feature to ensure that the memory
reporter manager won't wait for a report from a child process
that has already exited.

This fixes a memory reporter hang that can happen if a child process
exits during a memory report, when the parent half of the actor is
being held alive. (If the parent half of the actor is not being held
alive, then mMemoryReportRequest will be naturally cleared when it
goes away.)

This was happening frequently on Windows Fission AWSY because that test
does a minimize memory right before it attempts to get a memory report,
and the preallocated content process exits when it sees a message to
minimize memory.

I have a test, but I think it is too flaky to land, so I want to go ahead and try to get the fix landed and hopefully I'll figure out a way to make the test landable.

Status: NEW → ASSIGNED
Fission Milestone: --- → M6b
Pushed by amccreight@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/b5f4f6bfcb1e part 1 - Drop the return value from MemoryReportRequestClient::FinishCallback. r=froydnj https://hg.mozilla.org/integration/autoland/rev/a3a193abd46c part 2 - Don't wait for memory reports from child processes that no longer exist. r=froydnj https://hg.mozilla.org/integration/autoland/rev/7c989e472531 part 3 - Re-enable Fission AWSY on Windows. r=froydnj
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → 81 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: