Closed
Bug 903949
Opened 12 years ago
Closed 12 years ago
about:memory and the thumbnails process are combining to cause hangs and unexpected behaviour
Categories
(Toolkit :: about:memory, defect)
Toolkit
about:memory
Tracking
()
RESOLVED
FIXED
mozilla26
Tracking | Status | |
---|---|---|
firefox24 | --- | unaffected |
firefox25 | + | fixed |
firefox26 | + | fixed |
People
(Reporter: n.nethercote, Assigned: n.nethercote)
References
Details
(Whiteboard: [MemShrink][qa-])
Attachments
(2 files)
1.03 KB,
patch
|
justin.lebar+bug
:
review+
akeybl
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
2.57 KB,
patch
|
justin.lebar+bug
:
review+
akeybl
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
Hughman reported some hangs involving memory reporting. After some investigation, here are my steps to reproduce:
- Start Firefox.
- Open about:memory, hit "measure". Keep it open.
- Delete the images in the thumbnails dir, within the profile temp dir. On my machine it's $HOME?.cache/mozilla/firefox/<profile>/thumbnails.
- Open about:newtab. This triggers a thumbnails process.
- Wait ~60 seconds for the thumbnails process to die.
- about:memory will auto-update, due to the existing inter-process memory reporting machinery, which is triggered by the thumbnails process's death.
This last event is very surprising for the user, and probably undesirable. Hughman was also seeing 7--10 second hangs, much longer than the 2 second pauses he saw after hitting "measure" in about:memory. He's on Windows, I wonder if the IPC is slow.
Another curious thing is that even though the thumbnails process is passing memory reports to the main process, I don't see anything other than "Main Process" in about:memory. Turns out there's a bug in ChildMemoryReporter, and it's always returning the empty string from GetProcess(), instead of the child process name. That's easy to fix.
But for the original problem... the IPC done by about:memory isn't pretty. It basically request child processes to give it some reports, and then measures the main process and displays the results. Then, when a child process reports back, it re-measures the main process and re-displays.
And when the child process dies, it sends a message to the main process saying "all those measurements you had from me are dead, re-measure", more or less. Ugh.
![]() |
Assignee | |
Comment 1•12 years ago
|
||
This fixes the problem of the child process not showing up in about:memory.
Just a dumb defect introduced when I changed the reporters to use
MemoryReporterBase, and not noticed because we didn't have child processes at
the time.
Attachment #788801 -
Flags: review?(justin.lebar+bug)
![]() |
Assignee | |
Comment 2•12 years ago
|
||
Here are measurements of the thumbnails process:
> Content (3744)
> Explicit Allocations
>
> 79.96 MB (100.0%) -- explicit
> ├──45.31 MB (56.67%) -- window-objects/top(http://techcrunch.com/, id=1)
> │ ├──32.55 MB (40.71%) -- active
> │ │ ├──21.54 MB (26.94%) ++ window(http://techcrunch.com/)
> │ │ ├───6.88 MB (08.60%) ++ window(http://www.youtube.com/embed/Gs0JD5KITfM?version=3&rel=1&fs=1&showsearch=0&showinfo=1&iv_load_policy=1&wmode=transparent)
> │ │ ├───2.94 MB (03.68%) -- (8 tiny)
> │ │ │ ├──0.64 MB (00.81%) ++ window(http://static.ak.facebook.com/connect/xd_arbiter.php?version=26#channel=f9b30a982fa908&origin=http%3A%2F%2Ftechcrunch.com&channel_path=%2F%3Ffb_xd_fragment%23xd_sig%3Df59c43a4342efc%26)
> │ │ │ ├──0.64 MB (00.81%) ++ window(https://s-static.ak.facebook.com/connect/xd_arbiter.php?version=26#channel=f9b30a982fa908&origin=http%3A%2F%2Ftechcrunch.com&channel_path=%2F%3Ffb_xd_fragment%23xd_sig%3Df59c43a4342efc%26)
> │ │ │ ├──0.44 MB (00.55%) ++ window(http://ads.tw.adsonar.com/adserving/getAds.jsp?previousPlacementIds=&placementId=1530659&pid=2360767&zw=300&zh=250&ssl=false&url=http%3A//techcrunch.com/&v=5&dct=TechCrunch)
> │ │ │ ├──0.42 MB (00.53%) ++ window(http://techcrunch.com/wp-content/themes/vip/tctechcrunch2/_uac/isocket.html?phpAds_used=&MAX_used=&OA_used=)
> │ │ │ ├──0.42 MB (00.53%) ++ window(about:blank)
> │ │ │ ├──0.15 MB (00.19%) ++ window(http://cdn.tacoda.at.atwola.com/an/qseg.html)
> │ │ │ ├──0.11 MB (00.14%) ++ window(http://rma-api.gravity.com/v1/api/intelligence/w2?sg=fca4fa8af1286d8a77f26033fdeed202&pl=7&ug=&b=13&ad=&sourceUrl=http%3A%2F%2Ftechcrunch.com%2F&frameUrl=http%3A%2F%2Ftechcrunch.com%2F&clientTime=1376287926106&pageViewId%5BwidgetLoaderWindowUrl%5D=http%3A%2F%2Ftechcrunch.com%2F&pageViewId%5BtimeMillis%5D=1376287926105&pageViewId%5Brand%5D=20848183125038966)
> │ │ │ └──0.11 MB (00.14%) ++ window(http://ads.tw.adsonar.com/adserving/createASLId.jsp)
> │ │ └───1.19 MB (01.48%) ++ window(http://techcrunch.com/wp-content/themes/vip/tctechcrunch2/_uac/adpage.html)
So the thumbnails process actually loads the pages in question? Huh.
![]() |
Assignee | |
Comment 3•12 years ago
|
||
SetChildMemoryReporters() triggers about:memory regeneration. It's called from
two places:
- MemoryReportRequestParent::Recv__delete__(), which is triggered by the
receipt of a "child-memory-reporter-request" event.
- ContentParent::ActorDestroy, which is triggered by the death of a child
process.
This patch changes the latter... we don't want to auto-update about:memory in
that case, we just want to delete the previously-registered reports.
Attachment #788806 -
Flags: review?(justin.lebar+bug)
![]() |
Assignee | |
Comment 4•12 years ago
|
||
BTW, I don't have any good ideas on how to test these changes.
Also, if/when we get to serious e10s usage, this hacky child process memory reporting situation is going to need some serious re-working. (For example, if you do "Measure and save" in about:memory when the thumbnails process is running, the thumbnails process doesn't get measured, because the child process feedback only works with on-screen about:memory generation. Ugh.) But I'm not sure what the new design will look like, and these patches are enough to keep things limping along for the moment :/
![]() |
Assignee | |
Comment 5•12 years ago
|
||
> > Content (3744)
And it's a shame we don't get a more informative process name for the thumbnails process. I'm trying to remember how B2G does a better job with this...
Updated•12 years ago
|
Attachment #788801 -
Flags: review?(justin.lebar+bug) → review+
Updated•12 years ago
|
Attachment #788806 -
Flags: review?(justin.lebar+bug) → review+
Comment 6•12 years ago
|
||
> BTW, I don't have any good ideas on how to test these changes.
You mean, how to write an automated test?
It would be good to have an automated test verifying that child-process memory reporters actually work. That shouldn't be too hard; you can create a <xul:browser remote=true> or an <iframe mozbrowser>, pull the memory reports, and verify that you have two processes.
If you want to verify that forcibly killing the child process doesn't cause SetChildMemoryReporters to run, that's a bit more difficult, but we can kill child processes from script. It might be sufficient simply to remove the browser/iframe from the DOM. That doesn't forcibly kill it, but it's the same path here.
But I'm not sure that's worth testing, myself.
I totally agree with everything else you've said here, particularly the bit about these hacky memory reporters.
Comment 7•12 years ago
|
||
> It would be good to have an automated test verifying that child-process memory reporters
> actually work.
Oh right, bug 673323. :)
> And it's a shame we don't get a more informative process name for the thumbnails process.
B2G gets the name from ContentChild::GetProcessName. In the parent process we have ContentParent::FriendlyName, which is similar.
![]() |
Assignee | |
Comment 8•12 years ago
|
||
> If you want to verify that forcibly killing the child process doesn't cause
> SetChildMemoryReporters to run, that's a bit more difficult...
> But I'm not sure that's worth testing, myself.
I agree. I just want to test multi-process reporting, i.e. patch 1. I'll try mozbrowser.
Bug 673323 comment 4 has a suggested design for a better approach to memory reporting with multiple processes.
![]() |
Assignee | |
Comment 9•12 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/780836c77424
https://hg.mozilla.org/integration/mozilla-inbound/rev/911c2e86d6a4
I landed without a test because I was having trouble getting it working and jlebar just boarded a plane. I filed bug 904321 as a follow-up.
I guess I should nominate these for Aurora, since the thumbnails child process is on Aurora.
Updated•12 years ago
|
status-firefox25:
--- → affected
tracking-firefox25:
--- → ?
Comment 10•12 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/780836c77424
https://hg.mozilla.org/mozilla-central/rev/911c2e86d6a4
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla26
Updated•12 years ago
|
Comment 11•12 years ago
|
||
Ready for an uplift request with risk evaluation.
![]() |
Assignee | |
Comment 12•12 years ago
|
||
Comment on attachment 788801 [details] [diff] [review]
(part 1) - ChildMemoryReporter should return mProcess in GetProcess() rather than an empty string.
[Approval Request Comment]
Bug caused by (feature/regressing bug #): some refactoring patch some time ago.
User impact if declined: Memory measurements from the thumbnails child process are incorrectly combined with those from the parent process in about:memory
Testing completed (on m-c, etc.): on m-c for a few days.
Risk to taking this patch (and alternatives if risky): none. Code was obviously incorrect; patch is trivial.
String or IDL/UUID changes made by this patch: none.
Attachment #788801 -
Flags: approval-mozilla-aurora?
![]() |
Assignee | |
Comment 13•12 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #12)
> Comment on attachment 788801 [details] [diff] [review]
> (part 1) - ChildMemoryReporter should return mProcess in GetProcess() rather
> than an empty string.
>
> [Approval Request Comment]
> Bug caused by (feature/regressing bug #): some refactoring patch some time
> ago.
I should add: the creation of the thumbnails child process (bug 870100) exposed this latent defect.
![]() |
Assignee | |
Comment 14•12 years ago
|
||
Comment on attachment 788806 [details] [diff] [review]
(part 2) - Don't trigger about:memory re-generation when a child process closes.
[Approval Request Comment]
Bug caused by (feature/regressing bug #): has always been latently present, but exposed by bug 870100.
User impact if declined: when the thumbnails process runs, if you have about:memory open it can refresh itself unexpectedly.
Testing completed (on m-c, etc.): on m-c for a few days.
Risk to taking this patch (and alternatives if risky): low. The patch is simple.
String or IDL/UUID changes made by this patch: none.
Attachment #788806 -
Flags: approval-mozilla-aurora?
Updated•12 years ago
|
Attachment #788801 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Updated•12 years ago
|
Attachment #788806 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 15•12 years ago
|
||
Comment 16•12 years ago
|
||
I have tested this issue on FF 25b2 following STR from Comment 0.
I didn't encounter any hang while testing but I am not sure if the steps I had used are reliable. After I opened about:newtab I can see a memory increse by refreshing about:memory tab, in the explicit memory section which means the thumbnails process are mesured.
Anthony, this is what this patch is expected to do? If it does please mark it as being verified, if not please provide me a way to test this behaviour.
Flags: needinfo?(anthony.s.hughes)
Comment 17•12 years ago
|
||
Mihai, how did you reproduce this issue on the build with the bug? What makes the steps seem unreliable to you?
Flags: needinfo?(anthony.s.hughes) → needinfo?(mihai.morar)
Comment 18•12 years ago
|
||
In reply to Comment 0
> This last event is very surprising for the user, and probably undesirable. Hughman was also seeing
> 7--10 second hangs, much longer than the 2 second pauses he saw after hitting "measure" in
> about:memory. He's on Windows, I wonder if the IPC is slow.
I was able to reproduce a short hang about 1-2 seconds, but not as bad as mentioned in Comment 0.
In reply to Comment 4
> (For example, if you do "Measure and save" in about:memory when the thumbnails process is running,
> the thumbnails process doesn't get measured, because the child process feedback only works with
> on-screen about:memory generation. Ugh.)
After I opened about:newtab I can see a memory increse by refreshing about:memory tab, in the explicit memory section which means the thumbnails process are mesured.
On 25b2 I was not able to reproduce the short hang (1-2sec) mentioned above.
Ioana what I wanted to say with "I am not sure if the steps I had used are reliable" it was that if Nicholas said in Comment 4 that "the thumbnails process doesn't get measured" and my memory usage got increased I don't think I am able to fully reproduce this issue on 24 builds following those STR from Comment 0, that's why I didn't mark this bug as being verified, even if I got no hangs on FF 25b2.
Flags: needinfo?(mihai.morar)
Comment 19•12 years ago
|
||
Given the lack of a reliable reproduction I'm going to mark this as [qa-]. Nicholas, if there is something you are specifically concerned about testing then please let us know.
Keywords: verifyme
Whiteboard: [MemShrink] → [MemShrink][qa-]
You need to log in
before you can comment on or make changes to this bug.
Description
•