Add memory statistics to crashes on Linux
Categories
(Toolkit :: Crash Reporting, enhancement)
Tracking
()
Tracking | Status | |
---|---|---|
firefox74 | --- | fixed |
People
(Reporter: gsvelto, Assigned: Yoric)
References
Details
Attachments
(4 files, 1 obsolete file)
We currently write out memory statistics upon a crash only on Windows. This bug is for adding Linux support. We'll need to parse /proc/vmstat
, /proc/meminfo
or both to gather the required information. Documentation on both files is available in Linux proc manpage.
Note that the contents of both files is kernel-dependent so the parsing code will need to be sufficiently robust to handle that.
Assignee | ||
Updated•6 years ago
|
Assignee | ||
Comment 1•6 years ago
|
||
(currently installing a Linux VM for testing this)
Reporter | ||
Comment 2•6 years ago
|
||
Thanks David, note that on Linux we have this type of gunk to avoid using libc functions:
It's not pretty, in fact it's downright ugly but there's no way around it for now.
Assignee | ||
Comment 3•6 years ago
|
||
Thanks! That is ugly indeed.
Submitting a first draft for feedback, I haven't had the opportunity to test it on Linux yet, my VM crashes on me during install – and I still haven't received my Linux/Windows machine.
Assignee | ||
Comment 4•6 years ago
|
||
Depends on D55111
Assignee | ||
Comment 5•6 years ago
|
||
Ok, let's try and do this correctly this time :)
Assignee | ||
Comment 6•6 years ago
|
||
Depends on D55457
Comment 7•6 years ago
|
||
Assignee | ||
Comment 8•6 years ago
|
||
Oops, added missing questions/responses.
Assignee | ||
Updated•6 years ago
|
Depends on bug 1563403 <-- was that a typo?
Assignee | ||
Comment 10•6 years ago
|
||
Sorry, that should have been bug 1601872 (which was caused by a regression in bug 1563403, so I'm not entirely off :) ).
Comment 11•6 years ago
|
||
Error while running ./mach build
ERROR: Rust compiler 1.36.0 is too old.
0:03.68 To compile Rust language sources please install at least
0:03.68 version 1.37.0 of the 'rustc' toolchain and make sure it is
0:03.68 first in your path.
0:03.68 You can verify this by typing 'rustc --version'.
0:03.68 If you have the 'rustup' tool installed you can upgrade
0:03.68 to the latest release by typing 'rustup update'. The
0:03.68 installer is available from https://rustup.rs/
0:03.72 *** Fix above errors and then restart with
0:03.72 "./mach build"
0:03.72 client.mk:111: recipe for target 'configure' failed
0:03.72 make: *** [configure] Error 1
I tried upgrading it but does not work
I tried installing rustup but says
info: downloading installer
error: it looks like you have an existing installation of Rust at:
error: /usr/bin
error: rustup cannot be installed alongside Rust. Please uninstall first
error: if this is what you want, restart the installation with `-y'
error: cannot install while Rust is installed
Comment 12•6 years ago
|
||
Comment 13•6 years ago
|
||
Comment 14•6 years ago
|
||
Backed out 2 changesets (Bug 1587722) for causing Linux browser-chrome failures CLOSED TREE
Push with failures:https://treeherder.mozilla.org/#/jobs?repo=autoland&selectedJob=280451653&resultStatus=testfailed%2Cbusted%2Cexception&revision=c8e7456abc5b870a81f038e88069e9a82246f300
Failure logs: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=280448679&repo=autoland&lineNumber=17778
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=280448689&repo=autoland&lineNumber=16628
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=280451646&repo=autoland&lineNumber=9841
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=280448671&repo=autoland&lineNumber=18319
Backout: https://hg.mozilla.org/integration/autoland/rev/b98b174fd7521b0cc1364c6fdf4e48c339b7361d
Assignee | ||
Comment 15•6 years ago
•
|
||
So, for some reason, my patch seems to remove the dumpID
under Linux (or at least BrowserTestUtils
won't find it).
I'm digging to understand what's going on.
Assignee | ||
Comment 16•6 years ago
|
||
I can't reproduce locally, but I can 100% reproduce on try.
According to my debug-by-try attempts, the following line kills the crashreporter:
int fd = sys_open("/proc/meminfo", O_RDONLY, /* chmod */ 0);
Except it crashes only in mochitests (works nicely on xpcshell) and so far only on try.
Gabriele, do you know if there could be some sandbox restriction at work here?
Reporter | ||
Comment 17•6 years ago
|
||
(In reply to David Teller [:Yoric] (please use "needinfo") from comment #16)
I can't reproduce locally, but I can 100% reproduce on try.
According to my debug-by-try attempts, the following line kills the crashreporter:
int fd = sys_open("/proc/meminfo", O_RDONLY, /* chmod */ 0);
Except it crashes only in mochitests (works nicely on xpcshell) and so far only on try.
Gabriele, do you know if there could be some sandbox restriction at work here?
Oh yes, that might be the problem. This is happening in the exception handler so if it's a content process that's crashing it's running in the content process context which would be sandboxed. Fortunately there should be a relatively simple fix for this. Right now we're writing out the memory information annotations in the content process exception handler:
But there's no need for that, these annotations are global values so the main process could read them and add them to the crashed process' annotations when we add all the other ones here:
This runs in the context of the main process so it wouldn't run afoul of the sandbox.
One small issue with this is that the WriteMemoryStatus()
function was designed to write to a file and in this case we wouldn't need to do that because we would have access to the crash annotations table directly so some refactoring might be needed.
Assignee | ||
Comment 18•6 years ago
|
||
(In reply to Gabriele Svelto [:gsvelto] from comment #17)
But there's no need for that, these annotations are global values so the main process could read them and add them to the crashed process' annotations when we add all the other ones here:
Won't we get the memory information after cleanup instead of the memory information before cleanup?
Reporter | ||
Comment 19•6 years ago
•
|
||
(In reply to David Teller [:Yoric] (please use "needinfo") from comment #18)
Won't we get the memory information after cleanup instead of the memory information before cleanup?
IIRC we don't actually kill the content process until we're done with crash reporting. The ReadExceptionTimeAnnotations()
function is called from within Breakpad's child-crash callback and I'm fairly sure that the process doesn't get killed until after it returns because I've seen cases where we hit a second exception in the same content process (but on a different thread) while we were processing the first one in that function.
So in theory the memory scenario should be pretty much the same. The only difference I can think of is on Windows the AvailableVirtualMemory
information is process-specific; all the other fields should be global measurements.
Assignee | ||
Comment 20•6 years ago
|
||
Ok, I have a prototype. It works on bc tests but oranges test_oom_annotations
, though. I still need to find where to hook it so that it works for xpcshell tests.
Assignee | ||
Comment 21•6 years ago
|
||
Gabriele, I can't find any reason for it to fail on test_oom_annotations
. The codepaths you gave me above cover both content processes and parent processes, right?
Reporter | ||
Comment 22•6 years ago
|
||
(In reply to David Teller [:Yoric] (please use "needinfo") from comment #21)
Gabriele, I can't find any reason for it to fail on
test_oom_annotations
. The codepaths you gave me above cover both content processes and parent processes, right?
Yes, there's only those two. Can you show me how it's failing?
Assignee | ||
Comment 23•6 years ago
|
||
It's the X6 here. I have annotated AnnotateMemoryStatus
to print YORIC>
and some stuff whenever we enter/exit the function. The printouts show up in bc1 but not in X6.
Reporter | ||
Comment 24•6 years ago
|
||
I'm looking into this, sorry if it took a while but I'm swamped with reviews this week.
Reporter | ||
Comment 25•6 years ago
|
||
I ran this locally and it was running just fine so I was utterly stumped until I noticed that the AvailablePageFile
annotation is written conditionally depending on the presence of the CommitLimit
and Committed_AS
fields in /proc/meminfo
. As it turns out those are optional in Linux so it's likely that our automation machines aren't using overcommit accounting and thus those values aren't populated.
You'll have to adjust the test, maybe making it so that it's OK if AvailablePageFile
is present but we don't fail if it isn't.
Assignee | ||
Comment 26•6 years ago
|
||
Well, reading the logs, I conclude that the memory statistics collection code doesn't seem to run at all. I'll double-check that I've made the change you requested, but here's the same test without AvailablePageFile
: this time, AvailablePhysicalMemory
isn't present, either.
Reporter | ||
Comment 27•6 years ago
|
||
(In reply to David Teller [:Yoric] (please use "needinfo") from comment #26)
Well, reading the logs, I conclude that the memory statistics collection code doesn't seem to run at all. I'll double-check that I've made the change you requested, but here's the same test without
AvailablePageFile
: this time,AvailablePhysicalMemory
isn't present, either.
Ouch, seems like this is a hard nut to crack. I'll double-check too and see what happens.
Assignee | ||
Comment 28•6 years ago
|
||
So, I think the problem is that function do_crash()
reads the subprocess dump without going through the main process. If I replace this with do_content_crash()
, it seems to work nicely.
Assignee | ||
Comment 29•6 years ago
|
||
Depends on D55933
Comment 30•6 years ago
|
||
Comment 31•6 years ago
|
||
Backed out 3 changesets (bug 1587722) for causing xpcshell failures on test_crashreporter_crash.js CLOSED TREE
Backout revision https://hg.mozilla.org/integration/autoland/rev/3008becd0975317f750085be15704ef30630801d
Failure log https://treeherder.mozilla.org/logviewer.html#?job_id=282940417&repo=autoland
David could you please take a look?
Comment 33•6 years ago
|
||
Comment 34•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/c8f1f5a0fe57
https://hg.mozilla.org/mozilla-central/rev/00dfa64114e6
https://hg.mozilla.org/mozilla-central/rev/262a7dd60d12
Description
•