Closed Bug 1103958 Opened 10 years ago Closed 9 years ago

Dump Gecko AppInfos and about:memory with log shake

Categories

(Firefox OS Graveyard :: Runtime, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(firefox35 wontfix, firefox36 wontfix, firefox37 wontfix, firefox38 fixed, firefox-esr31 unaffected, b2g-v2.2 fixed, b2g-master fixed)

RESOLVED FIXED
2.2 S5 (6feb)
Tracking Status
firefox35 --- wontfix
firefox36 --- wontfix
firefox37 --- wontfix
firefox38 --- fixed
firefox-esr31 --- unaffected
b2g-v2.2 --- fixed
b2g-master --- fixed

People

(Reporter: gerard-majax, Assigned: gerard-majax)

References

Details

(Whiteboard: [systemsfe])

Attachments

(2 files, 11 obsolete files)

40.04 KB, text/x-log
Details
23.99 KB, patch
gerard-majax
: review+
Details | Diff | Splinter Review
We should dump Gecko/Gaia infos: branch, commit, etc.
And about:memory.
https://tbpl.mozilla.org/?tree=Try&rev=47b994bfa292

Should be green: added some tests, no about:memory nor gaia yet (gaia just exposes informations in a file in the app, not sure if we can retrieve it easily).
Attachment #8538432 - Attachment is obsolete: true
Given how gaia commit infos are stored (a file in the settings app ...), it feels like too much hassle for a few use. Getting application.ini brings already enough value: we know the gaia commit matching a b2g build from b2g's config gaia.json file.
(In reply to Alexandre LISSY :gerard-majax from comment #5)
> https://tbpl.mozilla.org/?tree=Try&rev=effc1b7159bb

Green, we just lack about:memory, which we can do:
 - use nsMemoryInfoDumper::DumpMemoryInfoToTempDir()
 - collect PID
 - collect files from filesystem
Updating title since getting Gaia commit is too painful for a very few gain.
Summary: Dump Gecko/Gaia informations with log shake → Dump Gecko AppInfos and about:memory with log shake
With application.ini and about:memory: https://tbpl.mozilla.org/?tree=Try&rev=7b23a936190c
(In reply to Alexandre LISSY :gerard-majax from comment #13)
> Created attachment 8539257 [details] [diff] [review]
> Dumping Gecko AppInfos and about:memory in LogShake r=gwagner

Now with more tests: checking that we do really create files

Try: https://tbpl.mozilla.org/?tree=Try&rev=86f4f125779c
(In reply to Alexandre LISSY :gerard-majax from comment #15)
> Created attachment 8539277 [details] [diff] [review]
> Dumping Gecko AppInfos and about:memory in LogShake r=gwagner

Fixed a couple of things: https://tbpl.mozilla.org/?tree=Try&rev=e66e65e18fef
Attachment #8538519 - Attachment is obsolete: true
Attachment #8539215 - Attachment is obsolete: true
Attachment #8539257 - Attachment is obsolete: true
(In reply to Alexandre LISSY :gerard-majax from comment #17)
> Created attachment 8539365 [details] [diff] [review]
> Dumping Gecko AppInfos and about:memory in LogShake r=gwagner

Disabled tests using sdcard, since it looks like not working: https://tbpl.mozilla.org/php/getParsedLog.php?id=54985752&tree=Try#error4

Try: https://tbpl.mozilla.org/?tree=Try&rev=2c4600c71304
Attachment #8539277 - Attachment is obsolete: true
(In reply to Alexandre LISSY :gerard-majax from comment #19)
> Created attachment 8539411 [details] [diff] [review]
> Dumping Gecko AppInfos and about:memory in LogShake r=gwagner

Re-enabled sdcard tests with some code to fake the volume. Passes locally.
Try: https://tbpl.mozilla.org/?tree=Try&rev=6566f94c408b
Attachment #8539365 - Attachment is obsolete: true
(In reply to Alexandre LISSY :gerard-majax from comment #20)
> (In reply to Alexandre LISSY :gerard-majax from comment #19)
> > Created attachment 8539411 [details] [diff] [review]
> > Dumping Gecko AppInfos and about:memory in LogShake r=gwagner
> 
> Re-enabled sdcard tests with some code to fake the volume. Passes locally.
> Try: https://tbpl.mozilla.org/?tree=Try&rev=6566f94c408b

Green :)
Attachment #8539411 - Flags: review?(anygregor)
Attachment #8539800 - Attachment is obsolete: true
Attachment #8539411 - Attachment is obsolete: true
Attachment #8539411 - Flags: review?(anygregor)
Attachment #8539800 - Attachment is obsolete: false
Attachment #8539800 - Flags: review?(anygregor)
Blocks: 1019816
No longer depends on: 1019816
Attachment #8539800 - Flags: review?(anygregor) → review+
Green try in comment 21
Keywords: checkin-needed
Needs rebasing.
Keywords: checkin-needed
Attachment #8539800 - Attachment is obsolete: true
Comment on attachment 8545916 [details] [diff] [review]
Dumping Gecko AppInfos and about:memory in LogShake r=gwagner

Carrying r+ from :gwagner. Rebased on top of master: there was a small conflict just because of changing single quotes to double quotes, because it is code we touched also in bug 1093108.

Johan, can you check that it does the job as expected? Thanks!
Flags: needinfo?(jlorenzo)
Attachment #8545916 - Flags: review+
Keywords: checkin-needed
I tested the patch on top of:
Gaia-Rev        c27d6c87ae9c2d0869de7ed202668200e1ebe9b7
Gecko-Rev       58c53ce72ef6ba4bfd56c934f0b3d0a0bd1ce8ef
Build-ID        20150108202542
Version         37.0a1
Device-Name     flame
FW-Release      4.4.2
FW-Incremental  eng.jlorenzo.20141211.144125
FW-Date         Thu Dec 11 14:41:42 CET 2014
Bootloader      L1TC000118D0


application.ini and about-memory are exported and readable. about-memory can be open in today's Mozilla Nightly and the content seems coherent.

During the tests, I came across 3 crashes and only one went to Socorro[1]. I opened bug 1119739 for the 2 others.
In order to reproduce the first crash, I tried to spam the logshake while powering on and off the screen. I managed to lock once the log write, where I wasn't able to logshake again until I rebooted.
Also once (over a hundred times), the about-memory (only) wasn't saved on the internal memory. It came just before the first crash, so I suspect the crash happened before the info was written on the disk.

I didn't file any bug for these 2 last issues, because I wasn't able to reproduce them a second time.

[1] https://crash-stats.mozilla.com/report/index/effe204b-a1b3-45b2-b4a7-5d8432150108
Flags: needinfo?(jlorenzo)
sorry had to backout this changes for test failures like https://treeherder.mozilla.org/logviewer.html#?job_id=1128929&repo=b2g-inbound
Flags: needinfo?(lissyx+mozillians)
It looks like it's only failing on emulator debug builds :(
Flags: needinfo?(lissyx+mozillians)
Seems to be the same crash as in bug 1113594.

Nicolas, Gregor suggested that you may have an idea regarding the crash that happens on b2g emulator debug, and that looks like bug 1113594 :)
Flags: needinfo?(nicolas.b.pierron)
(In reply to Alexandre LISSY :gerard-majax from comment #32)
> Seems to be the same crash as in bug 1113594.

I do not know anything about this code/crash, but it seems to me that the mDomain LoadInfo of the Worker has a null nsCString.  Are we trying to report memory of a worker which has not yet completed its initialization?
Flags: needinfo?(nicolas.b.pierron)
Ok, found a couple of things to improve,
green try when disabling test_logshake_gonk.js for gonk debug: https://treeherder.mozilla.org/#/jobs?repo=try&revision=6412cc489df1
orange try when enabling test_logshake_gonk.js for gonk debug: https://treeherder.mozilla.org/#/jobs?repo=try&revision=b0b6053b13ff
Nicholas, do you mind having a look at this crash, visible in X1 ? This is on the try at https://treeherder.mozilla.org/#/jobs?repo=try&revision=b0b6053b13ff

It looks exactly like bug 1113594.
Flags: needinfo?(n.nethercote)
> it seems to me that the mDomain LoadInfo of the Worker has a null nsCString.

But Loadinfo::mDomain is an |nsCString| and WorkerPrivateParent::Domain() returns a |const nsCString&|. So null pointers shouldn't be possible, and nsCString should auto-construct to a valid, empty string.

bent, any ideas?
Flags: needinfo?(n.nethercote) → needinfo?(bent.mozilla)
Trying to reproduce this locally, but it's a real pain: running tests gets broken because of plugged in devices, or other emulators running. So basically, since this morning I have been able to run those 3 times, and this resulted always in green, making me unable to reproduce locally and investigate.
Attached file xpcshell.log
> for i in $(seq 10); do ./mach xpcshell-test b2g/components/test/unit/ 2>&1 | tee -a xpcshell.log; done;

Not reproduced, over ten runs, there are ten success.
(In reply to Nicolas B. Pierron [:nbp] from comment #34)
> it seems to me that the mDomain LoadInfo of the Worker has a null nsCString.

I don't know what that means... It's a class instance so there's no pointer involved...
Flags: needinfo?(bent.mozilla)
Oh, looks like bug 1010064 broke the worker memory reporter.
njn, |mWorkerPrivate| can be null in CollectReports() here: 

  http://hg.mozilla.org/mozilla-central/diff/258916327d96/dom/workers/WorkerPrivate.cpp#l1.76

The only time you're guaranteed to have a valid |mWorkerPrivate| is in the constructor, where the old code was building |mRtPath|:

  http://hg.mozilla.org/mozilla-central/diff/258916327d96/dom/workers/WorkerPrivate.cpp#l1.41
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #42)
> njn, |mWorkerPrivate| can be null in CollectReports() here: 

Well spotted! I've spun off bug 1122322.
Depends on: 1122322
(In reply to Alexandre LISSY :gerard-majax from comment #45)
> Created attachment 8550304 [details] [diff] [review]
> Dumping Gecko AppInfos and about:memory in LogShake. r=gwagner

I could get X, X1 and X2 green with this: https://treeherder.mozilla.org/#/jobs?repo=try&revision=701e26684314

10 retriggers for each to make sure it's not intermittent.
Attachment #8545916 - Attachment is obsolete: true
Comment on attachment 8550304 [details] [diff] [review]
Dumping Gecko AppInfos and about:memory in LogShake. r=gwagner

Carrying r+ from :gwagner
Attachment #8550304 - Flags: review+
Retriggers failed, still timeout in most of the cases for B2G Emulator debug :(. Looks like we hang during capturing logs ...
Ok, from https://treeherder.mozilla.org/#/jobs?repo=try&revision=1a79cd894195 and http://ftp.mozilla.org/pub/mozilla.org/b2g/try-builds/alissy@mozilla.com-1a79cd894195/try-emulator-debug/try_ubuntu64_vm-b2g-emulator-debug_test-xpcshell-1-bm118-tests1-linux64-build590.txt.gz, I do see |debug("Creating " + logsDir);|, but I don't see |debug("Created: " + logsDir);| which is printed just after the call |OS.File.makeDir(logsDir, {ignoreExisting: false})| returns.

Yoric, would you know how I could check if something bad happens/times out in OS.File.makeDir ?
Flags: needinfo?(dteller)
Theoretically, this should not be possible, so this sounds like a bug in OS.File. You can activate logging by setting preference toolkit.osfile.log to `true`.

As a side-note, the control flow in the code would be much more easier to follow with Task.jsm.
Flags: needinfo?(dteller)
Here is the end of the log that concerns OS.File module, from https://treeherder.mozilla.org/#/jobs?repo=try&revision=bbbc09e9b6ff:

08:01:13     INFO -  01-18 15:56:52.401 I/Gecko   ( 1260): OS Controller Posting message {"fun":"makeDir","args":[{"string":"/data/local/tmp/sdcard/logs"},{"from":"/data/local/tmp/sdcard"}],"id":1}
08:01:13     INFO -  01-18 15:56:52.711 I/Gecko   ( 1260): OS Controller Message posted
08:01:13     INFO -  01-18 15:56:52.741 I/Gecko   ( 1260): OS Controller Expecting reply
08:01:13     INFO -  01-18 15:56:58.781 I/Gecko   ( 1260): OS Controller Received message from worker {"id":1,"timeStamps":{"entered":1421596614328,"loaded":1421596618681}}
08:01:13     INFO -  01-18 15:56:58.981 I/Gecko   ( 1260): LogShake.jsm: First OS.File.makeDir done
08:01:13     INFO -  01-18 15:56:59.001 I/Gecko   ( 1260): LogShake.jsm: Creating /data/local/tmp/sdcard/logs/2015-01-18-15-56-48
08:01:13     INFO -  01-18 15:56:59.061 I/Gecko   ( 1260): OS Controller Posting message {"fun":"remove","args":[{"string":"/data/local/tmp/logshake-about_memory-1260.json.gz"}],"id":2}
08:01:13     INFO -  01-18 15:56:59.101 I/Gecko   ( 1260): OS Controller Message posted
08:01:13     INFO -  01-18 15:56:59.111 I/Gecko   ( 1260): OS Controller Expecting reply
08:01:13     INFO -  01-18 15:56:59.192 I/Gecko   ( 1260): OS Controller Received message from worker {"id":2}
08:01:13     INFO -  01-18 15:56:59.521 I/Gecko   ( 1260): OS Controller Posting message {"fun":"Meta_shutdown","args":[true],"id":3}
08:01:13     INFO -  01-18 15:56:59.591 I/Gecko   ( 1260): OS Controller Message posted
08:01:13     INFO -  01-18 15:56:59.591 I/Gecko   ( 1260): OS Controller Expecting reply
08:01:13     INFO -  01-18 15:56:59.621 I/Gecko   ( 1260): OS Controller Posting message {"fun":"makeDir","args":[{"string":"/data/local/tmp/sdcard/logs/2015-01-18-15-56-48"},{"ignoreExisting":false}],"id":4}
08:01:13     INFO -  01-18 15:56:59.661 I/Gecko   ( 1260): OS Controller Message posted
08:01:13     INFO -  01-18 15:56:59.671 I/Gecko   ( 1260): OS Controller Expecting reply
08:01:13     INFO -  01-18 15:56:59.741 I/Gecko   ( 1260): OS Controller Received message from worker {"ok":{"openedFiles":[],"openedDirectoryIterators":[],"killed":true},"id":3}
Flags: needinfo?(dteller)
Oh, great, looks like we still haven't resolved the race condition introduced by bug 927560.
For some reason, we manage to post `makeDir` to the worker thread while we are shutting it down (between the instant message `Meta_shutdown` is sent and the instant its response is received). This should have been fixed by (iirc) bug 961317. How urgent is your bug?
Flags: needinfo?(dteller)
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #52)
> Oh, great, looks like we still haven't resolved the race condition
> introduced by bug 927560.
> For some reason, we manage to post `makeDir` to the worker thread while we
> are shutting it down (between the instant message `Meta_shutdown` is sent
> and the instant its response is received). This should have been fixed by
> (iirc) bug 961317. How urgent is your bug?

It's not blocking, but it's something we would really like asap. If it was not for B2G Emulator Debug failures, this could have landed three weeks ago :)
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3e8e7144dc22

With b2g emulator debug xpcshell test disabled. I think we will have to land it like this, and file a follow up to fix this once yoric is back.
Attachment #8550304 - Attachment is obsolete: true
Comment on attachment 8551980 [details] [diff] [review]
Dumping Gecko AppInfos and about:memory in LogShake. r=gwagner

Carrying r+
Attachment #8551980 - Flags: review+
Blocks: 1125989
Comment on attachment 8554757 [details] [diff] [review]
Dumping Gecko AppInfos and about:memory in LogShake. r=gwagner

Carrying r+, rebased on master, disabled b2g emulator debug after agreeing about this with gregor. A follow up has been filed as bug 1125989.
Attachment #8554757 - Flags: review+
Attachment #8551980 - Attachment is obsolete: true
See try in comment 54.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/876419971e2f
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → 2.2 S5 (6feb)
Whiteboard: [systemsfe]
blocking-b2g: --- → 2.2?
Comment on attachment 8554757 [details] [diff] [review]
Dumping Gecko AppInfos and about:memory in LogShake. r=gwagner

NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 
User impact if declined: developpers and QA miss about:memory report when dumping logs by shaking the device, which can help diagnose issues
Testing completed: on device, added xpcshell tests, landed on master since a couple of days
Risk to taking this patch (and alternatives if risky): low, it's for a developper/QA feature, hidden behind a settings in the developper panel
String or UUID changes made by this patch: none
Flags: needinfo?(jlorenzo)
Flags: needinfo?(anygregor)
Attachment #8554757 - Flags: approval-mozilla-b2g37?
Tested in comment 29. From a QA perspective, having this feature in 2.2 would facilitate the analyze of memory usage and hence, help us to validate the performance bugs we committed in 2.2
Flags: needinfo?(jlorenzo)
Not blocking but we should uplift for improved debug functionality.
blocking-b2g: 2.2? → ---
Flags: needinfo?(anygregor)
Attachment #8554757 - Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
Blocks: 1137659
Depends on: 1188999
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: