Closed
Bug 874065
Opened 11 years ago
Closed 11 years ago
Make memory-report gathering work on an unrooted B2G device
Categories
(Firefox OS Graveyard :: General, defect)
Firefox OS Graveyard
General
Tracking
(blocking-b2g:tef+, firefox22 wontfix, firefox23 wontfix, firefox24 fixed, b2g18 verified, b2g18-v1.0.0 wontfix, b2g18-v1.0.1 verified, b2g-v1.1hd fixed)
People
(Reporter: justin.lebar+bug, Assigned: justin.lebar+bug)
Details
(Whiteboard: [tef-triage])
Attachments
(3 files, 2 obsolete files)
Right now get_about_memory.py only works on rooted devices (it sends a signal to the B2G process, which requires root).
In theory you can pull an about:memory report of an unrooted device by writing to a fifo that's globally-writable. We need to modify the script to do this and check that it actually works.
Assignee | ||
Comment 1•11 years ago
|
||
I'm not going to have time to work much this week (I'm moving), so maybe someone else can do this. The FIFO is /data/local/debug_info_trigger. You should be able to write
"memory report"
"minimize memory report"
"gc log"
to the fifo. See nsMemoryInfoDumper.cpp.
Assignee | ||
Comment 2•11 years ago
|
||
Pointer to Github pull-request
Assignee | ||
Comment 3•11 years ago
|
||
Comment on attachment 755664 [details]
Pointer to Github pull request: https://github.com/mozilla-b2g/B2G/pull/249
I've tested this as best I can, but after we land it we're going to need QA to test this on actual, honest-to-God user builds.
Attachment #755664 -
Flags: review?(dhylands)
Assignee | ||
Updated•11 years ago
|
Assignee: nobody → justin.lebar+bug
Comment 4•11 years ago
|
||
Comment on attachment 755664 [details]
Pointer to Github pull request: https://github.com/mozilla-b2g/B2G/pull/249
Looks good to me.
Attachment #755664 -
Flags: review?(dhylands) → review+
Assignee | ||
Comment 5•11 years ago
|
||
Merged: https://github.com/mozilla-b2g/B2G/commit/6fc34102a752897a73fbbe08a98046b1715e2f44
QA: I haven't been able to get an honest-to-God user build, so I need your help verifying this fix. This is a really important fix to verify because if this doesn't work, we can't get memory reports off a phone in the wild. These memory reports are crucial for understanding a variety of different bugs.
Here's how to do it:
1) Take a user build of a device. It can be a build from last week or earlier, because this change doesn't affect the bits on the phone.
2) On mac or linux, run
$ git clone https://github.com/mozilla-b2g/B2G
or, if you already have the B2G repository cloned update your tree (with git pull or git fetch and git merge).
3) $ cd B2G/tools
4) $ ./get_about_memory.py
This should output something like
> Got 5/5 files.
> Pulled files into about-memory-1.
It might not say 5/5; it may be 3/3 or something. That's fine.
5) Then please zip up the about-memory-N directory that's generated and attach it to this bug.
Keywords: qawanted
Assignee | ||
Updated•11 years ago
|
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Updated•11 years ago
|
QA Contact: jsmith
Comment 6•11 years ago
|
||
And, more specifically, this particular test requires adb to be enabled, and we'd like to confirm that the adb shell uses a non-root accoumt.
You can tell by doing:
adb shell id
If it reports:
uid=0(root) gid=0(root)
then its a root shell, which is what most of us developers have. We're more interested in a test from a phone which reports something like:
uid=2000(shell) gid=2000(shell) groups=...
jlebar: so I tried it on my leo phone (which comes up with the shell user by default) and I got the following:
2189 >./get_about_memory.py
Got 2/2 files.
Pulled files into about-memory-0.
Remote command /system/bin/toolbox "rm" /data/local/tmp/memory-report-1370139410-437.json.gz /data/local/tmp/memory-report-1370139410-149.json.gz failed with error code 255
rm failed for /data/local/tmp/memory-report-1370139410-437.json.gz, Operation not permitted
Traceback (most recent call last):
File "./get_about_memory.py", line 259, in <module>
get_and_show_dump(args)
File "./get_about_memory.py", line 177, in get_and_show_dump
(merged_reports_path, dmd_files) = get_dumps(args)
File "./get_about_memory.py", line 174, in get_dumps
return utils.run_and_delete_dir_on_exception(do_work, out_dir)
File "/home/work/B2G-leo/tools/include/device_utils.py", line 142, in run_and_delete_dir_on_exception
return fun()
File "./get_about_memory.py", line 160, in do_work
optional_outfiles_prefixes=['dmd-'])
File "/home/work/B2G-leo/tools/include/device_utils.py", line 216, in notify_and_pull_files
_remove_files_from_device(all_outfiles_prefixes, old_files)
File "/home/work/B2G-leo/tools/include/device_utils.py", line 310, in _remove_files_from_device
remote_toolbox_cmd('rm', ' '.join([str(f) for f in files_to_remove]))
File "/home/work/B2G-leo/tools/include/device_utils.py", line 56, in remote_toolbox_cmd
return remote_shell('/system/bin/toolbox "%s" %s' % (cmd, args))
File "/home/work/B2G-leo/tools/include/device_utils.py", line 39, in remote_shell
raise subprocess.CalledProcessError(retcode, cmd, cmd_out)
subprocess.CalledProcessError
ls -l /data/local/tmp reports:
prw-rw-rw- root root 2013-06-01 19:15 fifo_cmd
prw-rw-rw- root root 2013-06-01 19:15 fifo_dat
-rw-r----- root root 24592 2013-06-01 19:16 memory-report-1370139410-149.json.gz
-rw-r----- app_437 app_437 13264 2013-06-01 19:16 memory-report-1370139410-437.json.gz
-rw-rw-rw- root root 13 2013-06-01 19:15 profile_calib_a
-rw-rw-rw- root root 0 2013-06-01 19:14 profile_calib_m
---------- root root 12288 2013-06-01 19:15 sensord.log
So, I think that the memory-report files are having their permissions masked by the umask in effect from /system/bin/b2g.sh (which does a umask 027)
This means that we need to change the permissions on the files after we create them. Incidently, while it said it pulled 2 files, there was no local about-memory-0 directory. This makes sense, since the shell user can't read the files, so the pull would have failed.
I'll clear qawanted until we have a patch that at least works on leo.
Keywords: qawanted
QA Contact: jsmith
Assignee | ||
Comment 7•11 years ago
|
||
Assignee | ||
Comment 8•11 years ago
|
||
Comment on attachment 755758 [details] [diff] [review]
Untested patch (does seem to compile)
Would you mind testing this, Dave? I'm hopeful this is all we need to do.
Attachment #755758 -
Flags: feedback?(dhylands)
Assignee | ||
Comment 9•11 years ago
|
||
I'm going to morph this bug into the issue of making memory reports work on an unrooted device.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Summary: Modify B2G/tools/get_about_memory.py to work on an un-rooted device → Make memory-report gathering work on an unrooted B2G device
Assignee | ||
Comment 10•11 years ago
|
||
This is a critical blocking issue: Without this fix, we will be unable to retrieve memory reports off devices. These memory reports are a key tool for our understanding of a large class of bugs. If we can't make memory reports work, we will have no insight into these bugs, which will compromise our ability to improve new versions of the product.
It's likely that whatever changes we have to make here will be very safe: They should only affect code which is run when dumping memory reports. The worst thing that should be able to happen with a patch here is that we break memory-report dumping. But it's already broken, so that's a no-brainer risk to take.
blocking-b2g: --- → tef?
Comment 11•11 years ago
|
||
Comment on attachment 755758 [details] [diff] [review]
Untested patch (does seem to compile)
Review of attachment 755758 [details] [diff] [review]:
-----------------------------------------------------------------
So this change makes the files retrievable, however they're still not removable.
The issue is that /data/local/tmp is created as a true temporary directory and has the sticky bit set (if you look in /system/bin/b2g.sh you'll see the line:
chmod 1777 $TMPDIR
The 1 is the sticky-bit. It allows the /data/local/tmp directory to be world-writable, but only allows the process which owns the file to remove it.
So,if we were to create a /data/local/tmp/memory-reports directory and give it 0666 permissions and put the memory-report*.gz files in that directory, then the get_about_memory_.py could remove the files within that directory, but would have to leave the directory itself around, which I think is a reasonable compromise.
Attachment #755758 -
Flags: feedback?(dhylands) → feedback+
Comment 12•11 years ago
|
||
I said that only the process which owns the file can remove it (in a directory with the sticky bit set), but I meant only the user which owns the file can remove it.
Assignee | ||
Comment 14•11 years ago
|
||
What if we put it in /data/local instead?
Comment 15•11 years ago
|
||
/data/local/tmp is the only location writable by the child content processes.
shell@android:/ $ ls -ld /data/local
drwxr-x--x root root 2013-06-01 21:15 local
So you still wouldn't be able to remove it. Note that /data/local/tmp gets cleared on a reboot.
Assignee | ||
Comment 16•11 years ago
|
||
Don't I want 777 on the directory, not 666? Otherwise I can list the directory but not read the files inside?
Comment 17•11 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #16)
> Don't I want 777 on the directory, not 666? Otherwise I can list the
> directory but not read the files inside?
Doh - of course :) What was I thinging (never mind - don't answer that).
Comment 18•11 years ago
|
||
Following up in email.
Assignee | ||
Comment 19•11 years ago
|
||
Are you up for reviewing this, Dave?
Attachment #756200 -
Flags: review?(dhylands)
Assignee | ||
Updated•11 years ago
|
Attachment #755758 -
Attachment is obsolete: true
Assignee | ||
Updated•11 years ago
|
Attachment #755664 -
Attachment is obsolete: true
Assignee | ||
Comment 20•11 years ago
|
||
Pointer to Github pull-request
Comment 21•11 years ago
|
||
Comment on attachment 756200 [details] [diff] [review]
Patch, v1
Review of attachment 756200 [details] [diff] [review]:
-----------------------------------------------------------------
I tested this on my leo device with a "shell user" adb shell (as opposed to a root user adb shell) and it all seems to be working fine).
I saw that the files got removed from /data/local/tmp/memory-reports.
firefox was launched with the memory report. I'll attach it for completeness.
I also wound up being slightly confused about the directory on the target being named "memory-reports" and the file on the host being named "memory-reports", but that's pretty minor.
::: xpcom/base/nsMemoryInfoDumper.cpp
@@ +728,5 @@
> + nsAutoCString dirPath;
> + rv = (*aFile)->GetNativePath(dirPath);
> + NS_ENSURE_SUCCESS(rv, rv);
> +
> + while (chmod(dirPath.get(), 0777) == -1 && errno == EINTR) {}
Since we're using ANDROID, we could use:
TEMP_FAILURE_RETRY(chmod(dirPath.get(), 0777));
which wraps up the retry on -1 and EINTR. I'll leave it up to you to decide if you want to change this or not.
TEMP_FAILURE_RETRY can be found in bionic/libc/include/unistd.h (so #include <unistd.h>)
@@ +747,5 @@
> nsAutoCString path;
> rv = file->GetNativePath(path);
> NS_ENSURE_SUCCESS(rv, rv);
>
> + while (chmod(path.get(), 0666) == -1 && errno == EINTR) {}
Same thing here.
Attachment #756200 -
Flags: review?(dhylands) → review+
Comment 22•11 years ago
|
||
Comment 23•11 years ago
|
||
oh yeah - the gecko patch attached assumes that the obsoleted patch is also applied) - just an FYI - not a problem if you're landing yourself.
Assignee | ||
Updated•11 years ago
|
Attachment #756202 -
Flags: review?(dhylands)
Assignee | ||
Comment 24•11 years ago
|
||
I guess I'd rather write portable code, even if it's Android only. The macro doesn't save a lot of space.
Assignee | ||
Comment 25•11 years ago
|
||
Updated•11 years ago
|
Attachment #756202 -
Flags: review?(dhylands) → review+
Comment 26•11 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #24)
> I guess I'd rather write portable code, even if it's Android only. The
> macro doesn't save a lot of space.
The macro is also available from glibc. I think it may not have been available in earlier versions of android, but appears to be available now.
It also appears that we have MOZ_TEMP_FAILURE_RETRY:
https://mxr.mozilla.org/mozilla-central/source/xpcom/glue/FileUtils.h#157
What you've coded is fine. I just wanted to point this out for future reference (using the macro, you at least know it's coded properly).
Assignee | ||
Comment 27•11 years ago
|
||
Oh, if we have a macro in Mozilla code, that's nice.
Although it doesn't check for EAGAIN (which isn't an issue here, but is an issue with some other syscalls).
There's another chmod in this file that needs EINTR treatment. When we regain our sanity, we can convert them all to that macro.
Assignee | ||
Comment 28•11 years ago
|
||
Alex, if I understood correctly, you said in an e-mail that this could be tef+. Would you mind setting the flag?
Comment 29•11 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #27)
> Oh, if we have a macro in Mozilla code, that's nice.
>
> Although it doesn't check for EAGAIN (which isn't an issue here, but is an
> issue with some other syscalls).
>
> There's another chmod in this file that needs EINTR treatment. When we
> regain our sanity, we can convert them all to that macro.
EAGAIN should only really apply to files which are opened non-blocking. I would like to believe that we're not using non-blocking I/O unless we absolutely need to, since non-blocking I/O implies polling.
Using blocking I/O in conjunction with select/poll would be a better way to go.
I'm just speaking in generalities here, but I figured I say my piece:)
Comment 30•11 years ago
|
||
We'll tef+ this for any remaining 1.0.1 downstream partners.
blocking-b2g: tef? → tef+
Comment 31•11 years ago
|
||
Dave, Justin,
When this is uplifted, please add keywords: qawanted, verifyme. QA can work through verifying the fixes, assuming comment 5 is still the correct way to test it. if not, please add new steps to test as well. Thanks.
Comment 32•11 years ago
|
||
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Comment 33•11 years ago
|
||
Took some fixing up to land on b2g18, but it appears to build OK. You still might want to take a look to make sure everything works as intended.
https://hg.mozilla.org/releases/mozilla-b2g18/rev/c952a123b902
https://hg.mozilla.org/releases/mozilla-b2g18_v1_0_1/rev/bb73520e9dce
status-b2g18:
--- → fixed
status-b2g18-v1.0.0:
--- → wontfix
status-b2g18-v1.0.1:
--- → fixed
status-b2g-v1.1hd:
--- → affected
status-firefox22:
--- → wontfix
status-firefox23:
--- → wontfix
status-firefox24:
--- → fixed
Comment 34•11 years ago
|
||
Updated•11 years ago
|
QA Contact: jsmith
Comment 35•11 years ago
|
||
Confirmed working using the STR in comment 5 with my Inari 6/4 b2g18 build. I got my memory report successfully from my device and viewed it in the browser.
Comment 36•11 years ago
|
||
Also confirmed working on a partner customized 1.01 Ikura build from 6/4. I got my memory report successfully from my device and viewed it in the browser.
Keywords: verifyme
You need to log in
before you can comment on or make changes to this bug.
Description
•