Closed Bug 874065 Opened 11 years ago Closed 11 years ago

Make memory-report gathering work on an unrooted B2G device

Categories

(Firefox OS Graveyard :: General, defect)

defect
Not set
normal

Tracking

(blocking-b2g:tef+, firefox22 wontfix, firefox23 wontfix, firefox24 fixed, b2g18 verified, b2g18-v1.0.0 wontfix, b2g18-v1.0.1 verified, b2g-v1.1hd fixed)

RESOLVED FIXED
blocking-b2g tef+
Tracking Status
firefox22 --- wontfix
firefox23 --- wontfix
firefox24 --- fixed
b2g18 --- verified
b2g18-v1.0.0 --- wontfix
b2g18-v1.0.1 --- verified
b2g-v1.1hd --- fixed

People

(Reporter: justin.lebar+bug, Assigned: justin.lebar+bug)

Details

(Whiteboard: [tef-triage])

Attachments

(3 files, 2 obsolete files)

Right now get_about_memory.py only works on rooted devices (it sends a signal to the B2G process, which requires root). In theory you can pull an about:memory report of an unrooted device by writing to a fifo that's globally-writable. We need to modify the script to do this and check that it actually works.
I'm not going to have time to work much this week (I'm moving), so maybe someone else can do this. The FIFO is /data/local/debug_info_trigger. You should be able to write "memory report" "minimize memory report" "gc log" to the fifo. See nsMemoryInfoDumper.cpp.
Pointer to Github pull-request
Comment on attachment 755664 [details] Pointer to Github pull request: https://github.com/mozilla-b2g/B2G/pull/249 I've tested this as best I can, but after we land it we're going to need QA to test this on actual, honest-to-God user builds.
Attachment #755664 - Flags: review?(dhylands)
Assignee: nobody → justin.lebar+bug
Comment on attachment 755664 [details] Pointer to Github pull request: https://github.com/mozilla-b2g/B2G/pull/249 Looks good to me.
Attachment #755664 - Flags: review?(dhylands) → review+
Merged: https://github.com/mozilla-b2g/B2G/commit/6fc34102a752897a73fbbe08a98046b1715e2f44 QA: I haven't been able to get an honest-to-God user build, so I need your help verifying this fix. This is a really important fix to verify because if this doesn't work, we can't get memory reports off a phone in the wild. These memory reports are crucial for understanding a variety of different bugs. Here's how to do it: 1) Take a user build of a device. It can be a build from last week or earlier, because this change doesn't affect the bits on the phone. 2) On mac or linux, run $ git clone https://github.com/mozilla-b2g/B2G or, if you already have the B2G repository cloned update your tree (with git pull or git fetch and git merge). 3) $ cd B2G/tools 4) $ ./get_about_memory.py This should output something like > Got 5/5 files. > Pulled files into about-memory-1. It might not say 5/5; it may be 3/3 or something. That's fine. 5) Then please zip up the about-memory-N directory that's generated and attach it to this bug.
Keywords: qawanted
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
QA Contact: jsmith
And, more specifically, this particular test requires adb to be enabled, and we'd like to confirm that the adb shell uses a non-root accoumt. You can tell by doing: adb shell id If it reports: uid=0(root) gid=0(root) then its a root shell, which is what most of us developers have. We're more interested in a test from a phone which reports something like: uid=2000(shell) gid=2000(shell) groups=... jlebar: so I tried it on my leo phone (which comes up with the shell user by default) and I got the following: 2189 >./get_about_memory.py Got 2/2 files. Pulled files into about-memory-0. Remote command /system/bin/toolbox "rm" /data/local/tmp/memory-report-1370139410-437.json.gz /data/local/tmp/memory-report-1370139410-149.json.gz failed with error code 255 rm failed for /data/local/tmp/memory-report-1370139410-437.json.gz, Operation not permitted Traceback (most recent call last): File "./get_about_memory.py", line 259, in <module> get_and_show_dump(args) File "./get_about_memory.py", line 177, in get_and_show_dump (merged_reports_path, dmd_files) = get_dumps(args) File "./get_about_memory.py", line 174, in get_dumps return utils.run_and_delete_dir_on_exception(do_work, out_dir) File "/home/work/B2G-leo/tools/include/device_utils.py", line 142, in run_and_delete_dir_on_exception return fun() File "./get_about_memory.py", line 160, in do_work optional_outfiles_prefixes=['dmd-']) File "/home/work/B2G-leo/tools/include/device_utils.py", line 216, in notify_and_pull_files _remove_files_from_device(all_outfiles_prefixes, old_files) File "/home/work/B2G-leo/tools/include/device_utils.py", line 310, in _remove_files_from_device remote_toolbox_cmd('rm', ' '.join([str(f) for f in files_to_remove])) File "/home/work/B2G-leo/tools/include/device_utils.py", line 56, in remote_toolbox_cmd return remote_shell('/system/bin/toolbox "%s" %s' % (cmd, args)) File "/home/work/B2G-leo/tools/include/device_utils.py", line 39, in remote_shell raise subprocess.CalledProcessError(retcode, cmd, cmd_out) subprocess.CalledProcessError ls -l /data/local/tmp reports: prw-rw-rw- root root 2013-06-01 19:15 fifo_cmd prw-rw-rw- root root 2013-06-01 19:15 fifo_dat -rw-r----- root root 24592 2013-06-01 19:16 memory-report-1370139410-149.json.gz -rw-r----- app_437 app_437 13264 2013-06-01 19:16 memory-report-1370139410-437.json.gz -rw-rw-rw- root root 13 2013-06-01 19:15 profile_calib_a -rw-rw-rw- root root 0 2013-06-01 19:14 profile_calib_m ---------- root root 12288 2013-06-01 19:15 sensord.log So, I think that the memory-report files are having their permissions masked by the umask in effect from /system/bin/b2g.sh (which does a umask 027) This means that we need to change the permissions on the files after we create them. Incidently, while it said it pulled 2 files, there was no local about-memory-0 directory. This makes sense, since the shell user can't read the files, so the pull would have failed. I'll clear qawanted until we have a patch that at least works on leo.
Keywords: qawanted
QA Contact: jsmith
Comment on attachment 755758 [details] [diff] [review] Untested patch (does seem to compile) Would you mind testing this, Dave? I'm hopeful this is all we need to do.
Attachment #755758 - Flags: feedback?(dhylands)
I'm going to morph this bug into the issue of making memory reports work on an unrooted device.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Summary: Modify B2G/tools/get_about_memory.py to work on an un-rooted device → Make memory-report gathering work on an unrooted B2G device
This is a critical blocking issue: Without this fix, we will be unable to retrieve memory reports off devices. These memory reports are a key tool for our understanding of a large class of bugs. If we can't make memory reports work, we will have no insight into these bugs, which will compromise our ability to improve new versions of the product. It's likely that whatever changes we have to make here will be very safe: They should only affect code which is run when dumping memory reports. The worst thing that should be able to happen with a patch here is that we break memory-report dumping. But it's already broken, so that's a no-brainer risk to take.
blocking-b2g: --- → tef?
Comment on attachment 755758 [details] [diff] [review] Untested patch (does seem to compile) Review of attachment 755758 [details] [diff] [review]: ----------------------------------------------------------------- So this change makes the files retrievable, however they're still not removable. The issue is that /data/local/tmp is created as a true temporary directory and has the sticky bit set (if you look in /system/bin/b2g.sh you'll see the line: chmod 1777 $TMPDIR The 1 is the sticky-bit. It allows the /data/local/tmp directory to be world-writable, but only allows the process which owns the file to remove it. So,if we were to create a /data/local/tmp/memory-reports directory and give it 0666 permissions and put the memory-report*.gz files in that directory, then the get_about_memory_.py could remove the files within that directory, but would have to leave the directory itself around, which I think is a reasonable compromise.
Attachment #755758 - Flags: feedback?(dhylands) → feedback+
I said that only the process which owns the file can remove it (in a directory with the sticky bit set), but I meant only the user which owns the file can remove it.
Lukas, can you decide on this one?
Whiteboard: [tef-triage]
What if we put it in /data/local instead?
/data/local/tmp is the only location writable by the child content processes. shell@android:/ $ ls -ld /data/local drwxr-x--x root root 2013-06-01 21:15 local So you still wouldn't be able to remove it. Note that /data/local/tmp gets cleared on a reboot.
Don't I want 777 on the directory, not 666? Otherwise I can list the directory but not read the files inside?
(In reply to Justin Lebar [:jlebar] from comment #16) > Don't I want 777 on the directory, not 666? Otherwise I can list the > directory but not read the files inside? Doh - of course :) What was I thinging (never mind - don't answer that).
Following up in email.
Attached patch Patch, v1Splinter Review
Are you up for reviewing this, Dave?
Attachment #756200 - Flags: review?(dhylands)
Attachment #755758 - Attachment is obsolete: true
Attachment #755664 - Attachment is obsolete: true
Comment on attachment 756200 [details] [diff] [review] Patch, v1 Review of attachment 756200 [details] [diff] [review]: ----------------------------------------------------------------- I tested this on my leo device with a "shell user" adb shell (as opposed to a root user adb shell) and it all seems to be working fine). I saw that the files got removed from /data/local/tmp/memory-reports. firefox was launched with the memory report. I'll attach it for completeness. I also wound up being slightly confused about the directory on the target being named "memory-reports" and the file on the host being named "memory-reports", but that's pretty minor. ::: xpcom/base/nsMemoryInfoDumper.cpp @@ +728,5 @@ > + nsAutoCString dirPath; > + rv = (*aFile)->GetNativePath(dirPath); > + NS_ENSURE_SUCCESS(rv, rv); > + > + while (chmod(dirPath.get(), 0777) == -1 && errno == EINTR) {} Since we're using ANDROID, we could use: TEMP_FAILURE_RETRY(chmod(dirPath.get(), 0777)); which wraps up the retry on -1 and EINTR. I'll leave it up to you to decide if you want to change this or not. TEMP_FAILURE_RETRY can be found in bionic/libc/include/unistd.h (so #include <unistd.h>) @@ +747,5 @@ > nsAutoCString path; > rv = file->GetNativePath(path); > NS_ENSURE_SUCCESS(rv, rv); > > + while (chmod(path.get(), 0666) == -1 && errno == EINTR) {} Same thing here.
Attachment #756200 - Flags: review?(dhylands) → review+
oh yeah - the gecko patch attached assumes that the obsoleted patch is also applied) - just an FYI - not a problem if you're landing yourself.
Attachment #756202 - Flags: review?(dhylands)
I guess I'd rather write portable code, even if it's Android only. The macro doesn't save a lot of space.
Attachment #756202 - Flags: review?(dhylands) → review+
(In reply to Justin Lebar [:jlebar] from comment #24) > I guess I'd rather write portable code, even if it's Android only. The > macro doesn't save a lot of space. The macro is also available from glibc. I think it may not have been available in earlier versions of android, but appears to be available now. It also appears that we have MOZ_TEMP_FAILURE_RETRY: https://mxr.mozilla.org/mozilla-central/source/xpcom/glue/FileUtils.h#157 What you've coded is fine. I just wanted to point this out for future reference (using the macro, you at least know it's coded properly).
Oh, if we have a macro in Mozilla code, that's nice. Although it doesn't check for EAGAIN (which isn't an issue here, but is an issue with some other syscalls). There's another chmod in this file that needs EINTR treatment. When we regain our sanity, we can convert them all to that macro.
Alex, if I understood correctly, you said in an e-mail that this could be tef+. Would you mind setting the flag?
(In reply to Justin Lebar [:jlebar] from comment #27) > Oh, if we have a macro in Mozilla code, that's nice. > > Although it doesn't check for EAGAIN (which isn't an issue here, but is an > issue with some other syscalls). > > There's another chmod in this file that needs EINTR treatment. When we > regain our sanity, we can convert them all to that macro. EAGAIN should only really apply to files which are opened non-blocking. I would like to believe that we're not using non-blocking I/O unless we absolutely need to, since non-blocking I/O implies polling. Using blocking I/O in conjunction with select/poll would be a better way to go. I'm just speaking in generalities here, but I figured I say my piece:)
We'll tef+ this for any remaining 1.0.1 downstream partners.
blocking-b2g: tef? → tef+
Dave, Justin, When this is uplifted, please add keywords: qawanted, verifyme. QA can work through verifying the fixes, assuming comment 5 is still the correct way to test it. if not, please add new steps to test as well. Thanks.
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → FIXED
Took some fixing up to land on b2g18, but it appears to build OK. You still might want to take a look to make sure everything works as intended. https://hg.mozilla.org/releases/mozilla-b2g18/rev/c952a123b902 https://hg.mozilla.org/releases/mozilla-b2g18_v1_0_1/rev/bb73520e9dce
Keywords: qawanted
QA Contact: jsmith
Confirmed working using the STR in comment 5 with my Inari 6/4 b2g18 build. I got my memory report successfully from my device and viewed it in the browser.
Also confirmed working on a partner customized 1.01 Ikura build from 6/4. I got my memory report successfully from my device and viewed it in the browser.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: