Closed Bug 1048024 Opened 10 years ago Closed 10 years ago

b2g-info and get_about_memory.py are broken ("Two B2G main processes found", "Two copies of b2g process found?")

Categories

(Firefox OS Graveyard :: GonkIntegration, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: hub, Assigned: ting)

References

Details

(Keywords: perf, qablocker, regression, Whiteboard: [c=automation p= s=2014.08.15.t u=])

Attachments

(2 files, 1 obsolete file)

49 bytes, text/x-github-pull-request
dhylands
: review+
Details | Review
43 bytes, text/x-github-pull-request
dhylands
: review+
Details | Review
b2g-info is broken on flame.

My gonk-misc tree is at d61daef8fca7d6f335f659a8967bad423770e634

I did a full flash

When I do |adb shell b2g-info| I get

Fatal error: Two B2G main processes found (pids 311 and 349)

$ adb shell b2g-ps  
APPLICATION    SEC USER     PID   PPID  VSIZE  RSS     WCHAN    PC         NAME
b2g              0 root      311   1     181272 83252 ffffffff b6f4267c S /system/b2g/b2g
(Nuwa)           0 root      349   311   54572  14052 ffffffff b6f4267c S /system/b2g/b2g
Built-in Keyboa  2 u0_a902   902   349   64152  24492 ffffffff b6f4267c S /system/b2g/b2g
Homescreen       2 u0_a997   997   349   74424  29664 ffffffff b6f4267c S /system/b2g/b2g
Smart Collectio  2 u0_a1049  1049  311   70916  27568 ffffffff b6f2267c S /system/b2g/plugin-container
(Preallocated a  2 u0_a1050  1050  349   61496  19132 ffffffff b6f4267c S /system/b2g/b2g
Homescreen       2 u0_a1062  1062  311   70916  27388 ffffffff b6ecd67c S /system/b2g/plugin-container
(Preallocated a  2 u0_a1071  1071  349   61496  19140 ffffffff b6f4267c S /system/b2g/b2g



This break the performance testing.
Keywords: perf, regression
Whiteboard: [c=automation p= s= u=]
Very likely due to bug 977026
Depends on: 977026
Flags: needinfo?(tlee)
We need to fix this quickly or back out bug 977026.
tools/get_about_memory.py is similar broken:

> Exception: Two copies of b2g process found?
Summary: b2g-info broken on flame → b2g-info broken on flame ("Two B2G main processes found")
And I'm seeing it on Buri, so it's not just Flame. I suspect it's all devices.
Summary: b2g-info broken on flame ("Two B2G main processes found") → b2g-info and get_about_memory.py are broken ("Two B2G main processes found", "Two copies of b2g process found?")
Having looked at processinfo.cpp it seems that the detection of the B2G process is based solely on the /proc/PID/exe value.

But there was already a patch proposed in bug 977026 attachment 8414349 [details] [diff] [review]. Why did has not this landed ?
Flags: needinfo?(tchou)
Sorry about that, I will create PRs and ask for reviewing, please let me know if I should move the work here as bug 977026 has been closed.
Flags: needinfo?(tchou)
Attached file PR for b2g-info (obsolete) —
Thinker asked me to put PRs here.
Attachment #8466907 - Flags: review?(mwu)
Can you also fix the get_about_memory code?
Flags: needinfo?(tchou)
Sure, I am verifying the patches as they were created a while ago.
Flags: needinfo?(tchou)
Comment on attachment 8466907 [details] [review]
PR for b2g-info

I just noticed Built-in Keyboard and Homescreen still has /system/b2g/plugin-container as there /proc/[pid]/exe, needs some checking here.
Attachment #8466907 - Flags: review?(mwu) → review-
Attachment #8466907 - Attachment is obsolete: true
Talked to Thinker, it's because of bug 1033618 so there're chances the binary is plugin-container, I am recreating the patches.
Attached file PR for b2g-info
Attachment #8466928 - Flags: review?(mwu)
Attachment #8466931 - Flags: review?(dhylands)
ting is working on this.
Assignee: nobody → tchou
Flags: needinfo?(tlee)
(In reply to Nicholas Nethercote [:njn] from comment #4)
> And I'm seeing it on Buri, so it's not just Flame. I suspect it's all
> devices.

Not surprising. And given that we don't update gonk on Buri, it is gonna get trickier.
Attachment #8466928 - Flags: review?(mwu) → review?(dhylands)
Blocks: 1044297
[Blocking Requested - why for this release]:

Test blocker for performance & a regression.
blocking-b2g: --- → 2.1?
Keywords: qablocker
Blocks: 1047355
Comment on attachment 8466928 [details] [review]
PR for b2g-info

Sorry for the delay with this review (Mon was a stat holiday here and I was away from my computer).

I only had one minor comment, and that could be done in a followup bug.
Attachment #8466928 - Flags: review?(dhylands) → review+
Comment on attachment 8466931 [details] [review]
PR for get_about_memory

This looks reasonable.
Attachment #8466931 - Flags: review?(dhylands) → review+
(In reply to Dave Hylands [:dhylands] from comment #18)
> I only had one minor comment, and that could be done in a followup bug.

Please let me know your comment, should I create the followup bug then?
Oh, I(In reply to Ting-Yu Chou [:ting] from comment #21)
> (In reply to Dave Hylands [:dhylands] from comment #18)
> > I only had one minor comment, and that could be done in a followup bug.
> 
> Please let me know your comment, should I create the followup bug then?

Oh, I found it on github.
I updated my B2G repo and my copy of b2g-inbound and rebuilt and re-flashed. I'm still seeing this problem on my Buri.
Did you run ./repo sync?

Did you flash the gonk layer (note that I don't know if this is safe to do on buri)?
I only flashed gecko and gaia. I've never flashed gonk on my Buri.
What if replace just /system/bin/b2g-info?
I just did git |git pull origin master| and now tools/get_about_memory.py is working. But b2g-info is still broken. So there are still two b2g processes -- I guess that's intentional?
Originally b2g-info expects there's only one process has /system/b2g/b2g in its /proc/[pid]/exe, but as of bug 977026 landing, it is no more true. That's why b2g-info is complaining there are two b2g processes found.

The patch here tries to check not only /proc/[pid]/exe, but also /proc/[pid]/comm.
> What if replace just /system/bin/b2g-info?

What commands should I run to do that? Thanks.
This works on my Flame:

|adb root; adb remount; adb push out/target/product/.../system/bin/b2g-info /system/bin/b2g-info|

prbably you want to backup at first:

|adb pull /system/bin/b2g-info b2g-info.bak|
I'm still getting this even after flashing the entire device (nexus 5) and manually pushing b2g-info.
the tools/include/device_utils.py script is part of the B2G repo, which is NOT updated when you do a repo sync.

Did you do a git pull to update your B2G tree?

> git log -1 tools/include/device_utils.py

should show:

> commit 9b57f9d9bcc0dadc8a6671d30483477e6fe2e1b4
> Author: Ting-Yu Chou <janus926@gmail.com>
> Date:   Mon Aug 4 16:27:50 2014 +0800
> 
>     Bug 1048024 - Updates for forking B2G and Nuwa from the same process.

if it doesn't you probably didn't update your B2G tree.

Note: this particular problem (B2G tree not being updated) should be alleviated by 
https://groups.google.com/forum/#!topic/mozilla.dev.b2g/8skw-tE80as
Why does that matter for b2g-info?

I've confirmed that I have a b2g-info with Ting-Yu's changes.  I've also confirmed that the new binary is ending up on the device by changing the error message.  This is not fixed, at least for b2g-info on a Nexus 5.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #33)
> Why does that matter for b2g-info?
> 
> I've confirmed that I have a b2g-info with Ting-Yu's changes.  I've also
> confirmed that the new binary is ending up on the device by changing the
> error message.  This is not fixed, at least for b2g-info on a Nexus 5.

What does b2g-ps show on your nexus 5?
khuey@minbar:~/dev/B2G$ adb shell b2g-ps
APPLICATION    SEC USER     PID   PPID  VSIZE  RSS     WCHAN    PC         NAME
b2g              0 root      4719  1     236956 107492 ffffffff b6ede73c S /system/b2g/b2g
b2g              0 root      4736  4719  55356  5376  ffffffff b6ede908 S /system/b2g/b2g
Homescreen       2 u0_a4902  4902  4719  85724  36648 ffffffff b6e7273c S /system/b2g/plugin-container
Built-in Keyboa  2 u0_a5004  5004  4719  89540  33208 ffffffff b6eb673c S /system/b2g/plugin-container
Hmm. That means that the Nuwa process didn't rename itself to (Nuwa).

I knew that there would be a window where b2g-info would fail (the time period between when Nuwa was launched and when it renamed itself), but I was assuming it would be fairly small.
The current test implemented was:

> if my cmdline ends in b2g and my comm is b2g

If we changed this to:

> if my cmdline ends in b2g and my parents comm isn't b2g

then it would close the window, and should work for all of the cases I can think of (since init won't always be the parent - if you start b2g by manually running b2g.sh or start it using the debugger).
> This works on my Flame:
> 
> |adb root; adb remount; adb push out/target/product/.../system/bin/b2g-info
> /system/bin/b2g-info|

It's working on my Buri now. Thanks!
On my Nexus 5 it doesn't ever rename itself.  I guess I should figure out why.
njn says this is fixed so I'll open a new bug.
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Resolution: --- → FIXED
For what it's worth, I'm seeing the same thing on my flame (no process named Nuwa).

khuey - please cc me on the new bug. I'm working on a patch
(In reply to Dave Hylands [:dhylands] from comment #38)
> If we changed this to:
> 
> > if my cmdline ends in b2g and my parents comm isn't b2g
> 

With this test, if following processes are running, won't Homescreen and Preallocated be counted?

APPLICATION    SEC USER     PID   PPID  VSIZE  RSS     WCHAN    PC         NAME
b2g              0 root      299   1     163264 75416 ffffffff b6f4a67c S /system/b2g/b2g
(Nuwa)           0 root      338   299   54888  14040 ffffffff b6f4a67c S /system/b2g/b2g
Homescreen       2 u0_a868   868   338   73112  26492 ffffffff b6f4a67c S /system/b2g/b2g
(Preallocated a  2 u0_a975   975   338   61812  19196 ffffffff b6f4a67c S /system/b2g/b2g
You're right.

The patch I've actually implemented (for b2g-info) is this:

Determine the set of b2g processes by finding all processes whose exe is /system/b2g/b2g or /system/b2g/plugin-container.

For each b2g process, if the ppid is contained within the set, then consider it to be a child process. If the ppid is not contained within the set, then consider it to be the master process.

This covers arbitrarily deep hierarchies of processes.
Whiteboard: [c=automation p= s= u=] → [c=automation p= s=2014.08.15.t u=]
blocking-b2g: 2.1? → ---
Just saw this on a Flame flashed with:

Gaia-Rev        7c9e7cabbde941b976e0e40a3a1d94e21aa9c5e9
Gecko-Rev       https://hg.mozilla.org/mozilla-central/rev/62990ec7ad78
Build-ID        20141105040206
Version         36.0a1
Device-Name     flame
FW-Release      4.4.2
FW-Incremental  40
FW-Date         Tue Oct 21 15:59:42 CST 2014
Bootloader      L1TC10011880

Gonk is v188-1, memory limit is 319MB.

STR:

# adb shell b2g-info
Fatal error: Two B2G main processes found (pids 203 and 458)
# adb shell b2g-ps  
APPLICATION    SEC USER     PID   PPID  VSIZE  RSS     WCHAN    PC         NAME
b2g              0 root      203   1     211332 70348 ffffffff b6f058ac S /system/b2g/b2g
(Nuwa)           0 root      458   203   73100  10032 ffffffff b6f058ac S /system/b2g/b2g
Homescreen       2 u0_a952   952   458   102440 41028 ffffffff b6f058ac S /system/b2g/b2g
Find My Device   2 u0_a1220  1220  458   82912  21816 ffffffff b6f058ac S /system/b2g/b2g
Usage            2 u0_a1255  1255  203   93312  33628 ffffffff b6f438ac S /system/b2g/plugin-container
(Preallocated a  2 u0_a1419  1419  458   80340  18076 ffffffff b6f058ac S /system/b2g/b2g
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Have you done a full flash?  Seems like a problem with the base image.
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Resolution: --- → FIXED
v188 is 2.0 based so it comes without this fix for b2g-info. If you upgraded to 2.1 or later, you should have a full flash, not a shallow flash.
Hi Youlong, as requested in mail pls help include this patch into coming Flame base image release, thank you.
Flags: needinfo?(youlong.jiang)
verify with the latest base image v18B + v2.1 gaia/gecko shallow flash, it's fine
Flags: needinfo?(youlong.jiang)
You need to log in before you can comment on or make changes to this bug.