Closed
Bug 797629
Opened 12 years ago
Closed 12 years ago
Image 12 pandas for B2G by hand by 10/9/12 (Chassis 3)
Categories
(Infrastructure & Operations :: DCOps, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: cmtalbert, Unassigned)
References
Details
(Whiteboard: [reit-panda])
Attachments
(1 file)
1.04 KB, application/x-sh
In order to test the rest of our automation setup for B2G on pandas in the real chassis configuration (so we have PDUs available to us) we would like to image one box of pandas with a B2G build.
The image-by-hand process involves a data center trip. Here's how it works:
1. I'll image 12 sdcards
2. I'll deliver 12 sdcards to either dividehex or someone in DCOps
3. That someone puts the 12 sdcards into the pandas in the box that we're going to use for this and power cycles them.
Then we just need to know their IP addresses and we can start working with them. This way we can ensure the rest of the automation infrastructure (mozharness scripts, devicemanagement library, test runners etc) all work well with the B2G pandas so that when we are ready to go with the automatic flashing we can have confidence in the rest of the B2G panda automation stack.
I should have these sdcards tomorrow (10/4) in mountain view, let me know where to drop them off.
OK, the AOSP kernel that we are using for current panda builds does not allow us to persist MAC addresses. That means that until we solve that problem, I'll need to use fixed MAC addresses for these pandas. Can I get 12 IP addresses to fix these pandas to?
If I could get the names of the pandas as well I'll be sure to label the 12 sd cards with the corresponding panda they are destined for.
I have the image ready to go otherwise and will purchase the 12 cards this afternoon.
Updated•12 years ago
Whiteboard: [reit-panda]
Comment 2•12 years ago
If you need further instruction on how to image the panda SD cards, please sync up with Jake.
Assignee: server-ops-releng → server-ops
Component: Server Operations: RelEng → Server Operations: DCOps
QA Contact: arich → dmoore
Comment 3•12 years ago
Clint, I just talked with Jake, and we're going to allocate chassis 3 to this. Those include:
panda-relay-03 IN A 10.12.52.135
panda-034 IN A 10.12.52.136
panda-035 IN A 10.12.52.137
panda-036 IN A 10.12.52.138
panda-037 IN A 10.12.52.139
panda-038 IN A 10.12.52.140
panda-039 IN A 10.12.52.141
panda-040 IN A 10.12.52.142
panda-041 IN A 10.12.52.143
panda-042 IN A 10.12.52.144
panda-043 IN A 10.12.52.145
panda-044 IN A 10.12.52.146
panda-045 IN A 10.12.52.147
Please be sure to label the SD cards so that DCOps can insert them into the proper pandas.
Comment 4•12 years ago
Clint: also, DCOps should already have SD cards that you can use. We bought them with the pandas.
Comment 5•12 years ago
Sorry, those are attached to
panda-relay-04 IN A 10.12.52.148
Updated•12 years ago
colo-trip: --- → mtv1
Comment 6•12 years ago
I've done the renaming for all of the pandas and the relays to further zero pad them and to make sure that a relay board number matches up with the chassis number.
So, chassis 3 has:
panda-relay-003 IN A 10.12.52.135
panda-0034 IN A 10.12.52.136
panda-0035 IN A 10.12.52.137
panda-0036 IN A 10.12.52.138
panda-0037 IN A 10.12.52.139
panda-0038 IN A 10.12.52.140
panda-0039 IN A 10.12.52.141
panda-0040 IN A 10.12.52.142
panda-0041 IN A 10.12.52.143
panda-0042 IN A 10.12.52.144
panda-0043 IN A 10.12.52.145
panda-0044 IN A 10.12.52.146
panda-0045 IN A 10.12.52.147
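The zero padding described in this comment can be sketched as a small script. This is only an illustration, assuming the rule is "relays get three digits, pandas get four"; the helper name `zero_pad_hostname` is hypothetical, and the sketch does not model any renumbering of relay boards to match chassis numbers.

```python
import re


def zero_pad_hostname(name: str) -> str:
    """Widen the numeric suffix: relays to 3 digits, pandas to 4 (assumed rule)."""
    match = re.fullmatch(r"(panda(?:-relay)?)-(\d+)", name)
    if match is None:
        raise ValueError(f"unexpected hostname: {name!r}")
    prefix, number = match.groups()
    width = 3 if prefix == "panda-relay" else 4
    return f"{prefix}-{int(number):0{width}d}"


# Chassis 3 as it was listed before the rename (comment 3).
old_names = ["panda-relay-03"] + [f"panda-{n:03d}" for n in range(34, 46)]
for old in old_names:
    print(f"{old} -> {zero_pad_hostname(old)}")
```

The same mapping could be used to print labels for the 12 SD cards so each one ends up in the intended panda.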
Comment 7•12 years ago
Hi Jake,
Can we schedule time with you on Monday when you're in MTV so you can run us through the imaging process?
Thanks,
Van
Updated•12 years ago
colo-trip: mtv1 → scl1
Comment 8•12 years ago
Van: Clint should be handing you pre-imaged cards for these, from what I understand. I think you just have to put them in the appropriate pandas.
Comment 9•12 years ago
Clint,
Are these SD cards imaged and ready to be picked up by DCOps?
Thanks,
Van
Reporter
Comment 10•12 years ago
Handed Vinh an sdcard that was imaged and instructions on how to image the rest of the cards since he graciously offered to handle it.
Big thanks to everyone here.
Updated•12 years ago
Summary: Image 12 pandas for B2G by hand by 10/9/12 → Image 12 pandas for B2G by hand by 10/9/12 (Chassis 3)
Comment 11•12 years ago
update: We were able to image the SD cards by specifying 100MB block size when dd'ing. If we didn't specify that block size, there would be a 1 block parity error and partitions would be missing. It still takes about 20+ minutes per reimage.
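For anyone repeating this, a block-wise copy followed by a read-back check can be sketched as below. It is a rough stand-in for `dd` with a 100 MB block size, not the procedure DCOps actually used, and it does not explain the parity error; the function names and paths are hypothetical.

```python
import hashlib
import os


def write_image(src_path: str, dst_path: str, block_size: int = 100 * 1024 * 1024) -> int:
    """Copy src to dst in fixed-size blocks (similar to dd bs=100M); return bytes written."""
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            dst.write(block)
            written += len(block)
        dst.flush()
        os.fsync(dst.fileno())  # push buffered data out before reading back
    return written


def sha256_prefix(path: str, length: int, block_size: int = 1024 * 1024) -> str:
    """Hash only the first `length` bytes, since the card may be larger than the image."""
    digest = hashlib.sha256()
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            block = f.read(min(block_size, remaining))
            if not block:
                break
            digest.update(block)
            remaining -= len(block)
    return digest.hexdigest()
```

Usage would look like `n = write_image("b2g.img", "/dev/sdX")` followed by comparing `sha256_prefix("b2g.img", n)` with `sha256_prefix("/dev/sdX", n)`, where both paths are placeholders. A mismatch would suggest the card was not written completely.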
However we are running into a new issue with the board not completely booting up.
Logs when serial dongle is attached to a panda board:
[ 17.026824] ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 17.132995] request_suspend_state: wakeup (0->0) at 17118347171 (1970-01-01 00:00:17.114929203 UTC)
shell@android:/ $
shell@android:/ $ netcfg
/system/bin/sh: netcfg: cannot execute - Permission denied
126|shell@android:/ $ sudo netcfg
/system/bin/sh: sudo: not found
[ 87.315307] request_suspend_state: sleep (0->3) at 87300659182 (1970-01-01 00:01:27.297241214 UTC)
[ 87.326263] DSSCOMP: dsscomp_early_suspend
[ 87.345855] DSSCOMP: blanked screen
:ctalbert, I am curious whether the panda boards in chassis 3 are any different from the one on your desk. If they aren't, could they be facing the same issue that came up with the SD card imaged on OS X? You were able to toy with it to make it see eth0 when it came up as wlan0.
Comment 12•12 years ago
Per discussion with :ctalbert in email, he's willing to share this chassis with releng to unblock panda-for-android foopy work.
Please reserve & set up 6 of the boards per Clint's original request, and 6 with the image from bug 769428 comment 15, even though there are problems as documented in bug 798519 comment 2.
Comment 13•12 years ago
We've configured the SD cards but the nodes still aren't coming up.
This is the only line we edited:
( ifconfig eth0 10.12.52.139 netmask 255.255.252.0 up ) & sleep 5
Please confirm the network.sh file. (This one belongs to panda-0037).
Comment 14•12 years ago
Comment 15•12 years ago
I tried changing the word "dhcp" to "static" in network.sh hoping that would work, but it didn't. When I am on serial, I still can't get the netcfg command to work.
shell@android:/ $ netcfg
/system/bin/sh: netcfg: cannot execute - Permission denied
Reporter
Comment 16•12 years ago
That looks correct.
Ensure that you made the network.sh file executable: chmod 755 /system/bin/network.sh
Reporter
Comment 17•12 years ago
Sorry for the double comment, but I wanted to be clear. The patch in comment 14 is correct. Changing "dhcp" to "static" doesn't seem to work on B2G's Linux system. I think the issue you hit in comment 15 is due to network.sh not being executable.
Comment 18•12 years ago
clint, network.sh is executable and it's still not working.
The file I attached may have different permissions because it was a copy/paste I did onto a USB drive to attach it to this ticket.
[ 16.719757] ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 17.110778] request_suspend_state: wakeup (0->0) at 17095916751 (1970-01-01 00:00:17.090271000 UTC)
[ 17.365447] eth0: no IPv6 routers present
[ 21.806427] init: untracked pid 121 exited
[ 87.172760] request_suspend_state: sleep (0->3) at 87157989503 (1970-01-01 00:01:27.151763918 UTC)
[ 87.182861] DSSCOMP: dsscomp_early_suspend
[ 87.212829] DSSCOMP: blanked screen
Comment 19•12 years ago
clint,
this is what I see when I cat data/local/netcf.log
output of netcfg:
lo UP 127.0.0.1/8 0x00000049 00:00:00:00:00:00
ifb0 DOWN 0.0.0.0/0 0x00000082 2e:7b:bd:29:38:b1
ifb1 DOWN 0.0.0.0/0 0x00000082 6a:94:8e:93:82:92
sit0 DOWN 0.0.0.0/0 0x00000080 00:00:00:00:00:00
ip6tnl0 DOWN 0.0.0.0/0 0x00000080 00:00:00:00:00:00
eth0 UP 0.0.0.0/0 0x00001043 92:92:cf:eb:f9:b3
wlan0 UP 0.0.0.0/0 0x00001043 de:ad:be:ef:00:00
running ifconfig
lo UP 127.0.0.1/8 0x00000049 00:00:00:00:00:00
ifb0 DOWN 0.0.0.0/0 0x00000082 2e:7b:bd:29:38:b1
ifb1 DOWN 0.0.0.0/0 0x00000082 6a:94:8e:93:82:92
sit0 DOWN 0.0.0.0/0 0x00000080 00:00:00:00:00:00
ip6tnl0 DOWN 0.0.0.0/0 0x00000080 00:00:00:00:00:00
eth0 UP 10.12.52.138/22 0x00001043 92:92:cf:eb:f9:b3
wlan0 UP 0.0.0.0/0 0x00001003 de:ad:be:ef:00:00
ran
USER PID PPID VSIZE RSS WCHAN PC NAME
root 103 1 768 372 c004f96c 4005ae74 S /system/bin/sh
root 134 103 3284 784 c0118f70 40084438 S sutagent
insut
USER PID PPID VSIZE RSS WCHAN PC NAME
root 103 1 768 372 c004f96c 4005ae74 S /system/bin/sh
root 134 103 3284 784 c0118f70 40084438 S sutagent
insut
USER PID PPID VSIZE RSS WCHAN PC NAME
root 103 1 768 372 c004f96c 4005ae74 S /system/bin/sh
root 134 103 3284 784 c0118f70 40084438 S sutagent
insut
USER PID PPID VSIZE RSS WCHAN PC NAME
root 103 1 768 372 c004f96c 4005ae74 S /system/bin/sh
root 134 103 3284 784 c0118f70 40084438 S sutagent
insut
the last line repeats over and over.
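To compare the interface state before and after ifconfig runs, the netcfg output in this log can be parsed mechanically. The sketch below assumes each interface is reported as five whitespace-separated fields (name, state, address/prefix, flags, MAC), which is inferred from the paste above rather than from any netcfg documentation; `Interface` and `parse_netcfg` are hypothetical names.

```python
from typing import NamedTuple


class Interface(NamedTuple):
    name: str
    state: str
    address: str  # "ip/prefix" exactly as printed by netcfg
    flags: int
    mac: str


def parse_netcfg(text: str) -> list[Interface]:
    """Split netcfg output into records of five fields each (assumed layout)."""
    tokens = text.split()
    if len(tokens) % 5 != 0:
        raise ValueError("expected five fields per interface")
    return [
        Interface(name, state, address, int(flags, 16), mac)
        for name, state, address, flags, mac in zip(*[iter(tokens)] * 5)
    ]


sample = (
    "lo UP 127.0.0.1/8 0x00000049 00:00:00:00:00:00 "
    "eth0 UP 10.12.52.138/22 0x00001043 92:92:cf:eb:f9:b3"
)
eth0 = next(i for i in parse_netcfg(sample) if i.name == "eth0")
print(eth0.state, eth0.address)
```

Because the parser works on whitespace-separated tokens, it handles the output whether it arrives on one line or one interface per line.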
Reporter
Comment 20•12 years ago
Interesting. That's what the output said when I tested it here locally, and I could ping the device. It looks like the eth0 interface is up and running with the network address that you provided to it. At this point I'm not sure at all why it isn't working. The Ethernet cable is connected, right? (Sorry for asking the obvious question, but I'm running out of ideas.)
Comment 21•12 years ago
I figured out what might be causing the issue. The netmask is hard coded as a /22 regardless of what netmask is configured in network.sh. We need it to be a /21.
lo UP 127.0.0.1/8 0x00000049 00:00:00:00:00:00
ifb0 DOWN 0.0.0.0/0 0x00000082 32:76:03:6d:94:6c
ifb1 DOWN 0.0.0.0/0 0x00000082 2a:41:d1:bd:cc:67
sit0 DOWN 0.0.0.0/0 0x00000080 00:00:00:00:00:00
ip6tnl0 DOWN 0.0.0.0/0 0x00000080 00:00:00:00:00:00
eth0 UP 10.12.52.137/22 0x00001043 8e:8e:fe:92:c6:85
wlan0 UP 0.0.0.0/0 0x00001003 de:ad:be:ef:00:00
Van
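The difference between the two prefix lengths can be checked with Python's standard `ipaddress` module. This is just the arithmetic, not a diagnosis: 255.255.252.0 corresponds to a /22, and a /21 would be written 255.255.248.0. The address 10.12.48.1 below is a made-up example of a host that only the /21 would treat as local; the real gateway address is not stated in this bug.

```python
import ipaddress


def prefix_from_netmask(netmask: str) -> int:
    """Convert a dotted-quad netmask to its prefix length."""
    return ipaddress.ip_network(f"0.0.0.0/{netmask}").prefixlen


print(prefix_from_netmask("255.255.252.0"))            # 22
print(ipaddress.ip_network("10.12.48.0/21").netmask)   # 255.255.248.0

iface22 = ipaddress.ip_interface("10.12.52.137/22")
iface21 = ipaddress.ip_interface("10.12.52.137/21")
print(iface22.network)  # 10.12.52.0/22
print(iface21.network)  # 10.12.48.0/21

# Hypothetical neighbour, chosen only to show the difference between the masks.
neighbour = ipaddress.ip_address("10.12.48.1")
print(neighbour in iface22.network)  # False
print(neighbour in iface21.network)  # True
```

With a /22 the board would consider anything below 10.12.52.0 to be off-link, which would be consistent with the symptoms if required hosts sit in the lower half of the /21.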
Comment 22•12 years ago
What is the status in here?
Thanks in advance.
Comment 23•12 years ago
Armen: the image doesn't work as is (it overrides the netmask). I think Tom Zimmerman may have a working image soon that fixes these issues, but I defer to him.
Comment 24•12 years ago
Thanks Amy!
Hi Thomas, is bug 798427 (which got fixed) what is giving issues on this bug?
Comment 25•12 years ago
Hi
(In reply to Van Le [:van] from comment #15)
> I tried changing the word "dhcp" to "static" in network.sh hoping that would
> work, but it didnt. When I am on serial, I still cant get the netcfg command
> to work.
>
> shell@android:/ $ netcfg
> /system/bin/sh: netcfg: cannot execute - Permission denied
Oh, I made this mistake as well. When you log in over the serial console, you are user 'shell', which has almost no permissions. You are root if you
1) log in via 'adb shell'
or
2) (iirc) change user and group to 'root' in lines 350/351 of the file system/core/rootdir/init.rc, and re-build.
You should be able to run any commands now.
Comment 26•12 years ago
(In reply to Armen Zambrano G. [:armenzg] from comment #24)
> Thanks Amy!
>
> Hi Thomas, is bug 798427 (which got fixed) what is giving issues on this bug?
Hmm, I don't see how. The fixes for this bug have nothing to do with IP addressing. They only make sure that each Ethernet NIC gets a constant MAC address.
I left a comment about getting root access on the PandaBoard. Could someone try this? If you cannot fix the problem this way, I'll take a look if this is kernel-related.
Comment 27•12 years ago
Thanks Thomas for the info.
Van, does this info help? When could this be tried?
Comment 28•12 years ago
All,
DCOps does not yet have the proper equipment for running the adb tools or rebuilding images. We're picking up some Linux laptops for this purpose, but it will be several days (at best) before we're equipped to do this for you.
Comment 29•12 years ago
Hi,
I am setting up my Linux laptop with adb tools. Which Android SDK Platform do I install to work with these Panda boards? Is there any harm installing all the SDK Platforms?
Van
Comment 30•12 years ago
Hi
> I am setting up my Linux laptop with adb tools. Which Android SDK Platform
> do I install to work with these Panda boards? Is there any harm installing
> all the SDK Platforms?
Should be straightforward. I downloaded the latest SDK [1] and unpacked it somewhere into my home directory. Adb and other useful tools are located in the sub-directory platform-tools/, which I added to my PATH variable.
Thomas
[1] http://dl.google.com/android/android-sdk_r20.0.3-linux.tgz
Comment 31•12 years ago
van, what is the status in here? Thanks in advance.
I think this should block bug 805016.
Comment 32•12 years ago
This bug does not block bug 805016. That imaging process is specific to Android, whereas this bug is specifically for B2G.
Comment 33•12 years ago
Van is onsite heading up some high-priority work in our Phoenix datacenter. He'll be back in scl1 on 10/26.
Comment 34•12 years ago
:armenzg, I am not sure what update you are looking for. This is for B2G, and the issue we are currently facing is that even though I change the netmask in the script network.sh, the image is hard coded or is overriding my change and keeping it as a /22.
Van
Comment 35•12 years ago
Hi van,
Thanks for the clarification.
Would this mean that the A-team or someone else needs to get you an image that does not override it?
BTW did you get to try what Thomas suggested? (see comment below). I could have misunderstood the comment or missed something. Please correct me if I got it wrong.
(In reply to Thomas Zimmermann from comment #25)
> Hi
>
> (In reply to Van Le [:van] from comment #15)
> > I tried changing the word "dhcp" to "static" in network.sh hoping that would
> > work, but it didnt. When I am on serial, I still cant get the netcfg command
> > to work.
> >
> > shell@android:/ $ netcfg
> > /system/bin/sh: netcfg: cannot execute - Permission denied
>
> Oh, I made this mistake as well. When you login over serial console, you are
> user 'shell', which has almost no permissions. You are root if you
>
> 1) login via 'adb shell'
>
> or
>
> 2) (iirc) change user and group to 'root' in lines 350/351 of the file
> system/core/rootdir/init.rc, and re-build.
>
> You should be able to run any commands now.
Comment 36•12 years ago
Hi Armen,
I believe Thomas's comments were to show me how to get access to run commands and to capture any logs you guys would want. This doesn't resolve the issue we're stuck on.
Thanks,
Van
Reporter
Comment 37•12 years ago
Status update: mdas and I are working with tzimmerman to get his new build working on a pandaboard.
The build is currently broken, but we have verified that the kernel MAC address fixes worked: the panda now has a consistent MAC address with the new kernel. Once we sort out why the Gecko code isn't landing on the pandaboard after flashing, we can make a new image that you can use on the boards.
The instructions once you have the new image are pretty simple:
1. Download image file and unzip it.
2. dd the image file to one sdcard
3. Duplicate the sdcard image to the other pandas
4. Boot the pandas.
We shouldn't need to muck with the network.sh file anymore.
Thanks for your patience.
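Step 1 above can be done without holding the whole image in memory. The sketch below assumes the download is gzip-compressed (a `.img.gz` file); the function name and file names are hypothetical, and writing the result to an SD card remains a separate dd step.

```python
import gzip
import shutil


def decompress_image(gz_path: str, img_path: str, chunk_size: int = 4 * 1024 * 1024) -> None:
    """Stream-decompress a gzip-compressed image to a raw .img file."""
    with gzip.open(gz_path, "rb") as src, open(img_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)
```

For example, `decompress_image("b2g-panda.img.gz", "b2g-panda.img")` would leave a raw image ready to be written to the first card.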
Reporter
Comment 38•12 years ago
The new image is here: http://people.mozilla.org/~ctalbert/b2g-panda-16gb.img.gz
However, it only seems to boot on half the pandas we have here. Mdas and I spent two days trying to debug this and we can't figure it out. I've opened bug 806096 to track this issue. In the meantime, please try to image the sdcards of the pandas in chassis 3 with this image and let's see how many of them boot. I expect 50% of them to work because that's what we're seeing here.
At least if we can get some to work, then we can continue working on other elements of the automation.
Comment 39•12 years ago
|
||
Clint, do I have to specify a block size when I dd the image? And to confirm, these will get their IPs through DHCP?
Reporter
Comment 40•12 years ago
(In reply to Van Le [:van] from comment #39)
> Clint, do I have to specify a block size when I dd the image? And to
> confirm, these will get their IPs through DHCP?
I imaged it with both bs=100M and with no block size specified at all, both worked (as well as this image works at all).
And yes, these pandas will have static MAC addresses and so will be able to get their static IP addresses via DHCP.
Comment 41•12 years ago
I was able to get all the pandas up in this chassis with the new b2g image except panda-0040. I tried several SD cards/reimages to no avail.
[root@admin1]~# fping panda-00{34..45}.build.scl1.mozilla.com
panda-0034.build.scl1.mozilla.com is alive
panda-0035.build.scl1.mozilla.com is alive
panda-0036.build.scl1.mozilla.com is alive
panda-0037.build.scl1.mozilla.com is alive
panda-0038.build.scl1.mozilla.com is alive
panda-0039.build.scl1.mozilla.com is alive
panda-0041.build.scl1.mozilla.com is alive
panda-0042.build.scl1.mozilla.com is alive
panda-0043.build.scl1.mozilla.com is alive
panda-0044.build.scl1.mozilla.com is alive
panda-0045.build.scl1.mozilla.com is alive
panda-0040.build.scl1.mozilla.com is unreachable
Van
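When this check is repeated across chassis, the fping output can be sorted into reachable and unreachable hosts with a few lines of script. This is a sketch that assumes fping's "HOST is alive" / "HOST is unreachable" line format as seen in the paste above; `split_fping_results` is a hypothetical helper.

```python
def split_fping_results(output: str) -> tuple[list[str], list[str]]:
    """Return (alive, unreachable) host lists from fping's per-host status lines."""
    alive, unreachable = [], []
    for line in output.splitlines():
        host, sep, status = line.strip().partition(" is ")
        if not sep:
            continue  # skip prompts, blank lines, and anything without a status
        if status == "alive":
            alive.append(host)
        elif status == "unreachable":
            unreachable.append(host)
    return alive, unreachable


sample = """\
panda-0039.build.scl1.mozilla.com is alive
panda-0041.build.scl1.mozilla.com is alive
panda-0040.build.scl1.mozilla.com is unreachable
"""
alive, unreachable = split_fping_results(sample)
print("needs attention:", unreachable)
```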
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Updated•12 years ago
Assignee: server-ops → server-ops-dcops
Updated•10 years ago
Product: mozilla.org → Infrastructure & Operations