Bug 807163
Opened 12 years ago
Closed 9 years ago
logcat chassis 6 panda
Categories
(Testing :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: van, Unassigned)
Details
(Whiteboard: [reit-panda])
Attachments
(5 files)
The following pandas in chassis 6 are bad and I've attached the logcat for them. panda-0073 panda-0075 panda-0076 panda-0080 panda-0081
Reporter
Comment 1•12 years ago
Reporter
Comment 2•12 years ago
Reporter
Comment 3•12 years ago
Reporter
Comment 4•12 years ago
Comment 5•12 years ago
Carrying forward:
(In reply to Van Le [:van] from comment #14)
> I'm going to work on one chassis at a time and open bugs for the pandas I
> can't get to come online.
>
> Chassis 6: I was NOT able to get the following pandas online after multiple
> SD card swaps, reboots, etc... I have opened bug 807163 and attached the
> logcat for the pandas.
>
> panda-00[73,75,76,80,81]
Comment 6•12 years ago
Can someone attach logs for a couple of "good" pandas, for comparison?
Comment 7•12 years ago
"adb shell dumpsys" output from good and bad devices may also provide some clues.
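A minimal sketch of that good/bad comparison, assuming `adb` is on the PATH and both devices are reachable by serial. The helper names `dumpsys` and `diff_dumps` are hypothetical, and the optional service argument is an assumption about which part of the dump is worth narrowing to.

```python
import difflib
import subprocess


def dumpsys(serial, service=None):
    """Capture `adb shell dumpsys` output from one device (hypothetical helper)."""
    cmd = ["adb", "-s", serial, "shell", "dumpsys"]
    if service:
        # Optionally narrow the dump to a single service to cut down on noise.
        cmd.append(service)
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def diff_dumps(good, bad, good_name="good", bad_name="bad"):
    """Return unified-diff lines between two dumps given as plain text."""
    return list(
        difflib.unified_diff(
            good.splitlines(),
            bad.splitlines(),
            fromfile=good_name,
            tofile=bad_name,
            lineterm="",
        )
    )
```

Something like `diff_dumps(dumpsys(good_serial), dumpsys(bad_serial))` would then list the differing lines. Full dumps contain timestamps and counters, so expect noise unless the dump is narrowed to one service.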
Comment 8•12 years ago
This looks pretty telling:
E/EthernetStateMachine( 1397): DhcpHandler: DHCP request failed: Timed out waiting for dhcpcd to start
We should try to find what command line it's using there and run it ourselves to debug further.
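To check whether every attached logcat shows the same failure, a small scanner along these lines might help. It is only a sketch: it assumes the logs are in logcat's brief format (`priority/tag(pid): message`, as in the line quoted in this comment), and the function name `find_dhcp_failures` is hypothetical.

```python
import re

# Brief logcat format: "<priority>/<tag>(<pid>): <message>"
LOGCAT_BRIEF = re.compile(
    r"^(?P<prio>[VDIWEF])/(?P<tag>[^(]+)\(\s*(?P<pid>\d+)\):\s?(?P<msg>.*)$"
)


def find_dhcp_failures(lines):
    """Return (tag, pid, message) for error-level lines reporting a DHCP failure."""
    hits = []
    for line in lines:
        match = LOGCAT_BRIEF.match(line.strip())
        if not match:
            continue  # not a brief-format logcat line
        if match["prio"] in ("E", "F") and "DHCP request failed" in match["msg"]:
            hits.append((match["tag"].strip(), int(match["pid"]), match["msg"]))
    return hits
```

Running it over each attachment (for example `find_dhcp_failures(open(path))`) would show whether all five pandas hit the same dhcpcd timeout or only some of them.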
Comment 9•12 years ago
It looks like the daemon is started by setting the 'ctl.start' property to something like 'dhcpcd_eth0:eth0' and then waiting on property 'init.svc.dhcpcd_eth0' to be set. It's timing out while waiting on that property. See dhcp_utils.c in the Linaro Android source.
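The set-then-wait pattern described here can be reproduced from a host as a rough debugging aid. This is a sketch under assumptions, not the code in dhcp_utils.c: the property names are the ones given in this comment, while the expected value "running", the 10-second timeout, the poll interval, and the helper names are all assumptions.

```python
import subprocess
import time


def adb_getprop(serial, name):
    """Read one system property from a device via `adb shell getprop`."""
    result = subprocess.run(
        ["adb", "-s", serial, "shell", "getprop", name],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip()


def wait_for_property(get_prop, name, desired, timeout_s=10.0, poll_s=0.2,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll get_prop(name) until it equals desired; return False on timeout."""
    deadline = clock() + timeout_s
    while True:
        if get_prop(name) == desired:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

With a root-capable shell, one could run `adb -s <serial> shell setprop ctl.start dhcpcd_eth0:eth0` and then call `wait_for_property(lambda n: adb_getprop(serial, n), "init.svc.dhcpcd_eth0", "running")`. A `False` result would mirror the timeout seen in the logcat; what the property holds at that point is the interesting part.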
Comment 10•12 years ago
Do these machines fail on a consistent basis or is it just sometimes?
Reporter
Comment 11•12 years ago
:snorp, the ones in this bug are the ones I was unable to get online after several reboots/SD card swaps/reimages.
Van
Comment 12•12 years ago
Is there any update on the status of these pandas?
Comment 13•12 years ago
Should we move these pandas to rack #10 and see if we can re-image them with mozpool? Removing the block on bug 799698.
No longer blocks: 799698
Comment 14•10 years ago
Can we close this bug? 14 months with no traction.
Comment 15•10 years ago
We shouldn't just close the bug -- we need to make a decision on these hardware units. The two choices I see listed are:
a) :armenzg in comment 13: physically move the units to another location for further investigation. That would be 2 new bugs (1 to move, the other for investigation).
b) my suggestion to declare them BER (beyond economic repair) and file a bug to decom them.
After that decision is made and those bugs are filed, we can close this one.
Callek: do you know enough about the current state to make the call?
Flags: needinfo?(bugspam.Callek)
Comment 16•10 years ago
I don't know enough to identify how difficult/costly a repair is. I do know, however, that we are well over our current capacity needs with regard to pandas. So my recommendation is:
(a) store the pandas in a "possibly damaged" box, leaving asset tags in place, without sdcards, and document this bug number in inventory as a reference point in case we need to spend time on recovery in the future
(b) close this bug.
Any human effort to recover from unknown panda device [hardware] conditions is, imo, not worth it at this time.
Flags: needinfo?(bugspam.Callek)
Comment 17•10 years ago
Okay, I like minimizing human effort -- so let's leave them in the chassis if :dividhex agrees that won't hurt. That'd save pulling, boxing, and then moving that box to scl3.
Jake: I think this is our first decom-some-pandas-in-chassis case. Can you advise on:
- whether it's okay to leave them physically in the chassis
- how inventory should be marked so we won't get confused
- any other needed changes to the procedure in comment 16
Flags: needinfo?(jwatkins)
Comment 18•10 years ago
(In reply to Hal Wine [:hwine] (use needinfo) from comment #17)
> Okay, I like the minimize human effort -- so lets leave them in chassis if
> :dividhex agrees that won't hurt. That's save pulling, boxing, and then
> moving that box to scl3
>
> Jake: I think this is our first decom-some-pandas-in-chassis case. Can you
> advise on:
> - whether it's okay to leave physically in the chassis
> - how inventory should be marked so we won't get confused.
> - any other needed changes to procedure in comment 16

It is perfectly fine to leave them in the chassis until they can be properly decommed and removed. Not sure how to mark them in inventory since they aren't actually decommed (removed from chassis). Maybe "error/service." I also think it is a good idea to note the bug# in inventory.

My bigger question is: are these really "bad" pandas? I can't find the original reason they were considered bad and needed dcops to look at them to begin with. This bug is also extremely old and I believe it was filed during the smoke test era. The environment around it has also changed a lot since then, e.g. mozpool upgrades, psu adjustments, ethernet cable reseating, the chassis being moved to a different pod, etc. It is possible the original interpretation of them as "bad" came from external influence. They all pass the basic mozpool selftest, so is there any reason not to just add these back to the pool and see if they attract sheriff attention?
Flags: needinfo?(jwatkins)
Reporter
Comment 19•9 years ago
No tracking for 1.5 years, going to close. Let me know if this is still relevant.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED