Closed Bug 1008219 Opened 10 years ago Closed 10 years ago

8 p3 pandas may have sd card issues

Categories

(Infrastructure & Operations :: DCOps, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kmoir, Unassigned)

Several of the pandas that were moved to scl3 are failing to verify and are in general behaving erratically.

The pandas that are causing problems are:
308, 310-315, 319

These ones are on my test foopy for my staging master and I've disabled them in slavealloc so I can get green results.  I haven't checked the ones attached to the other foopies that were moved.
Well that sounds ugly.
Are there some pandas (not 308, 310-315, 319) that behave well?
Or, are there any indications of what the problems might be?
Thanks.
:Kim, what do you mean by 'failing to verify'?  Can you elaborate?

First thing I would check is that the internal and external network cables are firmly seated in the ethernet coupler.  We should also try running the mozpool selftest.
(In reply to Jake Watkins [:dividehex] from comment #2)
> We should also try running the
> mozpool selftest.

Looks like someone beat me to it, but they are doing an Android install instead of a selftest.
Jake: I ran the selftest and they came up fine.  I'm wondering if this is just SD card corruption or an issue with the script.  We have 80 other pandas to test, though, so this shouldn't be a blocker for the move.

"failed to verify" is an error from a script outside of mozpool:

http://hg.mozilla.org/build/tools/file/420092165403/sut_tools/verify.py#l389
Summary: many p3 pandas failing to verify → 8 p3 pandas may have sd card issues
If the mozpool selftests pass and the pandas reach the ready state but still fall over while running Android, then the cause could be a loose net cable, a faulty SD card, or a faulty panda board.  The fixing steps should be done in this order.  If one doesn't work, move on to the next.

1. re-seat net cable on all ports (panda, both coupler sides, and switch)
2. replace sdcard
3. decomm/replace pandaboard
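
For anyone scripting the triage, the escalation order above can be sketched in Python as below. This is only an illustration and not part of mozpool or the build tools: the step labels, the `first_working_step` function and the `still_failing_after` callback are all hypothetical names, and the callback stands in for whatever check is used to decide that a panda is still falling over.

```python
from typing import Callable, Optional

# Ordered escalation ladder taken from the three steps listed above.
# The labels are free-form strings for illustration only.
REMEDIATION_STEPS = (
    "re-seat net cable on all ports (panda, both coupler sides, switch)",
    "replace SD card",
    "decomm/replace panda board",
)


def first_working_step(still_failing_after: Callable[[str], bool]) -> Optional[str]:
    """Walk the steps in order and stop at the first one that helps.

    still_failing_after(step) is a hypothetical callback: it should apply
    (or record) the step, re-test the panda, and return True if the panda
    is still failing afterwards.
    Returns the step that fixed the panda, or None if none of them did.
    """
    for step in REMEDIATION_STEPS:
        if not still_failing_after(step):
            return step
    return None
```

The point of the sketch is only the ordering: the cheaper fixes are tried first, and the board is replaced only after the cable and the SD card have been ruled out.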
colo-trip: --- → scl3
So we thought today that there might be an issue with outdated versions of code on the foopies.  (The ones that were recently imaged had a different rev.)  In any case, I updated them all today to the latest revision of tools and reran the tests on my master, and the same pandas still fell over.  So please go ahead and reset them etc. as per comment #5; the software update didn't fix anything.
All of the mentioned pandas had their cables reseated and SD cards swapped.  However, panda-0312 had a popped fuse.  Will replace the fuse next week.

:kmoir - Let me know if any of the pandas are still failing.
Running tests now on the pandas that had their cables reseated and SD cards swapped.
Pandas look much better now, thanks!
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Faulty fuse has been replaced along with SD card.  Panda-0312 is now in ready state.
Product: mozilla.org → Infrastructure & Operations