Closed Bug 803685 Opened 13 years ago Closed 12 years ago

failures when running mochitests on new pandas

Categories

(Testing :: General, defect)

x86
Android
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kmoir, Unassigned)

References

Details

(Whiteboard: [re-panda])

Attachments

(1 file)

Callek and I talked to wlach in #ateam and he said that this issue stems from su probably trying to call out to the Android superuser app and failing. He said he thought jmaher fixed that issue a while ago, so the possible causes are 1) the image on the pandas is old, or 2) there's a regression. I'm seeing this problem when running tests on pandas in chassis 2, 4 and 5. Also, is there a way to determine the version of the image on a panda?

Here's the log (Android Panda mozilla-central opt test mochitest-7):

python /builds/sut_tools/installApp.py 10.12.52.152 build/fennec-19.0a1.en-US.android-arm.apk org.mozilla.fennec
 in dir /builds/panda-0049/test/. (timeout 1200 secs)
 watching logfiles {}
 argv: ['python', '/builds/sut_tools/installApp.py', '10.12.52.152', 'build/fennec-19.0a1.en-US.android-arm.apk', 'org.mozilla.fennec']
 environment:
  HOME=/home/cltbld
  PATH=/tools/buildbot-0.8.4-pre-moz2/bin:/usr/local/bin:/usr/local/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/cltbld/bin
  PWD=/builds/panda-0049/test
  SUT_IP=10.12.52.152
  SUT_NAME=panda-0049
 using PTY: False
10/19/2012 12:32:48: INFO: copying build/fennec/application.ini to build/talos/remoteapp.ini
10/19/2012 12:32:48: DEBUG: calling [cp build/fennec/application.ini build/talos/remoteapp.ini]
10/19/2012 12:32:48: DEBUG: cp: cannot create regular file `build/talos/remoteapp.ini': No such file or directory
10/19/2012 12:32:48: INFO: connecting to: 10.12.52.152
reconnecting socket
10/19/2012 12:32:48: INFO: devroot /mnt/sdcard/tests
10/19/2012 12:32:48: INFO: /builds/panda-0049/test/../proxy.flg
10/19/2012 12:33:18: INFO: 10.12.52.34, 50049
10/19/2012 12:33:18: INFO: Current device time is 2012/10/19 19:33:18
10/19/2012 12:33:18: INFO: Setting device time to 2012/10/19 12:33:18
10/19/2012 12:38:19: WARNING: Exception while setting device time: Automation Error: Timeout in command settime 2012/10/19 12:33:18
Traceback (most recent call last):
  File "/builds/sut_tools/installApp.py", line 195, in <module>
    sys.exit(main(sys.argv))
  File "/builds/sut_tools/installApp.py", line 174, in main
    dm, devRoot = one_time_setup(ip_addr, path_to_main_apk)
  File "/builds/sut_tools/installApp.py", line 134, in one_time_setup
    getDeviceTimestamp(dm)
  File "/builds/tools/sut_tools/sut_lib.py", line 446, in getDeviceTimestamp
    ts = int(dm.getCurrentTime()) # epoch time in milliseconds
TypeError: int() argument must be a string or a number, not 'NoneType'
program finished with exit code 1
elapsedTime=632.992348
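For reference, a minimal sketch of how the call that blows up above could be guarded. This is not the actual sut_tools fix from bug 781341 / bug 797868, just an illustration: dm.getCurrentTime() evidently returns None when the agent command times out, so checking for that before the int() conversion avoids the TypeError.

import time

def get_device_timestamp(dm, retries=3, delay=5):
    # Hypothetical helper (not the real sut_lib.getDeviceTimestamp): retry the
    # SUT query a few times, since dm.getCurrentTime() returns None on timeout.
    for _ in range(retries):
        raw = dm.getCurrentTime()  # epoch time in milliseconds, or None
        if raw is not None:
            return int(raw)
        time.sleep(delay)          # give the agent a moment before retrying
    return None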
Blocks: 799698
This is an issue with the settime/gettime stuff. I saw this originally and I wonder why it is cropping up again. It could be related to su. I had made a series of modifications to sut_tools; the changes are captured in the patches in bug 781341 and bug 797868. While I am not sure those changes will fix the problem, I would make sure you are running a 'custom' version of sut_tools. Thanks for looking into this; we can figure this out without much trouble.
(In reply to Joel Maher (:jmaher) from comment #1)
> this is an issue with the settime/gettime stuff. I saw this originally and
> I wonder why this is cropping up again. It could be related to su.

To add to the details here, wlach had suggested we try |execsu id| from the SUTAgent prompt, which seemed to "hang" the connection: no output, no new prompt, and no SUT error. If we disconnected manually, we could reconnect via the SUTAgent port and get a new prompt, though.
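As a rough illustration of that debugging step (an assumption on my part, not something we ran exactly like this), the SUTAgent command port can be probed with a short socket timeout so that a hung command such as |execsu id| fails fast instead of blocking the session; the port number 20701 and the '$>' prompt are the usual SUT agent conventions and are assumed here.

import socket

def sut_command(ip, cmd, port=20701, timeout=30):
    # Hypothetical helper: send one command to the SUTAgent and read until the
    # next '$>' prompt, giving up after `timeout` seconds instead of hanging.
    s = socket.create_connection((ip, port), timeout=timeout)
    try:
        s.recv(1024)                 # consume the initial prompt
        s.sendall(cmd + "\r\n")
        data = ""
        while "$>" not in data:      # read until the next prompt (or timeout)
            chunk = s.recv(1024)
            if not chunk:
                break
            data += chunk
        return data
    finally:
        s.close()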
Also make sure you are running SUTAgent 1.14 (not 1.13, which the default scripts require/install).
Yes, I've been running SUTAgent 1.14. I applied the patches in bug 781341 and bug 797868, but now verify.py fails and the panda doesn't get listed as active, so I'm debugging that.
Attached file log from mochitests-8
We've gotten past the installApp and other test issues with recent test runs. This log is from a recent mochitest-8 run where other issues are causing orange tests; Callek and I weren't sure of the cause.
Summary: installApp failure when running mochitests on new pandas → failures when running mochitests on new pandas
From the log:

I//system/bin/fsck_msdos( 1283): Lost cluster chain at cluster 12342
I//system/bin/fsck_msdos( 1283): 1 Cluster(s) lost
I//system/bin/fsck_msdos( 1283): FIXED
I//system/bin/fsck_msdos( 1283): No space in LOST.DIR

Can we check the space on the file system? I believe the sdcards have enough space, but maybe a partition or two is full or corrupt.
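One possible way to answer that from a foopy, reusing the hypothetical sut_command() sketch from the earlier comment (running df through the agent's exec command is an assumption, not a verified recipe):

# Check filesystem usage on the device; eyeball /mnt/sdcard and /data.
print(sut_command("10.12.52.152", "exec df"))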
jmaher: The disk space was fine, as I showed you in IRC. I was able to rerun mochitest-8 and it failed again with the same error message listed in the log. However, on some pandas this same test suite was green. In fact, we have had green builds on pandas 50, 54, and 69 for mochitests 4, 6, 7 and 8, while on other panda devices these same tests fail. I haven't had a successful test run for mochitests 1, 2, 3, and 5, but am currently running more tests.
Whiteboard: [re-panda]
This is great news. There could be specific tests that are failing on 1, 2, 3, and 5. If we have some pandas working with the toolchain and image, we need to start keeping notes about what is different with the failing ones: maybe a different foopy, sdcard type, different chassis, etc. Even if we can only get M4 up and running on tbpl until we get more capacity, that will be a big start in the right direction. Thanks again for the hard work on bringing these to life!
Looking at the test results from the past 12 hours, we now have passing tests on all mochitest suites except 1 and 2. I made a list of pandas with failing tests versus passing tests, and it seems the same devices run tests successfully and then fail later on. I'm looking at the logs on the pandas to try to find a root cause; a lot of them seem to fail at the verify_tegra_state step.
So I looked at all the pandas this afternoon, and these are the problems.

Will not come up after a relay reboot:
chassis 2: 23, 24, 27, 29, 32
chassis 4: 56, 57
chassis 5: 65

Other issues (cannot connect to dataport, not accepting buildbot jobs, etc.):
chassis 2: 22, 26, 28, 31
chassis 4: pandas 46, 49-57
chassis 5: 60, 61, 66, 67

I'll look at the second set of issues, but with the first set, where the pandas don't come up after a power cycle, I don't know what to do.
M1 is failing due to an OOM in bug 781792.
Depends on: 781792
Depends on: 805404
Depends on: 805420
Depends on: 811310
Depends on: 811428
Depends on: 811734
Depends on: 811791
Depends on: 821821
Removing block on bug 799698.
No longer blocks: 799698
Think this can be closed.
No longer blocks: android_4.0_testing
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED