Closed
Bug 895186
Opened 10 years ago
Closed 10 years ago
Run Android x86 emulator unit tests from buildbot
Categories
(Release Engineering :: General, defect, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: gbrown, Assigned: armenzg)
References
Details
(Whiteboard: [reit-x86] summary in comment 193)
Attachments
(11 files, 32 obsolete files)
4.24 KB, text/plain
1.81 KB, patch | rail: review+ | Callek: checked-in+
2.26 KB, patch | rail: review+ | Callek: checked-in+
15.43 KB, patch | mozilla: review+ | armenzg: checked-in+
37.78 KB, patch | mozilla: review+ | armenzg: checked-in+
3.15 KB, patch | Callek: review+ | armenzg: checked-in+
25.07 KB, patch | mozilla: review+ | gbrown: feedback+ | armenzg: checked-in+
2.03 KB, patch | mozilla: review+ | armenzg: checked-in+
2.14 KB, patch | mozilla: review+ | armenzg: checked-in+
3.75 KB, patch | gbrown: review+ | armenzg: checked-in+
10.92 KB, patch | mozilla: review+ | armenzg: checked-in+
Over in bug 891959, I am sorting out how we can run our unit tests in Android x86 emulators. :dminor handed off some excellent mozharness-based scripts to me; they run the tests, but I am not sure how buildbot will integrate with them, or what component will assign test jobs to emulators. This is a little preliminary, but we should probably start thinking about how all the pieces fit together. At this point I'm mostly looking for someone to work with me to plan a strategy for running these tests.
Comment 1•10 years ago
|
||
Armen, does it look likely that you'd be able to coordinate with gbrown on this endeavor, or do we want to have a mini group meeting to figure out who has time?
Flags: needinfo?(armenzg)
Assignee | ||
Comment 2•10 years ago
|
||
I can make time to follow up with gbrown. gbrown: my ical is up-to-date, could you please pick a date and time to chat?
Flags: needinfo?(armenzg)
Assignee | ||
Comment 3•10 years ago
|
||
It seems that gbrown has some mozharness scripts and configs that he's going to send me, and I can try to integrate them into my staging master with an iX box. We can't do this testing on EC2 because of OpenGL-related crashes. Run times are a bit slower (20-30% slower) than on Pandas and Tegras.
Assignee: nobody → armenzg
Blocks: 891959
Assignee | ||
Comment 4•10 years ago
|
||
Adding needinfo to keep track that I need the scripts.
Flags: needinfo?(gbrown)
Reporter | |
Updated•10 years ago
|
Flags: needinfo?(gbrown)
Reporter | |
Comment 5•10 years ago
|
||
This is the python script I have been using to run tests in an emulator. I have been executing: androidx86_emulator_unittest.py --config <file> and specifying all options in the config file.
Reporter | |
Comment 6•10 years ago
|
||
Sample config file: This runs mochitest-1 on emulator-5554.
Reporter | |
Comment 7•10 years ago
|
||
See also bug 894507 for procedure and supporting scripts for setting up the emulator environment and launching the emulators.
Reporter | |
Comment 8•10 years ago
|
||
Sample config file: This runs mochitest-1 on emulator-5554. This version also sets base_work_dir. If more than one script is running at one time, it is essential that each has a distinct base_work_dir.
Attachment #780533 -
Attachment is obsolete: true
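A minimal sketch (Python 3, not taken from the attached config) of how one config per emulator could be generated so that no two concurrently running scripts share a base_work_dir. Only the base_work_dir key comes from the comment above; the "emulator_name" key and the directory layout are hypothetical.

```python
import os


def make_configs(root, emulator_names):
    """Return one config dict per emulator, each with its own base_work_dir."""
    configs = []
    for name in emulator_names:
        configs.append({
            "emulator_name": name,  # hypothetical key, for illustration only
            "base_work_dir": os.path.join(root, name),
        })
    # Refuse to continue if two scripts would share a work directory.
    work_dirs = [config["base_work_dir"] for config in configs]
    if len(set(work_dirs)) != len(work_dirs):
        raise ValueError("base_work_dir must be distinct for each emulator")
    return configs
```

Deriving the directory from the emulator name keeps the uniqueness rule in one place instead of relying on hand-edited config files.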
Assignee | ||
Updated•10 years ago
|
Priority: -- → P3
Assignee | ||
Updated•10 years ago
|
Priority: P3 → P2
Assignee | ||
Comment 9•10 years ago
|
||
I'm currently trying to run this on a machine by applying the attached patch.
python scripts/androidx86_emulator_unittest.py --config-file configs/android/androidx86.py --installer-url http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-central-android-x86/1375204094/fennec-25.0a1.en-US.android-i386.apk --test-url http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-central-android-x86/1375204094/fennec-25.0a1.en-US.android-i386.tests.zip --robocop-url http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-central-android-x86/1375204094/robocop.apk --download-symbols ondemand
Attachment #780531 -
Attachment is obsolete: true
Attachment #780541 -
Attachment is obsolete: true
Assignee | ||
Comment 10•10 years ago
|
||
gbrown, dminor, on which machines did you run these scripts? What are their hostnames? I'm using a test machine called talos-linux64-ix-003 and adb is not installed. I'm not 100% sure that the lack of it is what is making me fail, but I assume so.
13:56:02 INFO - #####
13:56:02 INFO - ##### Running install step.
13:56:02 INFO - #####
13:56:02 INFO - Running pre-action listener: _resource_record_pre_action
13:56:02 INFO - Running main action method: install
Getting output from command: ['adb', '-s', 'emulator-5554', 'shell', 'date']
Copy/paste: adb -s emulator-5554 shell date
13:56:02 INFO - Running post-action listener: _resource_record_post_action
13:56:02 FATAL - Uncaught exception: Traceback (most recent call last):
13:56:02 FATAL - File "/home/cltbld/mozharness/mozharness/base/script.py", line 1048, in run
13:56:02 FATAL - self.run_action(action)
13:56:02 FATAL - File "/home/cltbld/mozharness/mozharness/base/script.py", line 990, in run_action
13:56:02 FATAL - self._possibly_run_method(method_name, error_if_missing=True)
13:56:02 FATAL - File "/home/cltbld/mozharness/mozharness/base/script.py", line 931, in _possibly_run_method
13:56:02 FATAL - return getattr(self, method_name)()
13:56:02 FATAL - File "scripts/androidx86_emulator_unittest.py", line 192, in install
13:56:02 FATAL - dh.install_app(self.installer_path)
13:56:02 FATAL - File "/home/cltbld/mozharness/mozharness/mozilla/testing/device.py", line 349, in install_app
13:56:02 FATAL - self.set_device_time()
13:56:02 FATAL - File "/home/cltbld/mozharness/mozharness/mozilla/testing/device.py", line 309, in set_device_time
13:56:02 FATAL - self.info(self.query_device_time())
13:56:02 FATAL - File "/home/cltbld/mozharness/mozharness/mozilla/testing/device.py", line 300, in query_device_time
13:56:02 FATAL - "shell", "date"])
13:56:02 FATAL - File "/home/cltbld/mozharness/mozharness/base/script.py", line 719, in get_output_from_command
13:56:02 FATAL - cwd=cwd, stderr=tmp_stderr, env=env)
13:56:02 FATAL - File "/usr/lib/python2.7/subprocess.py", line 679, in __init__
13:56:02 FATAL - errread, errwrite)
13:56:02 FATAL - File "/usr/lib/python2.7/subprocess.py", line 1249, in _execute_child
13:56:02 FATAL - raise child_exception
13:56:02 FATAL - OSError: [Errno 2] No such file or directory
13:56:02 FATAL - Exiting -1
13:56:02 INFO - Running post-run listener: _resource_record_post_run
[cltbld@talos-linux64-ix-003.test.releng.scl3.mozilla.com mozharness]$ source build/venv/bin/activate
(venv)[cltbld@talos-linux64-ix-003.test.releng.scl3.mozilla.com mozharness]$ adb -s emulator-5554 shell date
No command 'adb' found, did you mean:
Command 'cdb' from package 'tinycdb' (main)
Command 'gdb' from package 'gdb' (main)
Command 'dab' from package 'bsdgames' (universe)
Command 'zdb' from package 'zfs-fuse' (universe)
Command 'kdb' from package 'elektra-bin' (universe)
Command 'tdb' from package 'tads2-dev' (multiverse)
Command 'pdb' from package 'python' (main)
Command 'jdb' from package 'openjdk-6-jdk' (main)
Command 'jdb' from package 'openjdk-7-jdk' (universe)
Command 'ab' from package 'apache2-utils' (main)
Command 'ad' from package 'netatalk' (universe)
adb: command not found
Reporter | |
Comment 11•10 years ago
|
||
I have been working on talos-linux64-ix-001.test.releng.scl3.mozilla.com. We installed the Android SDK there and added $SDK/tools and $SDK/platform-tools to PATH.
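The OSError in comment 10 comes from adb not being found on PATH. A minimal sketch (Python 3) of a pre-flight check that fails with a readable message instead of a traceback; the helper names find_tool and require_tools are hypothetical, not part of mozharness.

```python
import os
import shutil


def find_tool(name, extra_dirs=()):
    """Return the full path to `name`, searching PATH plus extra_dirs, or None."""
    parts = [os.environ.get("PATH", "")] + list(extra_dirs)
    search_path = os.pathsep.join(part for part in parts if part)
    return shutil.which(name, path=search_path)


def require_tools(names, extra_dirs=()):
    """Raise a clear error naming every tool that cannot be found."""
    missing = [name for name in names if find_tool(name, extra_dirs) is None]
    if missing:
        raise RuntimeError("not found on PATH: " + ", ".join(missing))
```

Calling require_tools(["adb", "emulator"]) before the install step would report a missing SDK up front, assuming those are the only two SDK commands the script needs.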
Assignee | ||
Comment 12•10 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #11)
> I have been working on talos-linux64-ix-001.test.releng.scl3.mozilla.com. We
> installed the Android SDK there and added $SDK/tools and $SDK/platform-tools
> to PATH.
I believe the right approach is to create the emulator snapshots inside of the current Android x86 builds and upload them to ftp. Then the test machines will download the avd files and start them up. Would this approach work for you?
gbrown, could you please upload a few avd files somewhere for me? I would like to verify steps 5 & 6 from bug 894507 and add mock support.
On another note, could we meet on Tuesday? (I'm off on Monday.) I finally have time to look at this and I can now ask more intelligent questions.
I see two sides to this project:
1) generate the avd files on the build machines and upload them
2) download the avd files on the talos-linux64-ix machines and trigger the emulator jobs
Note to self: I need to enable the Android x86 builds on Cedar and Ash.
Assignee | ||
Comment 13•10 years ago
|
||
It seems that the SDK packaging might have changed since your original setup. I'm trying this:
wget http://dl.google.com/android/adt/adt-bundle-linux-x86_64-20130729.zip
unzip adt-bundle-linux-x86_64-20130729.zip
mv adt-bundle-linux-x86_64-20130729/sdk/ ~/android-sdk-linux
export PATH=$PATH:/home/cltbld/android-sdk-linux/tools:/home/cltbld/android-sdk-linux/platform-tools
cd ~/mozharness
python scripts/androidx86_emulator_unittest.py --config-file configs/android/androidx86.py --installer-url http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-central-android-x86/1375204094/fennec-25.0a1.en-US.android-i386.apk --test-url http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-central-android-x86/1375204094/fennec-25.0a1.en-US.android-i386.tests.zip --robocop-url http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-central-android-x86/1375204094/robocop.apk --download-symbols ondemand
Which of these avd files in cltbld@talos-linux64-ix-001:~/gbrown are good for me to try?
junk2-avd.tgz
test-avd-8.tgz
test-avds-4.tgz
test-avds-5.tgz
test-avds-6.tgz
test-avds.tgz
Reporter | |
Comment 14•10 years ago
|
||
Use this one: http://people.mozilla.org/~gbrown/test-avds.tgz Let's chat on Tuesday about the rest.
Reporter | |
Comment 15•10 years ago
|
||
Similar to your patch, here are the changes I have been running with lately:
- enable xpcshell tests
- use hostutils.zip instead of xre.zip
- simplify
Attachment #786149 -
Flags: review?(armenzg)
Assignee | ||
Comment 16•10 years ago
|
||
Comment on attachment 786149 [details] [diff] [review]
misc x86 emulator changes
Review of attachment 786149 [details] [diff] [review]:
-----------------------------------------------------------------
Changing to feedback+ to prevent landing until we're ready. Would holding off on the check-in get in the way for you? I can ask aki for a review once I'm comfortable that things are working all the way.
Attachment #786149 -
Flags: review?(armenzg) → feedback+
Reporter | |
Comment 17•10 years ago
|
||
(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) (EDT/UTC-4) from comment #16) > Would avoid the check-in get on the way for you? That's fine. I thought it would be easier to "share" with you if I checked in -- whatever works for you is fine.
Assignee | ||
Comment 18•10 years ago
|
||
This patch brings our patches a little closer to each other but not quite. After I get things working, I will paste final patches.
Attachment #783351 -
Attachment is obsolete: true
Attachment #786149 -
Attachment is obsolete: true
Assignee | ||
Comment 19•10 years ago
|
||
Should this have worked?
mkdir ~/.android/avd
cd ~/.android/avd
wget http://people.mozilla.org/~gbrown/test-avds.tgz
tar zxvf test-avds.tgz
wget -O launch.py https://bugzilla.mozilla.org/attachment.cgi?id=782839
$ python launch.py
emulator: ERROR: This AVD's configuration is missing a kernel file!!
Reporter | |
Comment 20•10 years ago
|
||
That's the right idea, but something is going wrong. I think the issue is that the avd definitions contain pointers to image files in the Android SDK. Probably the culprit is: /home/cltbld/android-sdk-linux/system-images/android-17/x86//kernel-qemu
Assignee | ||
Comment 21•10 years ago
|
||
FYI, I might be using a newer SDK. This is the version that I downloaded:
wget http://dl.google.com/android/adt/adt-bundle-linux-x86_64-20130729.zip
I can try to find an older SDK to match what you used.
[cltbld@talos-linux64-ix-003.test.releng.scl3.mozilla.com ~]$ ls -l /home/cltbld/android-sdk-linux/system-images/android-18/
total 4
drwxrwx--- 2 cltbld cltbld 4096 Jul 10 19:05 armeabi-v7a
[cltbld@talos-linux64-ix-003.test.releng.scl3.mozilla.com ~]$ ls -l /home/cltbld/android-sdk-linux/system-images
total 4
drwxr-x--- 3 cltbld cltbld 4096 Jul 29 15:23 android-18
Assignee | ||
Comment 22•10 years ago
|
||
I got a little further. Suggestions? [cltbld@talos-linux64-ix-003.test.releng.scl3.mozilla.com mozharness]$ python launch.py SDL init failure, reason is: No available video device
Assignee | ||
Comment 23•10 years ago
|
||
Setting the DISPLAY value helps.
> After a few minutes, launch.py will start 4 emulator instances and print the
> names and ports associated with each:
Is there a way to make it take less time, or to know that it has not hung?
Assignee | ||
Comment 24•10 years ago
|
||
export PATH=$PATH:/home/cltbld/android-sdk-linux/tools:/home/cltbld/android-sdk-linux/platform-tools
export DISPLAY=:0.0
cd ~/mozharness
python launch.py (modified some printing)
Should I worry about those WARNINGs? Do we need the sleeptime?
[cltbld@talos-linux64-ix-003.test.releng.scl3.mozilla.com mozharness]$ python launch.py
Launching emulator #0
Attemp #1 of SUT redirection
test-x86-1: 5554; sut port:20701/20700
Sleeping 60
WARNING: Data partition already in use. Changes will not persist!
WARNING: SD Card image already in use: /home/cltbld/.android/avd/test-x86-1.avd/sdcard.img
WARNING: Cache partition already in use. Changes will not persist!
Launching emulator #1
Attemp #1 of SUT redirection
Attemp #2 of SUT redirection
Attemp #3 of SUT redirection
^@Attemp #4 of SUT redirection
^@^@Attemp #5 of SUT redirection
^@^@^@^@^@^@^@^@^@^@Traceback (most recent call last):
File "launch.py", line 45, in <module>
proc = launchEmulatorByIndex(i)
File "launch.py", line 37, in launchEmulatorByIndex
redirectSUT(emuport, sutport1, sutport2)
File "launch.py", line 23, in redirectSUT
tn.read_until('OK')
UnboundLocalError: local variable 'tn' referenced before assignment
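The UnboundLocalError above suggests that every connection attempt failed and the code then used a connection object that was never assigned. A minimal sketch (Python 3, plain sockets rather than the telnet object used in launch.py) of a retry loop that either returns a connection or raises a descriptive error; the function name and defaults are hypothetical.

```python
import socket
import time


def connect_with_retries(host, port, attempts=5, delay=1.0):
    """Return a connected socket, or raise RuntimeError after all attempts fail."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return socket.create_connection((host, port), timeout=10)
        except OSError as err:
            # Connection refused and timeouts are both OSError subclasses.
            last_error = err
            print("Attempt #%d failed: %s" % (attempt, err))
            if attempt < attempts:
                time.sleep(delay)
    raise RuntimeError("could not connect to %s:%d after %d attempts: %s"
                       % (host, port, attempts, last_error))
```

Because the function only ever returns a live connection, the caller cannot reach the read step with an unassigned variable.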
Assignee | ||
Comment 25•10 years ago
|
||
How can I remove devices?
[cltbld@talos-linux64-ix-003.test.releng.scl3.mozilla.com mozharness]$ adb devices
List of devices attached
emulator-5554 device
Reporter | |
Comment 26•10 years ago
|
||
> Should I worry about those WARNINGs?
Yes - they indicate that more than one emulator is running against the same image file...that should not be the case.
> Do we need the sleeptime?
Yes, I think so. I found that without those sleeps, there were intermittent failures to launch an emulator.
> How can I remove devices?
Kill the emulator:
ps -ef | grep emu
kill ...
Assignee | ||
Comment 27•10 years ago
|
||
I've removed all devices yet I get this:
[cltbld@talos-linux64-ix-003.test.releng.scl3.mozilla.com mozharness]$ ps -ef | grep emu
cltbld 3234 2920 0 14:48 pts/3 00:00:00 grep emu
[cltbld@talos-linux64-ix-003.test.releng.scl3.mozilla.com mozharness]$ python launch.py
Launching emulator #0
Attemp #1 of SUT redirection
Socket Error [Errno 111] Connection refused
Attemp #2 of SUT redirection
Socket Error [Errno 111] Connection refused
Attemp #3 of SUT redirection
Socket Error [Errno 111] Connection refused
^Z
[1]+ Stopped python launch.py
[cltbld@talos-linux64-ix-003.test.releng.scl3.mozilla.com mozharness]$ ps -ef | grep emu
cltbld 3236 3235 1 14:48 pts/3 00:00:03 /home/cltbld/android-sdk-linux/tools/emulator64-x86 -avd test-x86-1 -port 5554
cltbld 3250 2920 0 14:53 pts/3 00:00:00 grep emu
Reporter | |
Comment 28•10 years ago
|
||
You can debug the telnet connection problem manually: Kill all emulators and verify adb devices shows nothing running. Then:
$ emulator -avd test-x86-1 &
$ adb devices
List of devices attached
emulator-5554 device
$ telnet localhost 5554
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Android Console: type 'help' for a list of commands
OK
redir add tcp:20701:20701
OK
redir add tcp:20700:20700
OK
quit
Connection closed by foreign host.
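The manual session above can be scripted. This is a minimal sketch (Python 3, plain sockets), assuming the console needs no authentication and answers each accepted command with a line reading OK, as in the transcript. That errors arrive as lines starting with KO is an assumption not shown in this bug, and the function names are hypothetical.

```python
import socket


def redir_commands(ports):
    """Build console commands that forward each port to the same host port."""
    return ["redir add tcp:%d:%d" % (port, port) for port in ports]


def _read_until_ok(reader):
    """Collect console lines up to the next 'OK' line."""
    lines = []
    while True:
        line = reader.readline()
        if line == "":
            raise ConnectionError("console closed the connection before OK")
        if line.strip() == "OK":
            return lines
        if line.startswith("KO"):  # assumption: console error replies start with KO
            raise RuntimeError("console rejected command: " + line.strip())
        lines.append(line.rstrip())


def send_console_commands(host, port, commands, timeout=10):
    """Send commands one at a time, waiting for OK after each, then quit."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        with conn.makefile("r", encoding="ascii", errors="replace") as reader:
            _read_until_ok(reader)  # skip the greeting banner
            for command in commands:
                conn.sendall((command + "\n").encode("ascii"))
                _read_until_ok(reader)
            conn.sendall(b"quit\n")
```

For the first emulator this would be called as send_console_commands("localhost", 5554, redir_commands([20701, 20700])).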
Reporter | |
Comment 29•10 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #20)
> I think the issue is that the avd definitions contain pointers to image
> files in the Android SDK. Probably the culprit is:
>
> /home/cltbld/android-sdk-linux/system-images/android-17/x86//kernel-qemu
I created a new tar of avd images that includes kernel-qemu, along with the system.img and ramdisk.img: http://people.mozilla.org/~gbrown/test-avds-aug6.tgz
With these images, you do not need the system images in the SDK. But, you need to launch the emulator slightly differently (specify paths to the kernel, etc on the emulator command line). I will update the launch.py on bug 894507.
Assignee | ||
Comment 30•10 years ago
|
||
Hi gbrown, I believe there's something at times not working properly on our machines. For some reason, I sometimes have trouble starting emulator #1. I have noticed that if I run into it, VNC becomes unresponsive. I've also noticed that compiz starts running at 100% CPU and things only get back to normal after I kill it. After I killed it, I managed to start all emulators. What can I do if I run into it again? What can I use to debug it? Are there any log messages that I can check?
These are the steps that I followed:
cd ~/.android/avd
rm -rf *
wget http://people.mozilla.org/~gbrown/test-avds-aug6.tgz
tar zxvf test-avds-aug6.tgz [1]
rm test-avds-aug6.tgz
cd
export DISPLAY=:0.0
export PATH=$PATH:/home/cltbld/android-sdk-linux/tools:/home/cltbld/android-sdk-linux/platform-tools
emulator -avd test-x86-1 &
emulator -avd test-x86-2 &
emulator -avd test-x86-3 &
emulator -avd test-x86-4 &
[cltbld@talos-linux64-ix-003.test.releng.scl3.mozilla.com ~]$ ps -ef | grep emu
cltbld 3323 2917 20 09:16 pts/3 00:01:42 /home/cltbld/android-sdk-linux/tools/emulator64-x86 -avd test-x86-2
cltbld 3366 2917 24 09:18 pts/3 00:01:37 /home/cltbld/android-sdk-linux/tools/emulator64-x86 -avd test-x86-3
cltbld 3388 2917 23 09:18 pts/3 00:01:35 /home/cltbld/android-sdk-linux/tools/emulator64-x86 -avd test-x86-4
cltbld 3442 2917 34 09:21 pts/3 00:01:15 /home/cltbld/android-sdk-linux/tools/emulator64-x86 -avd test-x86-1
[cltbld@talos-linux64-ix-003.test.releng.scl3.mozilla.com ~]$ adb devices
List of devices attached
emulator-5554 device
emulator-5556 device
emulator-5558 device
emulator-5560 device
[1]
[cltbld@talos-linux64-ix-003.test.releng.scl3.mozilla.com temp]$ ls -l ~/.android/avd/
total 283028
-rw-r--r-- 1 cltbld cltbld 2825664 Aug 6 15:19 kernel-qemu
-rw-r--r-- 1 cltbld cltbld 270168 Aug 6 15:19 ramdisk.img
-rw-r--r-- 1 cltbld cltbld 286691328 Aug 6 15:00 system.img
drwxr-xr-x 2 cltbld cltbld 4096 Aug 6 15:24 test-x86-1.avd
-rw-r--r-- 1 cltbld cltbld 120 Aug 6 15:24 test-x86-1.ini
drwxr-xr-x 2 cltbld cltbld 4096 Aug 6 15:27 test-x86-2.avd
-rw-r--r-- 1 cltbld cltbld 120 Aug 6 15:27 test-x86-2.ini
drwxr-xr-x 2 cltbld cltbld 4096 Aug 6 15:27 test-x86-3.avd
-rw-r--r-- 1 cltbld cltbld 120 Aug 6 15:27 test-x86-3.ini
drwxr-xr-x 2 cltbld cltbld 4096 Aug 6 15:27 test-x86-4.avd
-rw-r--r-- 1 cltbld cltbld 120 Aug 6 15:27 test-x86-4.ini
Reporter | |
Comment 31•10 years ago
|
||
I saw those exact symptoms when I started using multiple emulators. The only way I could find to avoid it was to stagger the launches: sleep after launching each emulator. That's why there are those long sleeps in launch.py.
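A minimal sketch (Python 3) of the staggered launch described above. It is not the code from launch.py; the function name is hypothetical, and the sleep and process-creation functions are passed in only so the logic can be exercised without real emulators. The 60 second delay matches the "Sleeping 60" line in comment 24.

```python
import subprocess
import time


def launch_staggered(commands, delay_seconds=60,
                     sleep=time.sleep, popen=subprocess.Popen):
    """Start each command in turn, pausing between consecutive launches."""
    procs = []
    for index, command in enumerate(commands):
        print("Launching emulator #%d" % index)
        procs.append(popen(command))
        # No need to wait after the last emulator has been started.
        if index < len(commands) - 1:
            print("Sleeping %d" % delay_seconds)
            sleep(delay_seconds)
    return procs
```

Returning the process objects lets the caller check later whether each emulator is still alive.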
Reporter | |
Comment 32•10 years ago
|
||
I don't know of a good way to debug/diagnose this problem.
Assignee | ||
Comment 33•10 years ago
|
||
What are the http and ssl ports supposed to be?
Reporter | |
Comment 34•10 years ago
|
||
I have been using: "http_port": "8888", "ssl_port": "4445", for the first emulator, and incrementing for each subsequent: 8889/4446 for the second, etc. dminor -- is that right?
Flags: needinfo?(dminor)
Comment 35•10 years ago
|
||
I've been doing the same thing as Geoff. These are the ports set up by the test webserver, so any values that don't collide with something else running on the test system are fine.
Flags: needinfo?(dminor)
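The port scheme from comments 34 and 35 can be written down as a small helper. A minimal sketch (Python 3): the http and ssl ports increase by one per emulator as described, and the console port increases by two, matching the emulator-5554 to emulator-5560 names in comment 30. The function and parameter names are hypothetical.

```python
def ports_for_emulator(index, min_http_port=8888, min_ssl_port=4445,
                       min_emu_port=5554):
    """Return the ports used by the emulator at the given zero-based index."""
    return {
        "http_port": min_http_port + index,    # 8888, 8889, ...
        "ssl_port": min_ssl_port + index,      # 4445, 4446, ...
        "emu_port": min_emu_port + 2 * index,  # 5554, 5556, ...
    }
```

Any starting values work as long as the resulting ports do not collide with something else running on the test machine.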
Assignee | ||
Comment 36•10 years ago
|
||
I've added an action called start-emulators. I hope tomorrow to run the mochitest suite to completion. gbrown, dminor: what should we do once the tests pass? Do we shut the emulators off with the "kill" command from within the telnet connection? Or should we reboot directly? Should I check if there are any emulators running at the beginning so I can shut them off before creating the new ones?
This was run like this:
python scripts/androidx86_emulator_unittest.py --config-file configs/android/androidx86.py --installer-url http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-central-android-x86/1375204094/fennec-25.0a1.en-US.android-i386.apk --test-url http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-central-android-x86/1375204094/fennec-25.0a1.en-US.android-i386.tests.zip --robocop-url http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-central-android-x86/1375204094/robocop.apk --download-symbols ondemand --test-suite mochitest
Attachment #786437 -
Attachment is obsolete: true
Assignee | ||
Comment 37•10 years ago
|
||
I'm trying to run mochitests manually but it is failing to run /system/bin/logcat -c.
[cltbld@talos-linux64-ix-003.test.releng.scl3.mozilla.com mozharness]$ telnet localhost 20701
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
$>/system/bin/logcat -c
##AGENT-WARNING## [/system/bin/logcat] command with arg(s) = [-c] is currently not implemented.
$>ver
SUTAgentAndroid Version 1.18
$>
export DISPLAY=:0.0
export PATH=$PATH:/home/cltbld/android-sdk-linux/tools:/home/cltbld/android-sdk-linux/platform-tools
emulator -avd test-x86-1 &
telnet localhost 5554
redir add tcp:20701:20701
redir add tcp:20700:20700
quit
/home/cltbld/mozharness/build/venv/bin/python /home/cltbld/mozharness/build/tests/mochitest/runtestsremote.py --autorun --close-when-done --dm_trans=sut --console-level INFO --app org.mozilla.fennec --remote-webserver 10.0.2.2 --run-only-tests androidx86.json --xre-path /home/cltbld/mozharness/build/hostutils/xre --utility-path /home/cltbld/mozharness/build/hostutils/bin --deviceIP 127.0.0.1 --devicePort 20701 --http-port 8888 --ssl-port 4445 --httpd-path /home/cltbld/mozharness/build/tests/mochitest --total-chunks 8 --this-chunk 1 --symbols-path crashreporter-symbols.zip
Device info: {'uptime': ['0 days 0 hours 6 minutes 41 seconds 181 ms'], 'sutuserinfo': ['User Serial:0'], 'power': ['Power status:', ' AC power ONLINE', ' Battery charge CHARGING', ' Remaining charge: 50%', ' Battery Temperature: 0.0 (c)'], 'process': [['10037', '1591', 'com.android.exchange'], ['10047', '1689', 'com.mozilla.SUTAgentAndroid'], ['10038', '1605', 'com.android.providers.calendar'], ['10027', '1646', 'com.android.calendar'], ['10033', '1265', 'com.android.systemui'], ['10018', '1362', 'com.android.inputmethod.latin'], ['10005', '1433', 'com.android.location.fused'], ['1000', '1171', 'system'], ['10022', '1570', 'com.android.deskclock'], ['10002', '1411', 'com.android.launcher'], ['10024', '1290', 'android.process.media'], ['1000', '1458', 'com.android.settings'], ['1001', '1390', 'com.android.phone'], ['10046', '1627', 'com.mozilla.watcher'], ['10015', '1535', 'com.android.mms'], ['10010', '1328', 'android.process.acore'], ['10010', '1501', 'com.android.contacts'], ['10025', '1485', 'com.android.music']], 'screen': ['X:1024 Y:720'], 'memory': ['PA:799289344, FREE: 665034752'], 'systime': ['2013/08/09 02:18:31:247'], 'rotation': ['ROTATION:0'], 'disk': ['/data: 610140160 total, 562380800 available', '/system: 277610496 total, 0 available', '/mnt/sdcard: 522225664 total, 518141952 available'], 'os': ['sdk_x86-eng 4.2 JOP40C eng.android-build.20121231.103448 test-keys'], 'id': ['52:54:00:12:34:56'], 'uptimemillis': ['401207'], 'temperature': ['Temperature: unknown']}
Test root: /mnt/sdcard/tests
Automation Error: Exception caught while running tests
Traceback (most recent call last):
File "/home/cltbld/mozharness/build/tests/mochitest/runtestsremote.py", line 688, in main
dm.recordLogcat()
File "/home/cltbld/mozharness/build/tests/mochitest/devicemanager.py", line 125, in recordLogcat
self.shellCheckOutput(['/system/bin/logcat', '-c'], root=self._logcatNeedsRoot)
File "/home/cltbld/mozharness/build/tests/mochitest/devicemanager.py", line 375, in shellCheckOutput
raise DMError("Non-zero return code for command: %s (output: '%s', retval: '%s')" % (cmd, output, retval))
DMError: Non-zero return code for command: ['/system/bin/logcat', '-c'] (output: 'su: uid 10047 not allowed to su', retval: '1')
Traceback (most recent call last):
File "/home/cltbld/mozharness/build/tests/mochitest/runtestsremote.py", line 707, in <module>
main()
File "/home/cltbld/mozharness/build/tests/mochitest/runtestsremote.py", line 693, in main
mochitest.stopWebServer(options)
File "/home/cltbld/mozharness/build/tests/mochitest/runtestsremote.py", line 332, in stopWebServer
self.server.stop()
AttributeError: 'MochiRemote' object has no attribute 'server'
Reporter | |
Comment 38•10 years ago
|
||
The important bit there is: su: uid 10047 not allowed to su It looks like su is not installed. Are you using my avd definitions, or your own?
Assignee | ||
Comment 39•10 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #38)
> The important bit there is:
>
> su: uid 10047 not allowed to su
>
> It looks like su is not installed. Are you using my avd definitions, or your
> own?
I'm using this: http://people.mozilla.org/~gbrown/test-avds-aug6.tgz
It comes inside of it, no?
Reporter | |
Comment 40•10 years ago
|
||
Yes, su should be in system.img, in that tar. It works for me. Compare:
mozdev@ubuntu:~/.android/avd$ emulator64-x86 -avd test-x86-2 -kernel kernel-qemu -system system.img -ramdisk ramdisk.img &
[1] 14552
mozdev@ubuntu:~/.android/avd$ adb shell ls -l /system/xbin/su
-rwsr-sr-x root root 17748 2013-08-05 03:14 su
mozdev@ubuntu:~/.android/avd$ ls -l system.img
-rw-r--r-- 1 mozdev mozdev 286691328 Aug 6 15:00 system.img
mozdev@ubuntu:~/.android/avd$ adb shell su -c id
uid=0(root) gid=0(root)
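The last check above (adb shell su -c id) can be automated before a test run. A minimal sketch (Python 3): it inspects the command output rather than the exit status, since the output is what this bug shows for both the working and failing cases. The function names are hypothetical, and the run parameter exists only so the logic can be exercised without a device.

```python
import subprocess


def looks_like_root(id_output):
    """True if `id` output reports uid 0, e.g. 'uid=0(root) gid=0(root)'."""
    return id_output.strip().startswith("uid=0(")


def su_works(serial, run=subprocess.check_output):
    """Run `su -c id` on the given emulator and check that it yields root."""
    output = run(["adb", "-s", serial, "shell", "su", "-c", "id"])
    if isinstance(output, bytes):
        output = output.decode("utf-8", "replace")
    return looks_like_root(output)
```

A failing image such as the one in comment 37 would print "su: uid 10047 not allowed to su", which this check reports as False.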
Comment 41•10 years ago
|
||
(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) (EDT/UTC-4) from comment #37)
> I'm trying to run mochitests manually but it is failing to run
> /system/bin/logcat -c.
...
> $>ver
> SUTAgentAndroid Version 1.18
> $>
As an FYI, SUTAgent 1.18 had issues in production and was backed out pending ATeam having cycles to look into it. We're currently using 1.17. No idea if that is affecting things here.
Reporter | |
Comment 42•10 years ago
|
||
(I am pretty comfortable with sut 1.18, at least on the emulator, but we should keep in mind that that is not the current version used elsewhere.) I removed the system-images from my Android SDK installation (android-sdk-linux). If your machine has a system image in the SDK, your emulator might be picking that up.
Assignee | ||
Comment 43•10 years ago
|
||
Attachment #788270 -
Flags: review?(bugspam.Callek)
Assignee | ||
Comment 44•10 years ago
|
||
I'm not happy with the for loop that I do but I did not feel like duplicating all of the dictionaries. This adds the x86 builds and x86 emulator tests to Ash. That branch is special because it allows us to push to http://hg.mozilla.org/users/asasaki_mozilla.com/ash-mozharness to test mozharness patches. aki, what do you think about the for loop? Would you take it? (No need to review the rest of the patch)
Attachment #788293 -
Flags: feedback?(aki)
Comment 45•10 years ago
|
||
Comment on attachment 788293 [details] [diff] [review] [wip] x86_bc.diff It's fine as long as we don't change the structure of the dicts... less ugly than some of the for loops we have :P I think I was used to this being a sql table, so the sql can be programmatic but you get a readable exploded version. I think we were investigating configconfig which would have a similar type of thing: a .py file to generate a verbose dict in some format (json? yaml?) We keep hitting this write-optimize vs read-optimize problem, so that might be a longer term solution.
Attachment #788293 -
Flags: feedback?(aki) → feedback+
Assignee | ||
Comment 46•10 years ago
|
||
gbrown, dminor: could we use this mozharness repo for now? That way we won't be based off different places. http://hg.mozilla.org/users/armenzg_mozilla.com/mozharness I will be pushing my changes there. I have this running in our staging environment. As soon as we get to a half-decent state I will ask for reviews and enable it on Ash. Which test suites have been successful so far?
Assignee | ||
Comment 47•10 years ago
|
||
Which test suites are we expected to run? The same ones as on Panda Android?
Reporter | |
Comment 48•10 years ago
|
||
The full set of (Android) unit tests, like Panda, but including reftests and xpcshell tests: M1 M2 M3 M4 M5 M6 M7 M8 M-gl rc1 rc2 C J1 J2 J3 R1 R2 R3 R4 X AFAIK, we are not (yet) attempting Talos.
Comment 49•10 years ago
|
||
Comment on attachment 788270 [details] [diff] [review] x86_tools.diff Review of attachment 788270 [details] [diff] [review]: ----------------------------------------------------------------- reluctant r+ here. So you're not tying these builds to panda masters, and the bm19/20/22/10 set of masters are all going to be going away (they are on kvm) as well as I would prefer to not load them up with a new job type until after we get them to the new masters and balanced via slavealloc. That said, the aws masters you're marking are not yet officially in use, so I don't know if we need to do anything there first. And would be interested in which masters you actually plan to attach these jobs to and how. Finally, I trust you to figure this all out and the changes are technically sound so r+
Attachment #788270 -
Flags: review?(bugspam.Callek) → review+
Updated•10 years ago
|
Product: mozilla.org → Release Engineering
Reporter | |
Comment 51•10 years ago
|
||
Comment on attachment 787765 [details] [diff] [review]
[wip] integrated launch.py into mozharness
Review of attachment 787765 [details] [diff] [review]:
-----------------------------------------------------------------
::: configs/android/androidx86.py
@@ +12,5 @@
> + "min_http_port": "8888", # starting http port to use for the mochitest server
> + "min_ssl_port": "4445", # starting ssl port to use for the server
> + "min_emu_port": 5554,
> + "min_sut_port1": 20701,
> + "min_sut_port2" : 20700, # XXX: why do we have two ports?
sutagent supports 2 ports: 20700 and 20701 by default. 20700 does not prompt for commands; 20701 does. As far as I know, our test automation only uses 20701, but I think it best to redirect both ports in case there is some usage I am not aware of, and to allow for future use.
::: scripts/androidx86_emulator_unittest.py
@@ +38,4 @@
> "default": "browser",
> "help": "The type of tests to run",
> }],
> + [["--robocop-url"],
We can do without a robocop-specific option: See _download_robocop_apk() in android_panda.py.
@@ +226,5 @@
> + self.info("Sleeping %d" % sleeptime)
> + time.sleep(sleeptime)
> + # XXX: what is this for?
> + #for proc in procs:
> + # proc.wait()
launch.py had this just so that the launch script would wait for all of the emulators. This allowed you to Ctrl-C out of launch.py and kill all of the emulators at the same time. It's safe to remove from your mozharness patch.
Assignee | ||
Comment 52•10 years ago
|
||
Thanks for pointing that out. I believe this should meet your expectations.
Attachment #788270 -
Attachment is obsolete: true
Attachment #790394 -
Flags: review?(bugspam.Callek)
Assignee | ||
Updated•10 years ago
|
Summary: Determine how to run Android x86 emulator unit tests from buildbot → Run Android x86 emulator unit tests from buildbot
Updated•10 years ago
|
Whiteboard: [reit-x86]
Assignee | ||
Comment 53•10 years ago
|
||
Attachment #787765 -
Attachment is obsolete: true
Assignee | ||
Comment 54•10 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #31) > I saw those exact symptoms when I started using multiple emulators. The only > way I could find to avoid it was to stagger the launches: sleep after > launching each emulator. That's why there are those long sleep's in > launch.py. I don't know what to do. We have staggered start ups yet we get into this bad state. We have to find a way to determine why they would not start. This is a blocker for me.
Assignee | ||
Comment 55•10 years ago
|
||
I'm attaching the log (note this is not ready yet). I will be adding a way to stop the whole thing if all 4 emulators are not ready. I will also have to figure out how to deal with determining the robocop from the read-buildbot-configs step.
Reporter | |
Comment 56•10 years ago
|
||
That log is puzzling to me. It looks like you are launching the emulators the same way that I do, but I never have a problem like that when the launches are staggered. Do you know if the 2nd, 3rd, and 4th emulator processes are still running when those telnet connections fail? The emulator does not write a log, as far as I can tell. However, it usually writes some messages to stdout and/or stderr. You could try collecting and reporting those. Also, you can get more of those messages by specifying "-debug all" on the emulator command line.
Reporter | |
Comment 57•10 years ago
|
||
Comment on attachment 791003 [details] [diff] [review] [wip] androidx86.mozharness.diff Review of attachment 791003 [details] [diff] [review]: ----------------------------------------------------------------- ::: scripts/androidx86_emulator_unittest.py @@ +239,5 @@ > + Let's make sure that every emulator has been stopped > + ''' > + for p in self.procs: > + if p.poll() is None: > + p.kill() I think this will work fine, but if you want another option: The emulator accepts a 'kill' command over telnet -- just like the 'redir' command used in _redirectSUT.
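The cleanup loop quoted above can be tried in isolation. This is a minimal sketch, assuming the emulators were started with subprocess.Popen; `stop_all` and `spawn_sleeper` are hypothetical names, and the child processes are stand-ins rather than real emulators.

```python
import subprocess
import sys


def stop_all(procs):
    """Kill every process that is still running; return how many were killed."""
    killed = 0
    for p in procs:
        # poll() returns None while the child is still alive.
        if p.poll() is None:
            p.kill()
            p.wait()  # reap the child so it does not linger as a zombie
            killed += 1
    return killed


def spawn_sleeper(seconds):
    """Stand-in for an emulator: a child Python process that just sleeps."""
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(%d)" % seconds]
    )
```

Calling `stop_all` a second time on the same list is harmless, since processes that have already exited are skipped.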
Assignee | ||
Comment 58•10 years ago
|
||
* adding -debug all
* moved emulators parameters into the configs (the http port and ssl port need adjustment)
* removed launch processed by index
* removed "minimum" port concepts
* using self.fatal() when an emulator fails to be connected to
* added _post_fatal() function to kill the emulators (TODO)
* start as many emulators as specified in config["emulators"] instead of xrange(0, 4)
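Driving the launch from config["emulators"] could look roughly like the sketch below. The key name "emulator_port", the helper name `build_emulator_command`, and the AVD directory are assumptions for illustration; the emulator flags themselves are the ones that appear in the logs on this bug.

```python
import posixpath


def build_emulator_command(emulator, avd_dir, debug=True):
    """Build the argv list for one emulator entry from the config.

    `emulator` is a dict with "name" and "emulator_port" (assumed key names).
    """
    cmd = ["emulator", "-avd", emulator["name"]]
    if debug:
        cmd += ["-debug", "all"]
    cmd += ["-port", str(emulator["emulator_port"])]
    # Kernel, system and ramdisk images are shared by all AVDs in avd_dir.
    cmd += ["-kernel", posixpath.join(avd_dir, "kernel-qemu")]
    cmd += ["-system", posixpath.join(avd_dir, "system.img")]
    cmd += ["-ramdisk", posixpath.join(avd_dir, "ramdisk.img")]
    return cmd


# Hypothetical config fragment: one entry per emulator to start.
config = {
    "emulators": [
        {"name": "test-x86-1", "emulator_port": 5554},
        {"name": "test-x86-2", "emulator_port": 5556},
    ]
}
commands = [
    build_emulator_command(e, "/home/cltbld/.android/avd")
    for e in config["emulators"]
]
```

Iterating over the config list means the number of emulators is changed in one place instead of in a hard-coded range.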
Assignee | ||
Comment 59•10 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #56) > That log is puzzling to me. It looks like you are launching the emulators > the same way that I do, but I never have a problem like that when the > launches are staggered. > > Do you know if the 2nd, 3rd, and 4th emulators processes are still running > when those telnet connections fail? > > The emulator does not write a log, as far as I can tell. However, it usually > writes some messages to stdout and/or stderr. You could try collecting and > reporting those. Also, you can more of those messages by specifying "-debug > all" on the emulator command line. I think I was trying the scripts locally first, did not reboot and triggered a job with buildbot. This probably means that I had an emulator instance running. _post_fatal() works very well. I will post a new patch by the end of Monday.
Assignee | ||
Comment 60•10 years ago
|
||
FTR, I'm on duty this week and I might find it impossible to keep doing any development until Monday. It does not yet trigger unit test jobs on all four emulators. My latest work is in the attachment.
Assignee | ||
Updated•10 years ago
|
Attachment #791003 -
Attachment is obsolete: true
Assignee | ||
Comment 61•10 years ago
|
||
gbrown, I'm having trouble with staggered start up. The first one starts well. The second one does not. I've rebooted the host and will try again.
Are 20701 and 20700 the default sut ports? Should I be redirecting the ports of the first emulator somewhere else? It seems that we don't redirect the default sut ports for the first emulator.
13:27:54 INFO - Trying to start the emulator with this command: emulator -avd test-x86-1 -debug all -port 5554 -kernel /home/cltbld/.android/avd/kernel-qemu -system /home/cltbld/.android/avd/system.img -ramdisk /home/cltbld/.android/avd/ramdisk.img
13:27:54 INFO - Sleeping 10 seconds
13:28:04 INFO - Attempt #1 to redirect ports: (5554, 20701, 20700)
13:28:04 INFO - test-x86-1: 5554; sut port: 20701/20700
13:28:04 INFO - Emulators staggered start up. Sleeping 60 secs.
13:29:04 INFO - Trying to start the emulator with this command: emulator -avd test-x86-2 -debug all -port 5556 -kernel /home/cltbld/.android/avd/kernel-qemu -system /home/cltbld/.android/avd/system.img -ramdisk /home/cltbld/.android/avd/ramdisk.img
13:29:04 INFO - Sleeping 10 seconds
13:29:14 INFO - Attempt #1 to redirect ports: (5556, 20703, 20702)
13:29:14 INFO - Trying again after exception: [Errno 111] Connection refused
13:29:14 INFO - Sleeping 30 seconds
13:29:44 INFO - Attempt #2 to redirect ports: (5556, 20703, 20702)
13:29:44 INFO - Trying again after exception: [Errno 111] Connection refused
13:29:44 INFO - Sleeping 30 seconds
13:30:14 INFO - Attempt #3 to redirect ports: (5556, 20703, 20702)
13:30:14 INFO - Trying again after exception: [Errno 111] Connection refused
13:30:14 INFO - Sleeping 30 seconds
13:30:44 INFO - Attempt #4 to redirect ports: (5556, 20703, 20702)
13:30:44 INFO - Trying again after exception: [Errno 111] Connection refused
13:30:44 INFO - Sleeping 30 seconds
13:31:14 INFO - Attempt #5 to redirect ports: (5556, 20703, 20702)
13:31:14 INFO - Trying again after exception: [Errno 111] Connection refused
13:31:14 FATAL - We have not been able to establish a telnetconnection with the emulator
13:31:14 FATAL - Running post_fatal callback...
13:31:14 INFO - Let's kill every process called emulator-x86
13:31:14 INFO - Killing pid 2917.
13:31:14 INFO - Killing pid 2951.
13:31:14 INFO - Copying logs to upload dir...
13:31:14 INFO - mkdir: /builds/slave/talos-slave/test/build/upload/logs
13:31:14 FATAL - Exiting -1
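The retry pattern visible in this log (five attempts, 30 seconds apart, then a fatal error) can be sketched as follows. This is not the mozharness code; `connect_with_retries` is a hypothetical helper, and `connect` stands for whatever callable opens the telnet connection to the emulator console.

```python
import socket
import time


def connect_with_retries(connect, attempts=5, delay=30,
                         sleep=time.sleep, log=print):
    """Call connect() until it succeeds or the attempts run out.

    Returns connect()'s result, or re-raises the last socket error.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        log("Attempt #%d" % attempt)
        try:
            return connect()
        except socket.error as e:  # includes "Connection refused"
            last_error = e
            log("Trying again after exception: %s" % e)
            if attempt < attempts:
                sleep(delay)  # no point sleeping after the final attempt
    raise last_error
```

Passing `sleep` and `log` as parameters keeps the helper easy to exercise without waiting for real timeouts.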
Assignee | ||
Comment 62•10 years ago
|
||
Killing compiz helped again.
Assignee | ||
Comment 63•10 years ago
|
||
I've created a new version of launch2.py to help me speed up the mozharness development. It pretty much matches what I already have in mozharness. This way I can focus only on running the tests once launch2.py runs well.
Assignee | ||
Comment 64•10 years ago
|
||
Assignee | ||
Comment 65•10 years ago
|
||
To make use of this script and get to where I am, you would have to do this:
- in one session, run launch2.py after setting PATH and DISPLAY -- two emulators should start
- in another session, set PATH and DISPLAY as well -- this will not be needed once I pass env values to ADBDeviceHandler
- clone mozharness
- apply mozharness patch
- download the buildprops.json [1]
- export PROPERTIES_FILE=`pwd`/buildprops.json
- /tools/buildbot/bin/python scripts/scripts/androidx86_emulator_unittest.py --cfg android/androidx86.py --test-suite mochitest --download-symbols ondemand
It currently fails on the install step.
[1] https://bugzilla.mozilla.org/attachment.cgi?id=794874
Attachment #791458 -
Attachment is obsolete: true
Reporter | |
Comment 66•10 years ago
|
||
(In reply to Armen Zambrano [:armenzg] (Use needinfo flag) (Release Enginerring) (EDT/UTC-4) from comment #61)
> gbrown, I'm having trouble with staggered start up.
> The first one starts well. The second one does not.
Can you collect the emulator -debug output from a failed run and post it here? I wonder if it has anything useful.
> Are 20701 and 20700 the default sut ports?
Yes.
> Should I be redirecting the ports of the first emulator somewhere else? It
> seems that we don't redirect the default sut ports for the first emulator.
Keep in mind that
redir add tcp:20701:20701
is *not* a no-op. It "redirects" port 20701 on the emulator to port 20701 on the host, so that "telnet 127.0.0.1 20701" on the host connects to the sutagent on the emulator. We want to issue a redir for every emulator, including the first one, and we just need the host ports to be unique. Currently, emulator 1 = {20700, 20701}, emulator 2 = {20702, 20703}, etc.
> I've rebooted the host and will try again.
> Killing compiz helped again.
Just to be clear: Did you reboot, just kill compiz, or reboot + kill compiz?
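The port scheme described here can be written down as a small calculation. This is a sketch only: `port_plan` and `redir_commands` are hypothetical helper names, and the even-numbered console ports (5554, 5556, ...) follow the values seen in the logs on this bug.

```python
GUEST_SUT_PORTS = (20700, 20701)  # ports sutagent listens on inside the emulator


def port_plan(index, base_emu_port=5554, base_sut_port=20700):
    """Ports for the emulator at zero-based position `index`."""
    return {
        "emulator_port": base_emu_port + 2 * index,
        "sut_port2": base_sut_port + 2 * index,      # host side of guest 20700
        "sut_port1": base_sut_port + 2 * index + 1,  # host side of guest 20701
    }


def redir_commands(plan):
    """Console commands, in host:guest order, to issue over telnet."""
    return [
        "redir add tcp:%d:%d" % (plan["sut_port2"], GUEST_SUT_PORTS[0]),
        "redir add tcp:%d:%d" % (plan["sut_port1"], GUEST_SUT_PORTS[1]),
    ]


for i in range(2):
    plan = port_plan(i)
    print(plan["emulator_port"], redir_commands(plan))
```

The guest ports never change; only the host side is offset per emulator, which is what keeps the host ports unique.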
Assignee | ||
Comment 67•10 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #66) > > I've rebooted the host and will try again. > > > Killing compiz helped again. > > Just to be clear: Did you reboot, just kill compiz, or reboot + kill compiz? The reboot was probably unnecessary. Killing compiz is what helped. How is compiz involved in this project? Is it fine if after a failure to connect I kill compiz?
Assignee | ||
Comment 68•10 years ago
|
||
rail: we're working on running 4 android x86 emulators on the talos-linux64-ix machines. We have noticed that at times, compiz starts using 100% CPU and we have to kill it otherwise we can't connect to the emulator. Callek pointed out to me that there's some puppet comments that mention that compiz can take 100% CPU. Is there something we can do on our side to prevent hitting this bug? Or do you have more context as to why it happens? Thanks! http://mxr.mozilla.org/build/source/puppet/modules/gui/manifests/init.pp#67 http://mxr.mozilla.org/build/source/puppet/modules/gui/templates/Xsession.conf.erb#11
Flags: needinfo?(rail)
Comment 69•10 years ago
|
||
> Is there something we can do on our side to prevent hitting this bug? In bug 859867 I landed attachment 747431 [details] [diff] [review] to prevent this, but it doesn't look like it helps... > Or do you have more context as to why it happens? Last I poked this, I thought that the problem is https://bugzilla.mozilla.org/show_bug.cgi?id=859867#c23 (nvidia drivers) I hope it helps.
Flags: needinfo?(rail)
Assignee | ||
Comment 70•10 years ago
|
||
:(
12:11 armenzg: hrmm talos-linux64-ix-003 rebooted on me
12:14 Callek: armenzg: rail-lunch: well this is interesting, syslog on -003 right now:
12:14 Callek: Aug 26 09:13:29 talos-linux64-ix-003 x-session-manager[2368]: WARNING: Application 'compiz.desktop' killed by signal
12:14 Callek: Aug 26 09:13:29 talos-linux64-ix-003 x-session-manager[2368]: WARNING: App 'compiz.desktop' respawning too quickly
12:14 Callek: Aug 26 09:13:29 talos-linux64-ix-003 x-session-manager[2368]: CRITICAL: We failed, but the fail whale is dead. Sorry....
12:14 Callek expects that was armen killing it ;-)
12:15 armenzg: Callek: I just killed the process
12:15 armenzg: like 10 seconds ago
12:15 armenzg: because I'm starting the emulators again
12:15 armenzg: and I knew that it would prevent them from starting
12:16 armenzg: I don't think I killed it fast enough
12:16 armenzg now kills emulator's pid 3007
12:16 Callek: armenzg: sooo looks like the system went down due to a kernel crash
Reporter | |
Comment 71•10 years ago
|
||
(In reply to Armen Zambrano [:armenzg] (Use needinfo flag) (Release Enginerring) (EDT/UTC-4) from comment #67) > (In reply to Geoff Brown [:gbrown] from comment #66) > > > I've rebooted the host and will try again. > > > > > Killing compiz helped again. > > > > Just to be clear: Did you reboot, just kill compiz, or reboot + kill compiz? > > The reboot was probably unnecessary. Killing compiz is what helped. > > How is compiz involved in this project? Is it fine if after a failure to > connect I kill compiz? We don't interact directly with compiz: none of the scripts start/stop compiz or anything like that. I have noticed that compiz is usually the "top" cpu user while running the emulators. I expect that it is fine to kill compiz when all of the emulators are stopped. I would be hesitant to kill compiz with an active emulator running tests. I don't have much insight into compiz. I wonder if :dminor has more info?
Flags: needinfo?(dminor)
Assignee | ||
Comment 72•10 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #71) > (In reply to Armen Zambrano [:armenzg] (Use needinfo flag) (Release > Enginerring) (EDT/UTC-4) from comment #67) > > (In reply to Geoff Brown [:gbrown] from comment #66) > > > > I've rebooted the host and will try again. > > > > > > > Killing compiz helped again. > > > > > > Just to be clear: Did you reboot, just kill compiz, or reboot + kill compiz? > > > > The reboot was probably unnecessary. Killing compiz is what helped. > > > > How is compiz involved in this project? Is it fine if after a failure to > > connect I kill compiz? > > We don't interact directly with compiz: none of the scripts start/stop > compiz or anything like that. I have noticed that compiz is usually the > "top" cpu user while running the emulators. > > I expect that it is fine to kill compiz when all of the emulators are > stopped. I would be hesitant to kill compiz with an active emulator running > tests. > > I don't have much insight into compiz. I wonder if :dminor has more info? I have only been killing when the emulators are starting up. Once they have started up I have not had any trouble since I have not yet run any tests.
Comment 73•10 years ago
|
||
It might be worthwhile spending some time to get Ubuntu running without compiz. It isn't necessary and is causing problems. It should be possible to do something like this: http://askubuntu.com/questions/32447/how-do-i-disable-compiz-in-the-ubuntu-classic-session
Flags: needinfo?(dminor)
Comment 74•10 years ago
|
||
(In reply to Dan Minor [:dminor] from comment #73) > It might be worthwhile spending some time to get Ubuntu running without > compiz. It isn't necessary and is causing problems. It should be possible to > do something like this: > http://askubuntu.com/questions/32447/how-do-i-disable-compiz-in-the-ubuntu- > classic-session We thought about using XFCE in the beginning, but didn't go with this option because it doesn't represent an "average" Ubuntu user. Switching to something non-compiz may affect unit tests and talos results as well.
Comment 75•10 years ago
|
||
I thought there was going to be dedicated hardware for the Android x86 unit tests. If that is the case, then it should be safe to disable compiz for the machines running the emulators. If not, then I guess we are stuck killing it when it acts up.
Assignee | ||
Comment 76•10 years ago
|
||
I have been able to run two test suites, one on each emulator. Now, I have to run them concurrently rather than sequentially.
aki, how does this format work for you?
"test_suite_definitions": {
    "mochitest-1": {
        "args": [("--total-chunks", "8"), ("--this-chunk", "1")],
        "manifest": "androidx86.json",
    },
    "mochitest-2": {
        "args": [("--total-chunks", "8"), ("--this-chunk", "2")],
        "manifest": "androidx86.json",
    },
},
I make use of it like this (--test-suite is an appendable list):
...
self.test_suite_definitions = c['test_suite_definitions']
self.test_suites = c.get('test_suites')
for suite in self.test_suites:
    assert suite in self.test_suite_definitions
...
'--run-only-tests', self.test_suite_definitions[suite_name]["manifest"],
...
for arg_pair in self.test_suite_definitions[suite_name]["args"]:
    cmd.extend(self._build_arg(arg_pair[0], arg_pair[1]))
Flags: needinfo?(aki)
Comment 77•10 years ago
|
||
(In reply to Armen Zambrano [:armenzg] (Use needinfo flag) (Release Enginerring) (EDT/UTC-4) from comment #76) > I have been able two test suites. One on each emulator. > Now, I have to run them concurrently rather than sequentially. > > aki, how does this format work for you? > "test_suite_definitions": { > "mochitest-1": { > "args": [("--total-chunks", "8"), ("--this-chunk", "1")], > "manifest": "androidx86.json", > }, > "mochitest-2": { > "args": [("--total-chunks", "8"), ("--this-chunk", "2")], > "manifest": "androidx86.json", > }, > }, > > I make use of it like this (--test-suite is an appendable list): > ... > self.test_suite_definitions = c['test_suite_definitions'] > self.test_suites = c.get('test_suites') > for suite in self.test_suites: > assert suite in self.test_suite_definitions > ... > '--run-only-tests', self.test_suite_definitions[suite_name]["manifest"], > ... > for arg_pair in self.test_suite_definitions[suite_name]["args"]: > cmd.extend(self._build_arg(arg_pair[0], arg_pair[1])) Hm, not every argument is a pair. I think I leaned towards extra_args as a flat list because of this. Also, I'm not sure that --this-chunk will take multiple settings; it may just take the final one, so --test-suite mochitest-1 --test-suite mochitest-2 may only run chunk 2/8... though they're running in separate emulators? (Not entirely clear here.) Anyway, I would lean towards extra_args as a flat list, unless you have some specific reason for needing them in tuple pairs.
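The flat-list alternative suggested here might look like the sketch below. The "category" key and the `build_command` helper are assumptions for illustration, and "--hypothetical-flag" is a made-up placeholder that only shows how a flat list carries an argument with no value.

```python
test_suite_definitions = {
    "mochitest-1": {
        "category": "mochitest",
        "extra_args": ["--total-chunks", "8", "--this-chunk", "1"],
    },
    "mochitest-2": {
        "category": "mochitest",
        "extra_args": ["--total-chunks", "8", "--this-chunk", "2"],
    },
    # Placeholder suite: a single-token argument needs no special casing.
    "example-suite": {
        "category": "mochitest",
        "extra_args": ["--hypothetical-flag"],
    },
}


def build_command(base_cmd, suite_name, definitions):
    """Return base_cmd plus the suite's extra arguments as one argv list."""
    if suite_name not in definitions:
        raise KeyError("unknown test suite: %s" % suite_name)
    return list(base_cmd) + list(definitions[suite_name]["extra_args"])


# One command per suite, so each emulator gets its own chunk.
for name in ("mochitest-1", "mochitest-2"):
    print(build_command(["python", "runtestsremote.py"], name,
                        test_suite_definitions))
```

Building one command per suite avoids passing --this-chunk twice to the same process, which is the ambiguity raised above.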
Flags: needinfo?(aki)
Assignee | ||
Comment 78•10 years ago
|
||
I'm running two different mochitest test suites on two different emulators :) Here are the steps to follow in case you want to reproduce locally. This is more accurate than comment 65. I had to update the host with Callek's latest deployment.
* go in as root, apt-get update; apt-get install android-sdk18;
- In one session run launch2.py and set PATH and DISPLAY -- two emulators should be started
export PATH=$PATH:/tools/android-sdk18/tools:/tools/android-sdk18/platform-tools
export DISPLAY=:0.0
wget -Olaunch2.py https://bugzilla.mozilla.org/attachment.cgi?id=794869
python launch2.py
- on another session set PATH and DISPLAY as well and run the script -- this will not be needed once I pass env values to ADBDeviceHandler
- clone mozharness
- run the script
http://hg.mozilla.org/users/armenzg_mozilla.com/mozharness scripts
export PATH=$PATH:/tools/android-sdk18/tools:/tools/android-sdk18/platform-tools
export DISPLAY=:0.0
/tools/buildbot/bin/python scripts/scripts/androidx86_emulator_unittest.py --cfg android/androidx86.py --download-symbols ondemand --installer-path /builds/slave/talos-slave/test/build/fennec-26.0a1.en-US.android-i386.apk --installer-url http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-central-android-x86/1377528472/en-US/fennec-26.0a1.en-US.android-i386.apk --test-url http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-central-android-x86/1377528472/en-US/fennec-26.0a1.en-US.android-i386.tests.zip --robocop-url http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-central-android-x86/1377528472/en-US/robocop.apk --test-suite mochitest-1 --test-suite mochitest-2
Attachment #794875 -
Attachment is obsolete: true
Updated•10 years ago
|
Attachment #794869 -
Attachment mime type: text/x-python → text/plain
Assignee | ||
Comment 79•10 years ago
|
||
This tries to follow what aki recommended with respect to process manipulation (use poll() instead of wait()). I still need to adjust it so we dump the log for each process that finishes. I will continue this on Friday.
Comment 80•10 years ago
|
||
This adds the android sdk to the ubuntu hosts.
Attachment #796385 -
Flags: review?(rail)
Comment 81•10 years ago
|
||
Comment on attachment 796385 [details] [diff] [review] [puppet] add android-tools Review of attachment 796385 [details] [diff] [review]: ----------------------------------------------------------------- ::: modules/androidemulator/manifests/init.pp @@ +7,5 @@ > + # We want it on Ubuntu > + include packages::mozilla::android_sdk18 > + } > + } > +} \ No newline at end of file \ No newline at end of file Can you add one? If you want to use a separate module for the SDK, I'd suggest to add CentOS here and fix http://hg.mozilla.org/build/puppet/file/618c178fd73e/modules/signingserver/manifests/base.pp#l31 ::: modules/toplevel/manifests/slave/test/gpu.pp @@ +6,4 @@ > gui: > on_gpu => true; > } > + extra trailing space ^ @@ +7,5 @@ > on_gpu => true; > } > + > + # Android Emulators only work on gpu slaves > + include android-emulator The name doesn't match the class name.
Attachment #796385 -
Flags: review?(rail) → review-
Comment 82•10 years ago
|
||
Attachment #796385 -
Attachment is obsolete: true
Attachment #796728 -
Flags: review?(rail)
Updated•10 years ago
|
Attachment #796728 -
Flags: review?(rail) → review+
Comment 83•10 years ago
|
||
Comment on attachment 796728 [details] [diff] [review] [puppet] add android-tools v2 https://hg.mozilla.org/build/puppet/rev/5d7c4c7f5d1a
Attachment #796728 -
Flags: checked-in+
Comment 84•10 years ago
|
||
So I'm not sold on this approach, and ideally we'll move this to a tooltool mechanic, even though nothing (I know of) uses tooltool on test machines yet. It will not change often, and deploying with puppet is meant as a "get it out". This patch assumes we'd blow-away and recreate each job run, however we could also decompress with puppet and run with the same avd's each time.
Attachment #796938 -
Flags: review?(rail)
Comment 85•10 years ago
|
||
Comment on attachment 796938 [details] [diff] [review] [puppet] deploy avd's Can you also document it at https://wiki.mozilla.org/ReleaseEngineering/PuppetAgain/Modules/config?
Attachment #796938 -
Flags: review?(rail) → review+
Comment 86•10 years ago
|
||
Comment on attachment 796938 [details] [diff] [review] [puppet] deploy avd's https://hg.mozilla.org/build/puppet/rev/9569852b09bd ...and documented https://wiki.mozilla.org/index.php?title=ReleaseEngineering%2FPuppetAgain%2FModules%2Fconfig&action=historysubmit&diff=700887&oldid=675263
Attachment #796938 -
Flags: checked-in+
Comment 87•10 years ago
|
||
Comment on attachment 796938 [details] [diff] [review] [puppet] deploy avd's backed out in https://hg.mozilla.org/build/puppet/rev/ac14ecf45840 due to:
Wed Aug 28 20:11:27 -0700 2013 Puppet (err): Failed to apply catalog: Parameter source failed on File[/home/cltbld/avds/test-x86.tar.gz]: Cannot use URLs of type 'http' as source for fileserving at /etc/puppet/production/modules/androidemulator/manifests/x86.pp:21
I suspect I can work around it by using one of the many other ways to specify source => for this file, but I'll test it tomorrow after I've rested.
Attachment #796938 -
Flags: checked-in+ → checked-in-
Comment 88•10 years ago
|
||
Comment on attachment 796938 [details] [diff] [review] [puppet] deploy avd's relanded using puppet:/// rather than http:// https://hg.mozilla.org/build/puppet/rev/54046d654252
Attachment #796938 -
Flags: checked-in- → checked-in+
Comment 89•10 years ago
|
||
Comment on attachment 796728 [details] [diff] [review] [puppet] add android-tools v2 Review of attachment 796728 [details] [diff] [review]: ----------------------------------------------------------------- ::: modules/androidemulator/manifests/init.pp @@ +5,5 @@ > + case $::operatingsystem { > + Ubuntu: { > + # We want it on Ubuntu > + include packages::mozilla::android_sdk18 > + } This should have a default line, or just remove the $::operatingsystem conditional and let the package class handle failing on other platforms.
Comment 90•10 years ago
|
||
Comment on attachment 796728 [details] [diff] [review] [puppet] add android-tools v2 Review of attachment 796728 [details] [diff] [review]: ----------------------------------------------------------------- ::: modules/packages/manifests/mozilla/android_sdk18.pp @@ +5,5 @@ > +class packages::mozilla::android_sdk18 { > + case $::operatingsystem { > + Ubuntu: { > + package { > + # Built from https://github.com/rail/android-sdk Sorry, one more thing on this patch - like screenresolution, this should probably be moved to a Mozilla repo. It gets back to the still-unsolved problem of how to version Debian packaging scripts.
Comment 91•10 years ago
|
||
Comment on attachment 796938 [details] [diff] [review] [puppet] deploy avd's Review of attachment 796938 [details] [diff] [review]: ----------------------------------------------------------------- ::: modules/androidemulator/manifests/x86.pp @@ +18,5 @@ > + source => "http://${config::data_server}/repos/private/avds/test-x86-aug6.tar.gz", > + owner => $users::builder::username, > + group => $users::builder::group; > + } > + } Will this HTTP source cause the entire tarball, which is quite large, to be downloaded on every puppet run? That could be very painful. Could this be downloaded with tooltool instead, or wrapped in a .deb? At any rate, it isn't really a repository right now, so probably doesn't belong under repos/. Do these have to be installed in $HOME? We're not doing anything else in that directory anymore - could this be in /build or /tools instead? Finally, where does this tarball come from? How could an external user upgrade it, or a developer figure out what's in it, or a future relenger upgrade it to a newer version than aug6?
Comment 92•10 years ago
|
||
Oh, and why is $install_avds "yes"/"no" instead of boolean?
Assignee | ||
Comment 93•10 years ago
|
||
I have to do some more mozharness work with buildbot, hence, I had to create this patch to run Android x86 on my dev-master. What do you guys think of the builder naming? and the structure to define it? androidx86-set-# --> --test-suite jsreftest-1 --test-suite jsreftest-2 --test-suite jsreftest-3 I had to do that weird ANDROID_X86_MOZHARNESS_UNITTEST_DICT and ANDROID_X86_MOZHARNESS_UNITTEST_DICT dictionaries. It is ugly. Do you have any suggestions?
Attachment #798640 -
Flags: feedback?(bugspam.Callek)
Attachment #798640 -
Flags: feedback?(aki)
Assignee | ||
Comment 94•10 years ago
|
||
This is my latest work. I would only want to highlight the addition of the suite definitions for all of the test suites. This week I will be tackling:
* put compiz under control
* manage the avds appropriately
* report status correctly
Attachment #796213 -
Attachment is obsolete: true
Attachment #796282 -
Attachment is obsolete: true
Assignee | ||
Comment 95•10 years ago
|
||
I'm getting such weird errors. I wonder what I've changed to get this.
09:26:10 INFO - One of the test suites have finished and we're going to dump its output
09:26:10 INFO - Reading from file /tmp/tmp1MnEMs
09:26:10 INFO - Traceback (most recent call last):
09:26:10 INFO - File "/builds/slave/talos-slave/test/build/tests/mochitest/runtestsremote.py", line 707, in <module>
09:26:10 INFO - main()
09:26:10 INFO - File "/builds/slave/talos-slave/test/build/tests/mochitest/runtestsremote.py", line 532, in main
09:26:10 INFO - dm = droid.DroidSUT(options.deviceIP, options.devicePort, deviceRoot=options.remoteTestRoot)
09:26:10 INFO - File "/builds/slave/talos-slave/test/build/tests/mochitest/devicemanagerSUT.py", line 47, in __init__
09:26:10 INFO - self.getDeviceRoot()
09:26:10 INFO - File "/builds/slave/talos-slave/test/build/tests/mochitest/devicemanagerSUT.py", line 694, in getDeviceRoot
09:26:10 INFO - data = self._runCmds([{ 'cmd': 'testroot' }])
09:26:10 INFO - File "/builds/slave/talos-slave/test/build/tests/mochitest/devicemanagerSUT.py", line 152, in _runCmds
09:26:10 INFO - self._sendCmds(cmdlist, outputfile, timeout, retryLimit=retryLimit)
09:26:10 INFO - File "/builds/slave/talos-slave/test/build/tests/mochitest/devicemanagerSUT.py", line 128, in _sendCmds
09:26:10 INFO - self._doCmds(cmdlist, outputfile, timeout)
09:26:10 INFO - File "/builds/slave/talos-slave/test/build/tests/mochitest/devicemanagerSUT.py", line 175, in _doCmds
09:26:10 INFO - self._sock.connect((self.host, int(self.port)))
09:26:10 INFO - File "/usr/lib/python2.7/socket.py", line 224, in meth
09:26:10 INFO - return getattr(self._sock,name)(*args)
09:26:10 INFO - TypeError: coercing to Unicode: need string or buffer, NoneType found
Attachment #791412 -
Attachment is obsolete: true
Assignee | ||
Updated•10 years ago
|
Attachment #788293 -
Attachment is obsolete: true
Assignee | ||
Comment 96•10 years ago
|
||
This has my latest mozharness code.
- It kills compiz in advance
- It does some basic avds manipulation (still needs to be tested)
NOTE: I've found out that I need to kill compiz before trying to start any emulator (rather than try to kill it after an SUT timeout occurs).
Attachment #798649 -
Attachment is obsolete: true
Assignee | ||
Comment 97•10 years ago
|
||
Comment on attachment 798913 [details] androidx86.log.txt Ignore comment 95. I was doing something wrong.
Attachment #798913 -
Attachment is obsolete: true
Assignee | ||
Comment 98•10 years ago
|
||
Hi aki, I see that ADBDeviceHandler prints to stdout rather than using logging. http://hg.mozilla.org/build/mozharness/file/061c3d6c7b52/mozharness/mozilla/testing/device.py#l99 What should I do? Thanks!
Flags: needinfo?(aki)
Comment 99•10 years ago
|
||
(In reply to Armen Zambrano [:armenzg] (Use needinfo flag) (Release Enginerring) (EDT/UTC-4) from comment #98)
> Hi aki,
> I see that ADBDeviceHandler prints to stdout rather than using logging.
> http://hg.mozilla.org/build/mozharness/file/061c3d6c7b52/mozharness/mozilla/testing/device.py#l99
>
> What should I do? Thanks!
I don't see any print()s in that file. I'm going to guess this is the output spew from adb/devicemanager itself. I don't know what we can do about that other than
* turn off output from devicemanager, which may hide issues
* get devicemanager to accept a log object
* try redirecting stdout/stderr itself, which may be an ugly solution but might work.
* do what I did, which is let devicemanager spew to stdout, and generally ignore it.
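The third option, redirecting stdout into a logger, can be sketched with the standard library alone. This assumes Python 3.4 or later for contextlib.redirect_stdout, which is newer than the Python 2.7 seen in the logs on this bug; `LogWriter` is a hypothetical class, not something devicemanager or mozharness provides. It only captures writes made through sys.stdout in the same process, not output from child processes such as adb.

```python
import contextlib
import logging


class LogWriter(object):
    """File-like object that forwards complete lines to a logger."""

    def __init__(self, logger, level=logging.INFO):
        self.logger = logger
        self.level = level
        self._buffer = ""

    def write(self, text):
        # Buffer partial writes and emit one log record per full line.
        self._buffer += text
        while "\n" in self._buffer:
            line, self._buffer = self._buffer.split("\n", 1)
            self.logger.log(self.level, line)
        return len(text)

    def flush(self):
        # Emit whatever is left that never ended with a newline.
        if self._buffer:
            self.logger.log(self.level, self._buffer)
            self._buffer = ""


def run_with_captured_stdout(func, logger):
    """Run func() with anything it prints routed to `logger`."""
    writer = LogWriter(logger)
    with contextlib.redirect_stdout(writer):
        result = func()
    writer.flush()
    return result
```

A wrapper like this trades some ugliness for timestamps and log levels on output that would otherwise bypass the log.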
Flags: needinfo?(aki)
Assignee | ||
Comment 100•10 years ago
|
||
Comment on attachment 798914 [details] [diff] [review] [wip] androidx86.mozharness.3.diff gbrown, dminor: could I please get your feedback on this patch? I'm getting close to completion and I would like to give you a day or so to give me your feedback. I would like to ask aki for a review with all of your concerns addressed first.
Attachment #798914 -
Flags: feedback?(gbrown)
Attachment #798914 -
Flags: feedback?(dminor)
Assignee | ||
Comment 101•10 years ago
|
||
This patch shows the changes from my last mozharness patch where I try to experiment how to report that final status. This probably will take a couple of days.
Reporter | |
Comment 102•10 years ago
|
||
Comment on attachment 798914 [details] [diff] [review]
[wip] androidx86.mozharness.3.diff

Review of attachment 798914 [details] [diff] [review]:
-----------------------------------------------------------------

Looking good! I am really looking forward to seeing this running via tbpl (on some tree), even if not everything works yet.

::: configs/android/androidx86.py
@@ +20,5 @@
> + },
> + "default_actions": [
> + 'clobber',
> + 'read-buildbot-config',
> + #'setup-avds',

Why is this commented out?

@@ +42,5 @@
> + {
> + "name": "test-x86-2",
> + "device_id": "emulator-5556",
> + "http_port": "8888", # starting http port to use for the mochitest server
> + "ssl_port": "4445", # starting ssl port to use for the server

Are you sure it is okay to have the same http_port and ssl_port for all of the instances? I have always used distinct ports...I don't know if this will be a problem or not. If they are all the same, maybe these parameters can be moved out of the per-emulator config?

@@ +130,5 @@
> + "extra_args": [os.path.join('tests', 'testing', 'crashtest', 'crashtests.list')]
> + },
> + "xpcshell": {
> + "category": "xpcshell",
> + "extra_args": ["--manifest", os.path.join('..','jsreftest', 'tests', 'jstests.list')]

That's not right! The xpcshell manifest should be xpcshell/xpcshell_android.ini.

@@ +134,5 @@
> + "extra_args": ["--manifest", os.path.join('..','jsreftest', 'tests', 'jstests.list')]
> + },
> + "robocop-1": {
> + "category": "mochitest",
> + "extra_args": ["--robocop-path=.", "--robocop-ids=fennec_ids.txt", "--robocop=robocop.ini"],

Missing total-chunks, this-chunk args.

@@ +138,5 @@
> + "extra_args": ["--robocop-path=.", "--robocop-ids=fennec_ids.txt", "--robocop=robocop.ini"],
> + },
> + "robocop-2": {
> + "category": "mochitest",
> + "extra_args": ["--robocop-path=.", "--robocop-ids=fennec_ids.txt", "--robocop=robocop.ini"],

Ditto.

::: scripts/androidx86_emulator_unittest.py
@@ +58,5 @@
>
> error_list = [
> {'substr': 'FAILED (errors=', 'level': ERROR},
> {'substr': r'''Could not successfully complete transport of message to Gecko, socket closed''', 'level': ERROR},
> {'substr': 'Timeout waiting for marionette on port', 'level': ERROR},

There are a couple of errors left over from b2g here -- I'm sure they can be removed.

@@ +329,5 @@
> + '''
> + This action starts the emulators and redirects the two SUT ports for each one of them
> + '''
> + # XXX: This line is needed since I'm not rebootig the machine in between jobs
> + self._kill_processes("emulator-x86")

I thought "emulator" spawned "emulator64-x86" on 64 bit machines -- might be worth double checking the process names.

@@ +395,5 @@
> + if procs == []:
> + break
> + else:
> + self.info("#")
> + time.sleep(30)

I never like to see polling. Is this needed to avoid an output-driven timeout?
Attachment #798914 -
Flags: feedback?(gbrown) → feedback+
Reporter | |
Comment 103•10 years ago
|
||
Also...I know earlier versions were not setting minidump_stackwalk_path correctly and I did not notice any changes in your patch. We should check on that.
Assignee | ||
Comment 104•10 years ago
|
||
Thanks, gbrown, for your catches (especially the ports; I lost those changes somewhere). I've fixed them and am testing again.
Assignee | ||
Comment 105•10 years ago
|
||
> @@ +395,5 @@
> > + if procs == []:
> > + break
> > + else:
> > + self.info("#")
> > + time.sleep(30)
>
> I never like to see polling. Is this needed to avoid an output-driven
> timeout?
The printing of "#" is to avoid an output-driven timeout.
I needed to know when the process that triggers the tests finishes.
I wish I could have used a callback or a similar mechanism, but I didn't research it further.
Comment 106•10 years ago
|
||
(In reply to Armen Zambrano [:armenzg] (Use needinfo flag) (Release Enginerring) (EDT/UTC-4) from comment #105)
> The printing of "#" is to avoid an output-driven timeout.

You may want to sys.stdout.write('#') to avoid filling up the log with a timestamp + INFO + # every 30 seconds.
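The suggestion above can be sketched as follows. This is only an illustration, assuming a list of already started `subprocess.Popen` objects; the helper name `wait_with_heartbeat` and its parameters are hypothetical and are not part of mozharness.

```python
import subprocess
import sys
import time


def wait_with_heartbeat(procs, poll_interval=30, out=sys.stdout):
    """Block until every process in `procs` has exited.

    After each poll that still finds running processes, write a bare '#'
    (no timestamp, no log level) so an output-driven timeout does not fire.
    Returns the return codes in the same order as `procs`.
    """
    while True:
        running = [p for p in procs if p.poll() is None]
        if not running:
            break
        out.write('#')
        out.flush()  # push the byte out immediately instead of buffering it
        time.sleep(poll_interval)
    out.write('\n')  # end the row of '#' before normal logging resumes
    return [p.returncode for p in procs]
```

Compared with `self.info("#")`, this adds one byte per poll to the log instead of a full timestamped line.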
Assignee | ||
Updated•10 years ago
|
Attachment #794874 -
Attachment is obsolete: true
Comment 107•10 years ago
|
||
Comment on attachment 798640 [details] [diff] [review]
[wip] androidx86.configs.diff

(In reply to Armen Zambrano [:armenzg] (Use needinfo flag) (Release Enginerring) (EDT/UTC-4) from comment #93)
> What do you guys think of the builder naming? and the structure to define it?
> androidx86-set-# --> --test-suite jsreftest-1 --test-suite jsreftest-2
> --test-suite jsreftest-3

I think that makes sense. I called groups of builds 'suites' elsewhere, but we already use 'suite' here so 'set' is fine.

> I had to do that weird ANDROID_X86_MOZHARNESS_UNITTEST_DICT and
> ANDROID_X86_MOZHARNESS_UNITTEST_DICT dictionaries. It is ugly. Do you have
> any suggestions?

I don't know about the names, but the current dict assumes we're running the test chunks individually, not parallelized on a single machine, so we do require a separate config for this.

>diff --git a/mozilla-tests/BuildSlaves.py.template b/mozilla-tests/BuildSlaves.py.template
>--- a/mozilla-tests/BuildSlaves.py.template
>+++ b/mozilla-tests/BuildSlaves.py.template
>@@ -1,43 +1,44 @@
>+ "ubuntu32": "pass",
>+ "ubuntu64": "pass",
>+ "ubuntu64-b2g": "pass",
>- "tiger": "pass",
>- "w764": "pass",
>- "vista": "pass",
>+ "linux64_android-x86": "pass",

Did you intend to make these other changes? I think linux64_android-x86 is the only specifically applicable line here, right? I didn't see 'ubuntu32' in use anywhere; I didn't check on the others.

Also, nit: lots of trailing whitespace in your mobile_config.py patch :)
Attachment #798640 -
Flags: feedback?(aki) → feedback+
Assignee | ||
Comment 108•10 years ago
|
||
I've dealt with all of your feedback. I've removed some of the noise of sorting platforms so staging_config.py and production_config.py match (I think that was my original reason). I'm also reusing the ubuntu64_hw instead. I've removed a bunch of trailing white spaces.
Attachment #798640 -
Attachment is obsolete: true
Attachment #798640 -
Flags: feedback?(bugspam.Callek)
Attachment #799702 -
Flags: review?(aki)
Assignee | ||
Comment 109•10 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #103)
> Also...I know earlier versions were not setting minidump_stackwalk_path
> correctly and I did not notice any changes in your patch. We should check on
> that.

What should it be?
I see this output for the panda jobs:
./configs/android/android_panda_releng.py:MINIDUMP_STACKWALK_PATH = "/builds/minidump_stackwalk"
./configs/android/android_panda_releng.py: "minidump_stackwalk_path": MINIDUMP_STACKWALK_PATH,
./configs/android/android_panda_releng.py: "minidump_save_path": "%(abs_work_dir)s/../minidumps",
Flags: needinfo?(gbrown)
Assignee | ||
Comment 110•10 years ago
|
||
gbrown: should I delete ~/.android/avds and unpack clean templates before each run?
Assignee | ||
Comment 111•10 years ago
|
||
I've completed the coding as best as possible. Let me know if you prefer other ways to solve some of what I do. Would you mind if we landed this after you review it and iterate after that? I assume that we will need to see it running on tbpl and adjust as we see failures.
Attachment #798914 -
Attachment is obsolete: true
Attachment #799090 -
Attachment is obsolete: true
Attachment #798914 -
Flags: feedback?(dminor)
Attachment #799741 -
Flags: review?(aki)
Assignee | ||
Comment 112•10 years ago
|
||
Fixed some last minute typos.
Attachment #799741 -
Attachment is obsolete: true
Attachment #799741 -
Flags: review?(aki)
Attachment #799757 -
Flags: review?(aki)
Reporter | |
Comment 113•10 years ago
|
||
(In reply to Armen Zambrano [:armenzg] (Use needinfo flag) (Release Enginerring) (EDT/UTC-4) from comment #109)
> (In reply to Geoff Brown [:gbrown] from comment #103)
> > Also...I know earlier versions were not setting minidump_stackwalk_path
> > correctly and I did not notice any changes in your patch. We should check on
> > that.
>
> What should it be?
> I see this output for the panda jobs:
> ./configs/android/android_panda_releng.py:MINIDUMP_STACKWALK_PATH =
> "/builds/minidump_stackwalk"
> ./configs/android/android_panda_releng.py: "minidump_stackwalk_path":
> MINIDUMP_STACKWALK_PATH,
> ./configs/android/android_panda_releng.py: "minidump_save_path":
> "%(abs_work_dir)s/../minidumps",

I do not know where minidump_stackwalk lives officially, but there are some at http://mxr.mozilla.org/build/source/tools/breakpad/.

We need minidump_stackwalk_path to point to the minidump_stackwalk binary (the file itself). So if we can get http://mxr.mozilla.org/build/source/tools/breakpad/linux64/minidump_stackwalk copied to /build/minidump_stackwalk and set minidump_stackwalk_path == "/build/minidump_stackwalk", that should work.

minidump_save_path just needs to point to a directory (it can be empty) that .dmp files can be copied to - a temporary directory to hold dumps.
Flags: needinfo?(gbrown)
Reporter | |
Comment 114•10 years ago
|
||
(In reply to Armen Zambrano [:armenzg] (Use needinfo flag) (Release Enginerring) (EDT/UTC-4) from comment #110) > gbrown: should I delete ~/.android/avds and unpack clean templates before > each run? I think that would be best.
Comment 115•10 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #113)
> (In reply to Armen Zambrano [:armenzg] (Use needinfo flag) (Release
> > What should it be?
> > I see this output for the panda jobs:
> > ./configs/android/android_panda_releng.py:MINIDUMP_STACKWALK_PATH =
> > "/builds/minidump_stackwalk"
> > ./configs/android/android_panda_releng.py: "minidump_stackwalk_path":
> > MINIDUMP_STACKWALK_PATH,
> > ./configs/android/android_panda_releng.py: "minidump_save_path":
> > "%(abs_work_dir)s/../minidumps",
>
> I do not know where minidump_stackwalk lives officially, but there are some
> at http://mxr.mozilla.org/build/source/tools/breakpad/.
>
> We need minidump_stackwalk_path to point to the minidump_stackwalk binary
> (the file itself). So if we can get
> http://mxr.mozilla.org/build/source/tools/breakpad/linux64/
> minidump_stackwalk copied to /build/minidump_stackwalk and set
> minidump_stackwalk_path == "/build/minidump_stackwalk", that should work.
>
> minidump_save_path just needs to point to a directory (it can be empty) that
> .dmp files can be copied to - a temporary directory to hold dumps.

We don't need to stuff it in /builds as we do on the foopies; we do, however, want to be sure we use it. The binary depends on the host OS, not the target OS, so using the same one that the Ubuntu test slaves use is just fine. We check out tools as part of these tests, AIUI, so we can just point at the location in our local tools repo.
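One way to express this in a config is sketched below, under the assumption that the tools repository is checked out somewhere the job can reach. The `minidump_config` helper and its `tools_dir` argument are hypothetical; only the two config keys, the `breakpad/linux64/minidump_stackwalk` location and the `../minidumps` save path are taken from the comments above.

```python
import os


def minidump_config(tools_dir, abs_work_dir):
    """Build the two minidump-related config entries.

    tools_dir:    local checkout of the build/tools repo (assumed location).
    abs_work_dir: the job's work directory.
    """
    return {
        # Must point at the binary itself, not at the directory holding it.
        "minidump_stackwalk_path": os.path.join(
            tools_dir, "breakpad", "linux64", "minidump_stackwalk"),
        # Any directory works; .dmp files are copied here temporarily.
        "minidump_save_path": os.path.normpath(
            os.path.join(abs_work_dir, "..", "minidumps")),
    }
```

Because the binary is chosen by host OS, the same `linux64` entry would apply to any emulator target run from a 64-bit Linux slave.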
Updated•10 years ago
|
Attachment #799702 -
Flags: review?(aki) → review+
Comment 116•10 years ago
|
||
Comment on attachment 799757 [details] [diff] [review]
androidx86.mozharness.4.diff

Awesome work, Armen! I'm pretty impressed you got parallel processes working... I'd love to see that as a generic helper object and logger, but that's definitely out of scope here.

There are a lot of comments below. You can land after fixing, or I'm happy to re-review after changes.

(In reply to Armen Zambrano [:armenzg] (Use needinfo flag) (Release Enginerring) (EDT/UTC-4) from comment #111)
> Would you mind if we landed this after you review it and iterate after that?
> I assume that we will need to see it running on tbpl and adjust as we see
> failures.

That's ok. You might have to make some of these fixes to actually have this runnable, though.

Pyflakes says:
scripts/androidx86_emulator_unittest.py:10: 're' imported but unused
scripts/androidx86_emulator_unittest.py:24: 'BaseErrorList' imported but unused
scripts/androidx86_emulator_unittest.py:25: 'ERROR' imported but unused
scripts/androidx86_emulator_unittest.py:375: undefined name 'c'
scripts/androidx86_emulator_unittest.py:397: local variable 'joint_return_code' is assigned to but never used

The undefined name 'c' is probably going to break things.

>diff --git a/configs/android/androidx86.py b/configs/android/androidx86.py
>new file mode 100644
>--- /dev/null
>+++ b/configs/android/androidx86.py
>@@ -0,0 +1,191 @@
>+import os
>+
>+config = {
>+ "buildbot_json_path": "buildprops.json",
>+ "host_utils_url": "http://bm-remote.build.mozilla.org/tegra/tegra-host-utils.Linux.742597.zip",

We may want this url in-tree at some point. Not a blocker.

>+ "fennec_package_name": "org.mozilla.fennec",

This will work for m-c level branches, but not as they ride the trains. I think there should be a text file inside the apk (package-name.txt?) that says what the app name is. (there is for android, not sure if that carried over to android-x86) Reading that file will make this work across trains, or for developers' builds, or on try, no matter which train the developer pushed from.

We'll have to fix this before we can enable this on Aurora, or really have it be useful on Try. This doesn't block rolling out to Cedar, but we should probably fix before rolling out further.

>+ "test_suite_definitions": {
>+ "mochitest-1": {
>+ "category": "mochitest",
>+ "extra_args": ["--total-chunks", "8", "--this-chunk", "1", "--run-only-tests", "androidx86.json"],
>+ },
<snip>
>+ "suite_definitions": {
>+ "mochitest": {
>+ "run_filename": "runtestsremote.py",
>+ "options": ["--autorun", "--close-when-done", "--dm_trans=sut",
>+ "--console-level=INFO", "--app=%(app)s", "--remote-webserver=%(remote_webserver)s",
>+ "--xre-path=%(xre_path)s", "--utility-path=%(utility_path)s",
>+ "--deviceIP=%(device_ip)s", "--devicePort=%(device_port)s",
>+ "--http-port=%(http_port)s", "--ssl-port=%(ssl_port)s",
>+ "--certificate-path=%(certs_path)s", "--symbols-path=%(symbols_path)s"
>+ ],
>+ },

At some point we may want these in-tree, like talos.json. Again, not a blocker.

>+sleeptime = 60

This might be a good thing to be able to configure, with a default of 60 (or whatever).

>+ def _redirectSUT(self, emuport, sutport1, sutport2):
>+ '''
>+ This redirects the default SUT ports for a given emulator.
>+ This is needed if more than one emulator is started.
>+ '''

This is interesting... temporary workaround or permanent solution?
You might want a self.info() at the beginning, maybe not.

>+ def _post_fatal(self, message=None, exit_code=None):
>+ """ After we call fatal(), run this method before exiting.
>+ """
>+ self._kill_processes("emulator64-x86")
>+
>+ # XXX aki, I' not sure exactly what this block is for
>+ if 'notify' in self.actions:
>+ self.notify(message=message, fatal=True)
>+ self.copy_logs_to_upload_dir()

This is to send me email and save logs for the hg-git process. You don't need this block.

>+ joint_return_code = 0
>+ while True:
>+ for p in procs:
>+ return_code = p["process"].poll()
>+ if return_code!=None:
>+ self.info("##### %s log begins" % p["suite_name"])

You may want to sys.stdout.write('\n') before this self.info(), so it doesn't show up on screen to the right of a million #'s.

>+ if return_code !=0:
>+ joint_return_code=1

I think you need to do something with this.

> if __name__ == '__main__':
> emulatorTest = Androidx86EmulatorTest()
>- emulatorTest.run_and_exit()
>+ emulatorTest.run()

This should be run_and_exit().
Attachment #799757 -
Flags: review?(aki) → review+
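The review point about `joint_return_code` (it is computed but never used) can be illustrated with a minimal sketch. It assumes each suite is a plain command line started with `subprocess.Popen`; the function name `run_all` and the shape of `commands` are hypothetical and do not reflect the actual mozharness script.

```python
import subprocess
import sys
import time


def run_all(commands, poll_interval=5):
    """Start every command, wait for all of them, return a joint status.

    `commands` maps a suite name to an argv list. The result is 0 only if
    every suite exited with 0, otherwise 1.
    """
    pending = {name: subprocess.Popen(argv) for name, argv in commands.items()}
    joint_return_code = 0
    while pending:
        for name in list(pending):       # copy: entries are deleted below
            return_code = pending[name].poll()
            if return_code is None:
                continue                 # this suite is still running
            if return_code != 0:
                joint_return_code = 1
            print("##### %s finished with %d" % (name, return_code))
            del pending[name]
        if pending:
            time.sleep(poll_interval)
    return joint_return_code
```

A caller would then pass the value on, for example `sys.exit(run_all(...))`, so that one failing suite makes the whole job fail.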
Assignee | ||
Comment 117•10 years ago
|
||
Attachment #790394 -
Attachment is obsolete: true
Attachment #790394 -
Flags: review?(bugspam.Callek)
Attachment #800164 -
Flags: review?(bugspam.Callek)
Assignee | ||
Comment 118•10 years ago
|
||
So sad :( "Output exceeded 52428800 bytes, remaining output has been truncated" This happens when all of the emulators do not actually fail right away.
Assignee | ||
Comment 119•10 years ago
|
||
Updating the maxLogSize is very unwanted since it affects the performance of the masters.
Plan to mitigate comment 118:
- pull verbose test jobs into their own separate Androidx86 test set (run 1 emulator job instead of 4)
Plan to fix this issue (might not be implemented this time around):
- do not output test results -- output only the test summary
- upload test log with blobber
- TinderboxPrint link to full log
- Teach tbpl how to follow those URLs and parse those
Assignee | ||
Comment 120•10 years ago
|
||
Landed: https://hg.mozilla.org/build/mozharness/rev/3b926f407e76 I will follow up with all feedback from comment 114, comment 116 and the minidump related comments.
Updated•10 years ago
|
Attachment #800164 -
Flags: review?(bugspam.Callek) → review+
Assignee | ||
Comment 121•10 years ago
|
||
Comment on attachment 800164 [details] [diff] [review] androidx86.tools.diff - We add android-x86 on the linux masters instead of the mobile ones http://hg.mozilla.org/build/tools/rev/f2f79ce56851
Attachment #800164 -
Flags: checked-in+
Assignee | ||
Comment 122•10 years ago
|
||
Comment on attachment 799702 [details] [diff] [review] androidx86.configs.2.diff http://hg.mozilla.org/build/buildbot-configs/rev/2da9aa1732c5
Attachment #799702 -
Flags: checked-in+
Assignee | ||
Updated•10 years ago
|
Attachment #799757 -
Flags: checked-in+
Assignee | ||
Comment 123•10 years ago
|
||
Merged to production branch. Live in production.
Comment 124•10 years ago
|
||
(In reply to Armen Zambrano [:armenzg] (Use needinfo flag) (Release Enginerring) (EDT/UTC-4) from comment #118)
> So sad :(
>
> "Output exceeded 52428800 bytes, remaining output has been truncated"
>
> This happens when all of the emulators do not actually fail right away.

You can also do the sleep(5), but only sys.stdout.write('#') if more than X amount of time has passed since the last one (60 seconds? 5min?)
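A rough sketch of that idea: poll often, but emit the marker rarely. The `Heartbeat` class is hypothetical and not taken from the patch; the injectable `clock` argument exists only so the behavior can be checked without waiting.

```python
import time


class Heartbeat:
    """Report when at least `min_gap` seconds have passed since the last marker."""

    def __init__(self, min_gap=60, clock=time.monotonic):
        self.min_gap = min_gap
        self.clock = clock
        self.last = clock()

    def due(self):
        """Return True (and reset the timer) if a marker should be written now."""
        now = self.clock()
        if now - self.last >= self.min_gap:
            self.last = now
            return True
        return False
```

In a polling loop this would look like `time.sleep(5)` followed by `if hb.due(): sys.stdout.write('#'); sys.stdout.flush()`, which keeps the job responsive while writing only one character per minute.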
Assignee | ||
Comment 125•10 years ago
|
||
> >+ def _redirectSUT(self, emuport, sutport1, sutport2):
> >+ '''
> >+ This redirects the default SUT ports for a given emulator.
> >+ This is needed if more than one emulator is started.
> >+ '''
>
> This is interesting... temporary workaround or permanent solution?
> You might want a self.info() at the beginning, maybe not.
>
aki, what do you mean with self.info()? It is permanent. Each test job will talk to the pair of sut ports redirected for each emulator.
Flags: needinfo?(aki)
Assignee | ||
Comment 126•10 years ago
|
||
I can't see the jobs running on tbpl or buildapi even though in the "hidden builders" section I can see the androidx86 set jobs. https://tbpl.mozilla.org/?tree=Ash&rev=0d4ae6057ef5&jobname=Android.*&showall=1 https://secure.pub.build.mozilla.org/buildapi/self-serve/ash/rev/0d4ae6057ef5 I will wait a bit before filing a bug.
Comment 127•10 years ago
|
||
(In reply to Armen Zambrano [:armenzg] (Use needinfo flag) (Release Enginerring) (EDT/UTC-4) from comment #125)
> > >+ def _redirectSUT(self, emuport, sutport1, sutport2):
> > >+ '''
> > >+ This redirects the default SUT ports for a given emulator.
> > >+ This is needed if more than one emulator is started.
> > >+ '''
> >
> > This is interesting... temporary workaround or permanent solution?
> > You might want a self.info() at the beginning, maybe not.
> >
>
> aki, what do you mean with self.info()? It is permanent. Each test job will
> talk to the pair of sut ports redirected for each emulator.

self.info("Attempting to redirect ports for X to ...")
Flags: needinfo?(aki)
Assignee | ||
Comment 128•10 years ago
|
||
We can see the jobs here: https://tbpl.mozilla.org/?tree=Ash&jobname=Android%20x86%20Emulator%20ash%20opt%20test%20androidx86-set
Status reporting for TBPL is not yet working properly.
Assignee | ||
Comment 129•10 years ago
|
||
(In reply to Aki Sasaki [:aki] from comment #127)
> (In reply to Armen Zambrano [:armenzg] (Use needinfo flag) (Release
> Enginerring) (EDT/UTC-4) from comment #125)
> > > >+ def _redirectSUT(self, emuport, sutport1, sutport2):
> > > >+ '''
> > > >+ This redirects the default SUT ports for a given emulator.
> > > >+ This is needed if more than one emulator is started.
> > > >+ '''
> > >
> > > This is interesting... temporary workaround or permanent solution?
> > > You might want a self.info() at the beginning, maybe not.
> > >
> >
> > aki, what do you mean with self.info()? It is permanent. Each test job will
> > talk to the pair of sut ports redirected for each emulator.
>
> self.info("Attempting to redirect ports for X to ...")

Does this do?
http://hg.mozilla.org/build/mozharness/file/default/scripts/androidx86_emulator_unittest.py#l159
self.info(" Attempt #%d to redirect ports: (%d, %d, %d)" % \
(attempts, emuport, sutport1, sutport2))
Assignee | ||
Comment 130•10 years ago
|
||
I'm now going to be pushing changes to ash-mozharness instead of my own user repo. I've triggered a new set of jobs based on: http://hg.mozilla.org/users/asasaki_mozilla.com/ash-mozharness/rev/460926e7ed43 The results will show up on the second run of these: https://tbpl.mozilla.org/?tree=Ash&jobname=Android%20x86%20Emulator%20ash%20opt%20test%20androidx86-set
Comment 131•10 years ago
|
||
(In reply to Armen Zambrano [:armenzg] (Use needinfo flag) (Release Enginerring) (EDT/UTC-4) from comment #129)
> (In reply to Aki Sasaki [:aki] from comment #127)
> > (In reply to Armen Zambrano [:armenzg] (Use needinfo flag) (Release
> > Enginerring) (EDT/UTC-4) from comment #125)
> > > > >+ def _redirectSUT(self, emuport, sutport1, sutport2):
> > > > >+ '''
> > > > >+ This redirects the default SUT ports for a given emulator.
> > > > >+ This is needed if more than one emulator is started.
> > > > >+ '''
> > > >
> > > > This is interesting... temporary workaround or permanent solution?
> > > > You might want a self.info() at the beginning, maybe not.
> > > >
> > >
> > > aki, what do you mean with self.info()? It is permanent. Each test job will
> > > talk to the pair of sut ports redirected for each emulator.
> >
> > self.info("Attempting to redirect ports for X to ...")
>
> Does this do?
> http://hg.mozilla.org/build/mozharness/file/default/scripts/
> androidx86_emulator_unittest.py#l159
> self.info(" Attempt #%d to redirect ports: (%d, %d, %d)" % \
> (attempts, emuport, sutport1, sutport2))

Ah. Yes.
Assignee | ||
Comment 132•10 years ago
|
||
Results on tbpl:
- m-2 crashes: https://tbpl.mozilla.org/php/getParsedLog.php?id=27485148&tree=Ash&full=1#error99
06:04:43 WARNING - PROCESS-CRASH | /tests/content/canvas/test/test_canvas.html | application crashed [Unknown top frame]

gbrown, I would not look at other logs until I fix a few more things. Feel free to look into the m-2 crash.
It seems that I still need to set MINIDUMP_STACKWALK correctly:
> 06:04:43 INFO - MINIDUMP_STACKWALK not set, can't process dump.
Assignee | ||
Comment 133•10 years ago
|
||
- xpcshell is failing to 'mkdr /mnt/sdcard/tests'; err='Could not create the directory /mnt/sdcard/tests'
https://tbpl.mozilla.org/php/getParsedLog.php?id=27487615&tree=Ash&full=1#error0
- The command used is:
/builds/slave/talos-slave/test/build/venv/bin/python /builds/slave/talos-slave/test/build/tests/xpcshell/remotexpcshelltests.py --deviceIP=127.0.0.1 --xre-path=/builds/slave/talos-slave/test/build/hostutils/xre --testing-modules-dir=/builds/slave/talos-slave/test/build/tests/modules --apk=/builds/slave/talos-slave/test/build/fennec-26.0a1.en-US.android-i386.apk --no-logfiles --manifest xpcshell/xpcshell_android.ini
Assignee | ||
Comment 134•10 years ago
|
||
* robocop is failing with this:
07:36:49 INFO - Running on test-x86-3 the command /builds/slave/talos-slave/test/build/venv/bin/python /builds/slave/talos-slave/test/build/tests/mochitest/runtestsremote.py --autorun --close-when-done --dm_trans=sut --console-level=INFO --app=org.mozilla.fennec --remote-webserver=10.0.2.2 --xre-path=/builds/slave/talos-slave/test/build/hostutils/xre --utility-path=/builds/slave/talos-slave/test/build/hostutils/bin --deviceIP=127.0.0.1 --devicePort=20705 --http-port=8858 --ssl-port=4458 --certificate-path=/builds/slave/talos-slave/test/build/tests/certs --symbols-path=crashreporter-symbols.zip --total-chunks 2 --this-chunk 1 --robocop-path=. --robocop-ids=fennec_ids.txt --robocop=robocop.ini
07:37:19 INFO - ERROR: Unable to find robocop APK './robocop.apk'
07:37:19 INFO - ERROR: Invalid options specified, use --help for a list of valid options
07:37:19 INFO - ERROR: Unable to find robocop APK './robocop.apk'
07:37:19 INFO - ERROR: Invalid options specified, use --help for a list of valid options
07:37:19 INFO - TinderboxPrint: robocop-2<br/><em class="testfail">T-FAIL</em>
I will adjust the robocop path, but I wonder what the invalid option is.
Assignee | ||
Comment 135•10 years ago
|
||
Summary of test results:
- m-1 to m-4 are running [1]:
TinderboxPrint: mochitest-2: T-FAIL CRASH
TinderboxPrint: mochitest-1: 32373/1/63
TinderboxPrint: mochitest-4: 37567/3/200
TinderboxPrint: mochitest-3: 19809/4/55
- m-5 to m-8 are running [2]:
TinderboxPrint: mochitest-7: 13070/10/1923
TinderboxPrint: mochitest-8: 73338/0/61
TinderboxPrint: mochitest-5: 39233/4/611
TinderboxPrint: mochitest-6: 12771/0/27

My own notes:
- The TBPL status seems to work for set-1 and set-2
- We have some jobs retrying and I want to understand why
- It seems that minidump is not quite there, but I have made some progress:
08:32:58 INFO - Crash dump filename: /tmp/tmpY_ydDv/242a7c9a-2af1-0a96-7cb91f82-77bfb3dc.dmp
08:32:58 INFO - MINIDUMP_STACKWALK binary not found: /talos-slave/test/build/venv/lib/python2.7/site-packages/talos/breakpad/linux64/minidump_stackwalk

[1] https://tbpl.mozilla.org/php/getParsedLog.php?id=27491043&tree=Ash&full=1
[2] https://tbpl.mozilla.org/php/getParsedLog.php?id=27490746&tree=Ash&full=1
Comment 136•10 years ago
|
||
something here is in production
Reporter | |
Comment 137•10 years ago
|
||
(In reply to Armen Zambrano [:armenzg] (Use needinfo flag) (Release Enginerring) (EDT/UTC-4) from comment #133)
> - xpcshell is failing to 'mkdr /mnt/sdcard/tests'; err='Could not create the
> directory /mnt/sdcard/tests'
> https://tbpl.mozilla.org/php/getParsedLog.
> php?id=27487615&tree=Ash&full=1#error0
> - The command used is: /builds/slave/talos-slave/test/build/venv/bin/python
> /builds/slave/talos-slave/test/build/tests/xpcshell/remotexpcshelltests.py
> --deviceIP=127.0.0.1
> --xre-path=/builds/slave/talos-slave/test/build/hostutils/xre
> --testing-modules-dir=/builds/slave/talos-slave/test/build/tests/modules
> --apk=/builds/slave/talos-slave/test/build/fennec-26.0a1.en-US.android-i386.
> apk --no-logfiles --manifest xpcshell/xpcshell_android.ini

Oops - there is no --devicePort in that command, so it is probably running on 20701, which is already running a mochitest job!!
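To make the port collision concrete, here is a small sketch of per-emulator port assignment. The base values and the stride of 2 are inferred from commands logged elsewhere in this bug (test-x86-1 with 20701/8854/4454, test-x86-3 with 20705/8858/4458); they are assumptions, and the helper names `emulator_ports` and `device_args` are hypothetical rather than part of the real config.

```python
def emulator_ports(index, sut_base=20701, http_base=8854, ssl_base=4454):
    """Return the ports for the emulator at 0-based `index`.

    The defaults and the stride of 2 are guesses based on logged commands,
    not values read from the actual androidx86.py config.
    """
    offset = 2 * index
    return {
        "device_port": sut_base + offset,
        "http_port": http_base + offset,
        "ssl_port": ssl_base + offset,
    }


def device_args(index):
    """Build the device arguments every remote harness command should carry."""
    ports = emulator_ports(index)
    return ["--deviceIP=127.0.0.1", "--devicePort=%d" % ports["device_port"]]
```

If every harness (mochitest, reftest, xpcshell) builds its command from the same helper, no suite can silently fall back to the default port of another emulator.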
Assignee | ||
Comment 139•10 years ago
|
||
It seems that we get "command timed out: 2400 seconds without output, attempting to kill" either on the install step or on run-tests for these two sets, regardless of whether we use staggered_startup or not.
https://tbpl.mozilla.org/php/getParsedLog.php?id=27499603&tree=Ash&full=1
https://tbpl.mozilla.org/php/getParsedLog.php?id=27499150&tree=Ash&full=1
FTR, sets 1, 2 & 4 have been succeeding with 0 seconds of staggered startup:
http://hg.mozilla.org/users/asasaki_mozilla.com/ash-mozharness/file/4509017b5815/configs/android/androidx86.py#l17
https://tbpl.mozilla.org/?tree=Ash&jobname=Android%20x86%20Emulator%20ash%20opt%20test%20androidx86-set-3
https://tbpl.mozilla.org/?tree=Ash&jobname=Android%20x86%20Emulator%20ash%20opt%20test%20androidx86-set-5
Assignee | ||
Comment 140•10 years ago
|
||
I will leave the other patch as the one for feedback since it addressed aki's concerns. This one will be the wip which I will ask review for.
Assignee | ||
Comment 141•10 years ago
|
||
Summary:
* m-1 to m-8 are running well [1][2]
** gbrown will update androidx86.json to clear some test failures
* m-2 is crashing consistently
** I think that if I fix the minidump_stackwalk properly it will help fix it [3]
* The TBPL status seems to work for set-1 and set-2
* set-4 was the job that would normally RETRY
** It seems that it does not do it anymore after mochitest-gl stopped failing
** probably something in the output was triggering it
* I sometimes see a command timeout of 2400 secs during the run_tests step
** I don't know if it is because we use "sys.stdout.write('#')" instead of "self.info('#')"

aki, if the timeout issue is not related to the usage of sys.stdout.write('#'), do you know if there is a safe way to see the last lines of each stdout before buildbot kills the job?

[1] https://tbpl.mozilla.org/php/getParsedLog.php?id=27491043&tree=Ash&full=1
[2] https://tbpl.mozilla.org/php/getParsedLog.php?id=27490746&tree=Ash&full=1
[3] 08:32:58 INFO - Crash dump filename: /tmp/tmpY_ydDv/242a7c9a-2af1-0a96-7cb91f82-77bfb3dc.dmp
08:32:58 INFO - MINIDUMP_STACKWALK binary not found: /talos-slave/test/build/venv/lib/python2.7/site-packages/talos/breakpad/linux64/minidump_stackwalk
Comment 142•10 years ago
|
||
Comment on attachment 800865 [details] [diff] [review]
androidx86.mozharness.diff

Thanks for the pyflakes cleanup; that was annoying me. Getting this:
scripts/androidx86_emulator_unittest.py:373: local variable 'dirs' is assigned to but never used

> def _post_fatal(self, message=None, exit_code=None):
> """ After we call fatal(), run this method before exiting.
> """
> self._kill_processes("emulator64-x86")
>-
>- # XXX aki, I' not sure exactly what this block is for
>- if 'notify' in self.actions:
>- self.notify(message=message, fatal=True)
> self.copy_logs_to_upload_dir()

I don't think you have to copy the logs either.

> def run_tests(self):
> """
> Run the tests
> """
> procs = []
>
> emulator_index = 0
> for suite_name in self.test_suites:
> procs.append(self._trigger_test(suite_name, emulator_index))
> emulator_index+=1
>
>- joint_return_code = 0
>+ joint_tbpl_status = None
>+ joint_log_level = None
> while True:
> for p in procs:
> return_code = p["process"].poll()
> if return_code!=None:

If you're having problems timing out, I would put in debug output around both the procs.append above, and the 'for p in procs' here. I'm guessing something around here is hanging.

If it helps to switch to self.info() instead of sys.stdout.write() that's fine, I'm just aware that that will be a) way more verbose and b) create a ton more lines of log. Still, you'll have timestamps and it's temporary.

>+ # aki: do I need this?
> # I'm not using the concept of "plain-#" like other jobs; do I need this logging?
> # e.g. The mochitest suite: plain4 ran with return status: SUCCESS
> #self.log("The %s suite: %s ran with return status: %s" %
> # (suite_category, p["suite_name"], tbpl_status), level=log_level)
> self.info("##### %s log ends" % p["suite_name"])

I'm not sure? Does your log have a good summary at the end?
Attachment #800865 -
Flags: feedback?(aki) → feedback+
Assignee | ||
Comment 143•10 years ago
|
||
gbrown, is this mochitest-gl cmd built correctly?
10:45:00 INFO - Running on test-x86-1 the command /builds/slave/talos-slave/test/build/venv/bin/python /builds/slave/talos-slave/test/build/tests/mochitest/runtestsremote.py --autorun --close-when-done --dm_trans=sut --console-level=INFO --app=org.mozilla.fennec --remote-webserver=10.0.2.2 --xre-path=/builds/slave/talos-slave/test/build/hostutils/xre --utility-path=/builds/slave/talos-slave/test/build/hostutils/bin --deviceIP=127.0.0.1 --devicePort=20701 --http-port=8854 --ssl-port=4454 --certificate-path=/builds/slave/talos-slave/test/build/tests/certs --symbols-path=crashreporter-symbols.zip --test-path content/canvas/test/webgl --run-only-tests androidx86.json

What about xpcshell?
10:45:00 INFO - Running on test-x86-2 the command /builds/slave/talos-slave/test/build/venv/bin/python /builds/slave/talos-slave/test/build/tests/xpcshell/remotexpcshelltests.py --deviceIP=127.0.0.1 --devicePort=20703 --xre-path=/builds/slave/talos-slave/test/build/hostutils/xre --testing-modules-dir=/builds/slave/talos-slave/test/build/tests/modules --apk=/builds/slave/talos-slave/test/build/fennec-26.0a1.en-US.android-i386.apk --no-logfiles --manifest xpcshell/tests/xpcshell_android.ini
Flags: needinfo?(gbrown)
Assignee | ||
Comment 144•10 years ago
|
||
> What about xpcshell?
Ignore this last one, I have to look into an incorrect path:
10:51:00 INFO - IOError: Missing files: xpcshell/tests/xpcshell_android.ini
Reporter | |
Comment 145•10 years ago
|
||
(In reply to Armen Zambrano [:armenzg] (Use needinfo flag) (Release Enginerring) (EDT/UTC-4) from comment #143)
> gbrown, is this mochitest-gl cmd built correctly?

That looks correct, except you should remove "--run-only-tests androidx86.json"
Flags: needinfo?(gbrown)
Reporter | |
Comment 146•10 years ago
|
||
Relative paths always make me nervous. I would prefer to see an absolute path for content/canvas/test/webgl and xpcshell/tests/xpcshell_android.ini.
Assignee | ||
Comment 147•10 years ago
|
||
Added feedback from aki. Added full paths. I hope to have fixed the minidumps situation. Other fixes. I'm aiming to get xpcshell, jsreftest and reftests running by the end of Tuesday.
Attachment #800865 -
Attachment is obsolete: true
Attachment #800936 -
Attachment is obsolete: true
Comment 148•10 years ago
|
||
(In reply to Dustin J. Mitchell [:dustin] from comment #89) ...to comment 92 are all answered in Bug 913011
Assignee | ||
Comment 149•10 years ago
|
||
gbrown, can you please look at this?
https://tbpl.mozilla.org/php/getParsedLog.php?id=27648536&tree=Ash&full=1#error0
09:41:43 INFO - REFTEST TEST-UNEXPECTED-FAIL | | HTTP ERROR : 404
09:41:43 INFO - REFTEST TEST-UNEXPECTED-FAIL | | EXCEPTION: Error 6 in manifest file http://10.0.2.2:8854//builds/slave/talos-slave/test/build/tests/reftest/tests/layout/reftests/reftest.list line 1

I also see that mochitest-gl has gone a bit further:
https://tbpl.mozilla.org/php/getParsedLog.php?id=27598902&tree=Ash&full=1#error16
but not a clean run:
15:08:58 INFO - Traceback (most recent call last):
15:08:58 INFO - File "/builds/slave/talos-slave/test/build/tests/mochitest/runtestsremote.py", line 689, in main
15:08:58 INFO - retVal = mochitest.runTests(options)
15:08:58 INFO - File "/builds/slave/talos-slave/test/build/tests/mochitest/runtests.py", line 624, in runTests
15:08:58 INFO - self.cleanup(manifest, options)
15:08:58 INFO - File "/builds/slave/talos-slave/test/build/tests/mochitest/runtestsremote.py", line 243, in cleanup
15:08:58 INFO - if self._dm.fileExists(self.remoteLog):
15:08:58 INFO - File "/builds/slave/talos-slave/test/build/tests/mochitest/devicemanagerSUT.py", line 404, in fileExists
15:08:58 INFO - return filename in self.listFiles(containingpath)
15:08:58 INFO - File "/builds/slave/talos-slave/test/build/tests/mochitest/devicemanagerSUT.py", line 408, in listFiles
15:08:58 INFO - if not self.dirExists(rootdir):
15:08:58 INFO - File "/builds/slave/talos-slave/test/build/tests/mochitest/devicemanagerSUT.py", line 389, in dirExists
15:08:58 INFO - ret = self._runCmds([{ 'cmd': 'isdir ' + remotePath }]).strip()
15:08:58 INFO - File "/builds/slave/talos-slave/test/build/tests/mochitest/devicemanagerSUT.py", line 152, in _runCmds
15:08:58 INFO - self._sendCmds(cmdlist, outputfile, timeout, retryLimit=retryLimit)
15:08:58 INFO - File "/builds/slave/talos-slave/test/build/tests/mochitest/devicemanagerSUT.py", line 134, in _sendCmds
15:08:58 INFO - raise err
15:08:58 INFO - DMError: Automation Error: Timeout in command isdir /mnt/sdcard/tests/logs
15:08:58 INFO - Automation Error: Exception caught while running tests

I will trigger a new fresh set of builds based on:
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&jobname=Android&rev=074ec56640f6
Thanks!
Flags: needinfo?(gbrown)
Reporter | |
Comment 150•10 years ago
|
||
"My" reftests run much better -- but currently with lots of errors! -- with:
/home/cltbld/tests/scripts/scripts/build/venv/bin/python remotereftest.py --app=org.mozilla.fennec --ignore-window-size --remote-webserver 10.0.2.2 --xre-path /home/cltbld/tests/scripts/scripts/build/hostutils/xre --utility-path /home/cltbld/tests/scripts/scripts/build/hostutils/bin --deviceIP 127.0.0.1 --devicePort 20701 --http-port 8888 --ssl-port 4445 --httpd-path reftest/components --total-chunks 10 --this-chunk 1 --symbols-path crashreporter-symbols.zip tests/layout/reftests/reftest.list
Compared to "your" reftests:
/builds/slave/talos-slave/test/build/venv/bin/python /builds/slave/talos-slave/test/build/tests/reftest/remotereftest.py --app=org.mozilla.fennec --ignore-window-size --remote-webserver=10.0.2.2 --xre-path=/builds/slave/talos-slave/test/build/hostutils/xre --utility-path=/builds/slave/talos-slave/test/build/hostutils/bin --deviceIP=127.0.0.1 --devicePort=20701 --http-port=8854 --ssl-port=4454 --httpd-path reftest/components --total-chunks 4 --this-chunk 1 /builds/slave/talos-slave/test/build/tests/reftest/tests/layout/reftests/reftest.list
The only significant difference I see is your full path for reftest.list -- in this case, I think we need a relative path.
Also, I found that reftests run much slower on emulator -- I expect you need at least 10 chunks to avoid 60-minute timeouts.
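For readers unfamiliar with the --total-chunks / --this-chunk options used above, here is a rough sketch of the idea, assuming a plain contiguous split of an ordered test list. `chunk_tests` is a hypothetical helper written for illustration; it is not the reftest harness's actual chunking code, which may divide tests differently.

```python
def chunk_tests(tests, total_chunks, this_chunk):
    """Return the tests belonging to 1-based chunk `this_chunk`.

    Hypothetical helper: splits `tests` into `total_chunks` contiguous
    pieces whose sizes differ by at most one.
    """
    if not 1 <= this_chunk <= total_chunks:
        raise ValueError("this_chunk must be between 1 and total_chunks")
    base, extra = divmod(len(tests), total_chunks)
    # The first `extra` chunks each take one additional test.
    start = (this_chunk - 1) * base + min(this_chunk - 1, extra)
    end = start + base + (1 if this_chunk <= extra else 0)
    return tests[start:end]


# Made-up test names, only to show how chunk sizes shrink as chunks increase.
tests = ["test-%03d" % i for i in range(25)]
sizes_4 = [len(chunk_tests(tests, 4, n)) for n in range(1, 5)]
sizes_10 = [len(chunk_tests(tests, 10, n)) for n in range(1, 11)]
```

Going from 4 to 10 chunks gives each job well under half as many tests, which is the point of the suggestion when each job has a 60-minute limit.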
Flags: needinfo?(gbrown)
Reporter | |
Comment 151•10 years ago
|
||
More evidence for that relative path... My log has lines like:
INFO - REFTEST TEST-START | http://10.0.2.2:8888/tests/layout/reftests/reftest-sanity/test-async.xul
Compared to:
INFO - REFTEST TEST-UNEXPECTED-FAIL | | EXCEPTION: Error 6 in manifest file http://10.0.2.2:8854//builds/slave/talos-slave/test/build/tests/reftest/tests/layout/reftests/reftest.list line 1
This isn't going to work:
http://10.0.2.2:8854//builds/...
                    ^^
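A small sketch of why the absolute manifest path produces the doubled slash. It assumes the harness appends the manifest argument to the web server root with a "/" separator; `manifest_url` and `relative_manifest` are hypothetical helpers, not functions from remotereftest.py.

```python
import posixpath


def manifest_url(server, port, manifest_path):
    # Assumption: the manifest argument is appended verbatim after "/".
    return "http://%s:%s/%s" % (server, port, manifest_path)


def relative_manifest(manifest_path, reftest_root):
    # Express the manifest relative to the directory served by the web server.
    return posixpath.relpath(manifest_path, reftest_root)


root = "/builds/slave/talos-slave/test/build/tests/reftest"
absolute = root + "/tests/layout/reftests/reftest.list"

bad_url = manifest_url("10.0.2.2", 8854, absolute)
good_url = manifest_url("10.0.2.2", 8854, relative_manifest(absolute, root))
```

With the absolute path, the URL starts with `http://10.0.2.2:8854//builds/...`, matching the failing log line; with the relative path it has the same shape as the URLs in the working log.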
Assignee | ||
Comment 152•10 years ago
|
||
Switching back to relative paths. We're testing the new inbound builds. We have the tbpl patches live. It's not been a very productive day. I hope to complete tomorrow what I had hoped to complete today.
Attachment #801806 -
Attachment is obsolete: true
Assignee | ||
Comment 153•10 years ago
|
||
I also split the reftest chunks into 10. At least the 4 that were mentioned on the buildbot-configs. https://tbpl.mozilla.org/?tree=Ash&jobname=Android 4.2 x86&rev=1c67140bc6a3 - The second set of jobs will be using the build from m-i (as per comment 149)
Assignee | ||
Comment 154•10 years ago
|
||
Status summary:
* m-[1-8] are running green [1][2]
* xpcshell is running green [3]
* mochitest-gl is running but it fails [3]
* robocop-{1,2} are running but 2 tests fail [3]
* reftests should be running before the end of the day
[1] https://tbpl.mozilla.org/php/getParsedLog.php?id=27700133&tree=Ash&full=1
[2] https://tbpl.mozilla.org/php/getParsedLog.php?id=27700349&tree=Ash&full=1
[3] https://tbpl.mozilla.org/php/getParsedLog.php?id=27705576&tree=Ash&full=1
https://tbpl.mozilla.org/?tree=Ash&jobname=Android%204.2%20x86&rev=1c67140bc6a3
Assignee | ||
Comment 155•10 years ago
|
||
This is very close. I need to see set 3 & 5 not timing out and that should be mainly it.
Attachment #802585 -
Attachment is obsolete: true
Attachment #803294 -
Flags: review?(aki)
Attachment #803294 -
Flags: feedback?(gbrown)
Assignee | ||
Comment 156•10 years ago
|
||
These are the branches where it would be enabled:
Android 4.2 x86 Emulator alder opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator ash opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator b2g-inbound opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator birch opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator build-system opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator cedar opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator cypress opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator elm opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator fig opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator fx-team opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator graphics opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator gum opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator holly opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator ionmonkey opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator jamun opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator larch opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator maple opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator mozilla-central opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator mozilla-inbound opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator oak opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator pine opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator profiling opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator services-central opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator try opt test androidx86-set-1 ScriptFactory
Android 4.2 x86 Emulator ux opt test androidx86-set-1 ScriptFactory
Attachment #803295 -
Flags: review?(aki)
Comment 157•10 years ago
|
||
Comment on attachment 803295 [details] [diff] [review] androidx86.configs.diff Why TEMP ?
Attachment #803295 -
Flags: review?(aki) → review+
Assignee | ||
Comment 158•10 years ago
|
||
gbrown, could you please have a look at these reftests, crashtest and jsreftests results? https://tbpl.mozilla.org/php/getParsedLog.php?id=27726116&tree=Ash&full=1 https://tbpl.mozilla.org/php/getParsedLog.php?id=27726080&tree=Ash&full=1
Flags: needinfo?(gbrown)
Comment 159•10 years ago
|
||
Comment on attachment 803294 [details] [diff] [review] androidx86.mozharness.diff
Good job! And I'm really happy to see the pyflakes warnings go away too.
> def setup_avds(self):
>     '''
>     We have deployed through Puppet tar ball with the pristine templates.
>-    If they have not been untarred before we go ahead and do so.
>+    Let's unpack them every time.
>     '''
>-    if not os.path.exists(os.path.join(self.config[".avds_dir"], "test-x86-1.avd")):
>-        avds_path = self.config["avds_path"]
>-        self.mkdir_p(self.config[".avds_dir"])
>-        self.unpack(avds_path, self.config[".avds_dir"])
>+    if os.path.exists(os.path.join(self.config[".avds_dir"], "test-x86-1.avd")):
>+        shutil.rmtree(self.config[".avds_dir"])
self.rmtree?
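The "unpack them every time" approach in the quoted hunk can be sketched with the standard library alone. This is an illustration under assumptions, not the mozharness code: `reset_avds` is a hypothetical name, it checks for the directory rather than for test-x86-1.avd, and it assumes the pristine templates ship as a tar archive.

```python
import os
import shutil
import tarfile


def reset_avds(avds_tarball, avds_dir):
    """Replace avds_dir with a fresh copy of the pristine AVD templates."""
    # Drop whatever a previous job left behind so every run starts clean.
    if os.path.exists(avds_dir):
        shutil.rmtree(avds_dir)
    os.makedirs(avds_dir)
    # Unpack the pristine templates (assumed to be a tar archive).
    with tarfile.open(avds_tarball) as tar:
        tar.extractall(avds_dir)
```

Always starting from the pristine templates trades a little setup time for not inheriting emulator state from an earlier job.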
Attachment #803294 -
Flags: review?(aki) → review+
Assignee | ||
Comment 160•10 years ago
|
||
Comment on attachment 803294 [details] [diff] [review] androidx86.mozharness.diff Landed so we can see nicer results on Cedar. Any follow-up feedback or fixes will come in a new patch to address them separately.
Attachment #803294 -
Flags: checked-in+
Assignee | ||
Comment 161•10 years ago
|
||
(In reply to Aki Sasaki [:aki] from comment #157)
> Comment on attachment 803295 [details] [diff] [review]
> androidx86.configs.diff
>
> Why TEMP ?
I can leave it as it was to match the naming of other dictionaries in the file. I have a pet peeve about naming things very similar to other variables.
Summary:
> * m-[1-8] are running green [1][2]
> * xpcshell is running green [3]
> * mochitest-gl is running but it fails [3]
** gbrown to look into it
> * robocop-{1,2} are running but 2 tests fail [3]
* reftests, crashtest and jsreftests can run but FAIL [4][5]
** gbrown to look into it
* once mozharness is merged to production, we will be able to see the x86 jobs run as I mention in this summary
* I would like to wait until Cedar is green before we enable them across the board
** blassey, does that work for you? (Or add voluntold to go to every tree to hide/show jobs as they green out)
> [1] https://tbpl.mozilla.org/php/getParsedLog.php?id=27700133&tree=Ash&full=1
> [2] https://tbpl.mozilla.org/php/getParsedLog.php?id=27700349&tree=Ash&full=1
> [3] https://tbpl.mozilla.org/php/getParsedLog.php?id=27705576&tree=Ash&full=1
> [4] https://tbpl.mozilla.org/php/getParsedLog.php?id=27726116&tree=Ash&full=1
[5] https://tbpl.mozilla.org/php/getParsedLog.php?id=27726080&tree=Ash&full=1
Flags: needinfo?(blassey.bugs)
Whiteboard: [reit-x86] → [reit-x86] summary in comment 161
Comment 162•10 years ago
|
||
voluntold?? do you have any gut feeling for how long it'll take to green cedar up?
Flags: needinfo?(blassey.bugs)
Assignee | ||
Comment 163•10 years ago
|
||
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #162)
> voluntold??
>
Volunteer + told :P nvm. ignore me :)
> do you have any gut feeling for how long it'll take to green cedar up?
I can only get things to the state I described in comment 161. I will work *today/tomorrow* towards getting Cedar into that state (merge mozharness to production + trigger a new set of builds).
mochitest-gl, reftests, crashtest and jsreftests are as of now out of my hands.
Assignee | ||
Comment 164•10 years ago
|
||
Merged to production and triggered new set of builds on Cedar. Results to be found in here: https://tbpl.mozilla.org/?tree=Cedar&jobname=Android 4.2 x86&rev=e6f8b77a8824
Reporter | |
Comment 165•10 years ago
|
||
Comment on attachment 803294 [details] [diff] [review] androidx86.mozharness.diff Review of attachment 803294 [details] [diff] [review]: ----------------------------------------------------------------- This looks fine. I am investigating the remaining test failures. We may need to add some chunks or introduce another x86-specific manifest, but other than that, I don't expect more harness changes.
Attachment #803294 -
Flags: feedback?(gbrown) → feedback+
Assignee | ||
Comment 166•10 years ago
|
||
For some odd reason, Cedar is not reporting the results that I expected. I will look into it while gbrown looks into the issues that I reported earlier. I hope to have some fixes landed before EOD tomorrow.
Reporter | |
Comment 167•10 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=27772974&tree=Ash&full=1#error49 has a crash dump (good) but it lacks symbols (no file names, line numbers -- bad). I am not sure where this is going wrong. I think the harness should be invoking "minidump_stackwalk <dmp file> <symbols dir>". I have a bad feeling that we are passing crashreporter-symbols.zip as <symbols dir>, instead of unpacking crashreporter-symbols.zip to a directory and passing that directory name to minidump_stackwalk.
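To make the suspicion concrete, here is a minimal sketch of unpacking the symbols archive before building the `minidump_stackwalk <dmp file> <symbols dir>` invocation described above. `symbols_dir_for` and `stackwalk_command` are hypothetical helpers for illustration; they are not the harness or mozcrash code.

```python
import os
import tempfile
import zipfile


def symbols_dir_for(symbols_path):
    """Return a directory of symbols, unpacking a zip archive if needed."""
    if os.path.isdir(symbols_path):
        return symbols_path
    if zipfile.is_zipfile(symbols_path):
        # minidump_stackwalk expects a directory, so extract the archive first.
        target = tempfile.mkdtemp(prefix="symbols-")
        with zipfile.ZipFile(symbols_path) as archive:
            archive.extractall(target)
        return target
    raise ValueError("not a symbols directory or zip file: %r" % symbols_path)


def stackwalk_command(stackwalk_binary, dump_path, symbols_path):
    # Command shape taken from the comment above: <dmp file> <symbols dir>.
    return [stackwalk_binary, dump_path, symbols_dir_for(symbols_path)]
```

Passing crashreporter-symbols.zip through `symbols_dir_for` first means the third argument is always a directory, whichever form the caller supplied.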
Assignee | ||
Comment 168•10 years ago
|
||
Somewhere down the line I failed to bring this part of my code into the review.
Attachment #804437 -
Flags: review?(aki)
Comment 169•10 years ago
|
||
Comment on attachment 804437 [details] [diff] [review] add def worst_tbpl_status() Hm, I use self.tbpl_status = self.worst_level(TBPL_WARNING, self.tbpl_status, levels=TBPL_WORST_LEVEL_TUPLE) http://hg.mozilla.org/build/mozharness/file/a660ae1a633f/mozharness/mozilla/testing/unittest.py#l135 This seems to be dup code? I'm fine with having this method, but maybe it should call self.worst_level().
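The suggestion amounts to a thin wrapper around a generic "pick the worst of two levels" helper. A standalone sketch follows; the status names, their ordering, and both function names are assumptions made for illustration and are not copied from mozharness.

```python
# Ordered from worst to best. The names and the order are assumptions
# for this sketch, not the real TBPL_WORST_LEVEL_TUPLE.
WORST_TO_BEST = ("RETRY", "EXCEPTION", "FAILURE", "WARNING", "SUCCESS")


def pick_worst_level(target, existing, levels=WORST_TO_BEST):
    """Return whichever of `target` and `existing` appears first in `levels`."""
    for level in levels:
        if level in (target, existing):
            return level
    raise ValueError("unknown levels: %r, %r" % (target, existing))


def worst_tbpl_status(current_status, new_status):
    # Delegates to the generic helper instead of duplicating the comparison.
    return pick_worst_level(new_status, current_status)
```

A caller would fold each suite's result into a running status, for example `status = worst_tbpl_status(status, "WARNING")`, so a later success never masks an earlier failure.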
Comment 170•10 years ago
|
||
Comment on attachment 804437 [details] [diff] [review] add def worst_tbpl_status() Minusing for now, due to comment 169.
Attachment #804437 -
Flags: review?(aki) → review-
Assignee | ||
Comment 171•10 years ago
|
||
Running on Ash. I will ask for review when I see them complete.
Attachment #804437 -
Attachment is obsolete: true
Assignee | ||
Comment 172•10 years ago
|
||
I asked on IRC, but I'm also asking here in case our day is over and you can't get back to me through IRC.
I'm in the process of having Cedar match comment 161. Probably before EOD.
gbrown, what would you prefer me to help with?
* bug 915870 - make sure that our funky builder naming works with trychooser
* help with the minidump issue - comment 167
On another note, do we need to split reftests even more (currently running 10 chunks), or bump a timeout inside of the *test* harness (not mozharness)?
Reporter | |
Comment 173•10 years ago
|
||
(In reply to Armen Zambrano [:armenzg] (Release Engineering) (EDT/UTC-4) from comment #172) > gbrown, what would you prefer me to help with? > * bug 915870 - make sure that our funky builder naming works with trychooser > * help with the minidump issue - comment 167 The minidump issue please. > On another note, do we need to split reftests even more? (currently running > 10 chunks) or bump a timeout inside of the *test* harness? (not mozharness) I think we will need to split reftests more, but I'm not sure how much. We are also considering running with skia-gl disabled -- some discussion in bug 907351 -- so I am running some special tests to see how much of a difference that makes. I'll try to get back to you with a recommendation for # chunks by Monday morning.
Flags: needinfo?(gbrown)
Assignee | ||
Comment 174•10 years ago
|
||
Comment on attachment 804623 [details] [diff] [review] tbpl's worst status - v2 It works!
Attachment #804623 -
Flags: review?(aki)
Assignee | ||
Comment 175•10 years ago
|
||
This is wip. The theory is that you can pass to --symbols-path either a path or a URL and the test harnesses take care of it. I see that we have "download-symbols" set to be "ondemand" > 'download_symbols': 'ondemand' which causes us to set self.symbols_path to be self.symbols_url http://hg.mozilla.org/build/mozharness/file/production/mozharness/mozilla/testing/testbase.py#l250 Let's see if we get a crash in https://tbpl.mozilla.org/?tree=Ash&jobname=Android%204.2%20x86&rev=82db508f2304
Updated•10 years ago
|
Attachment #804623 -
Flags: review?(aki) → review+
Assignee | ||
Comment 176•10 years ago
|
||
Comment on attachment 804623 [details] [diff] [review] tbpl's worst status - v2 https://hg.mozilla.org/build/mozharness/rev/9ef0a3e99b55 I will re-trigger the jobs in here and we should see what I mentioned on comment 161 (in an hour from now): https://tbpl.mozilla.org/?tree=Cedar&jobname=Android%204.2%20x86&rev=e6f8b77a8824
Attachment #804623 -
Flags: checked-in+
Assignee | ||
Comment 177•10 years ago
|
||
I see the symbols-path set to a URL but I don't see that the output is any different. gbrown, is there anything else left to be set before this should work? https://tbpl.mozilla.org/php/getParsedLog.php?id=27848103&tree=Ash&full=1#error0 I do see this though: 14:21:49 INFO - mozcrash INFO | Downloading symbols from: http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/ash-android-x86/1379011131/fennec-26.0a1.en-US.android-i386.crashreporter-symbols.zip http://mxr.mozilla.org/mozilla-central/source/testing/mozbase/mozcrash/mozcrash/mozcrash.py#73 I will look at mozcrash's code on Monday.
Flags: needinfo?(gbrown)
Assignee | ||
Comment 178•10 years ago
|
||
Cedar is looking good :) Comment 161 *might* need refreshing on Monday since I can already see that mochitest-3 is crashing when it didn't use to.
Reporter | |
Comment 179•10 years ago
|
||
(In reply to Armen Zambrano [:armenzg] (Release Engineering) (EDT/UTC-4) from comment #177) > I see the symbols-path set to a URL but I don't see that the output is any > different. That looks like it should work. :ted -- Can you see what is going wrong here? As shown in Comment 177, mozcrash reports that it downloads symbols, but I do not see symbols in the resulting crash dump.
Flags: needinfo?(gbrown) → needinfo?(ted)
Reporter | |
Comment 180•10 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #173) > (In reply to Armen Zambrano [:armenzg] (Release Engineering) (EDT/UTC-4) > > On another note, do we need to split reftests even more? (currently running > > 10 chunks) or bump a timeout inside of the *test* harness? (not mozharness) > > I think we will need to split reftests more, but I'm not sure how much. We > are also considering running with skia-gl disabled -- some discussion in bug > 907351 -- so I am running some special tests to see how much of a difference > that makes. I'll try to get back to you with a recommendation for # chunks > by Monday morning. Try: crashtests -- 3 chunks js-reftests -- 6 chunks The plain-reftest situation is pretty bad. I think we need 20+ chunks for green runs -- we may want to wait for progress on bug 916657.
Assignee | ||
Comment 181•10 years ago
|
||
Attachment #805281 -
Flags: review?(gbrown)
Assignee | ||
Comment 182•10 years ago
|
||
Attachment #805282 -
Flags: review?(aki)
Assignee | ||
Comment 183•10 years ago
|
||
Carrying forward the part of the patch which adds the tests across the board. For now, this patch is on hold until we get everything green on Cedar.
Attachment #803295 -
Attachment is obsolete: true
Attachment #805288 -
Flags: review+
Assignee | ||
Comment 184•10 years ago
|
||
Summary (these results are from Cedar):
Running well:
* m-1, m-2 and m-4 are green; m-3 crashes [1]
* m-[4-8] are green [2]
* xpcshell is running green [4]
Suites needing attention:
* mochitest-gl is crashing [4]
** gbrown to look into it
* robocop-{1,2} are running but 2 tests fail [4]
* reftests, crashtest and jsreftests can run but FAIL/timeout [3][5]
** more crashtest and jsreftest chunking will happen today/tomorrow
** reftests will need more investigation - bug 916657
Others:
* minidumps are fixed
* crash symbols are not giving source code lines
** waiting on "needinfo" for ted
* adjust trychooser to handle x86 "sets" of suites approach
** bug 915870
[1] https://tbpl.mozilla.org/php/getParsedLog.php?id=27851558&tree=Cedar&full=1
[2] https://tbpl.mozilla.org/php/getParsedLog.php?id=27848601&tree=Cedar&full=1
[3] https://tbpl.mozilla.org/php/getParsedLog.php?id=27850566&tree=Cedar&full=1
[4] https://tbpl.mozilla.org/php/getParsedLog.php?id=27849762&tree=Cedar&full=1
[5] https://tbpl.mozilla.org/php/getParsedLog.php?id=27850902&tree=Cedar&full=1
Assignee | ||
Updated•10 years ago
|
Whiteboard: [reit-x86] summary in comment 161 → [reit-x86] summary in comment 184
Assignee | ||
Comment 185•10 years ago
|
||
I know that I've said that we should wait until everything is green on Cedar, however, after seeing the bug filed for reftests I wonder if we should go out the door with whatever is green and leave reftests only running on Cedar and Ash. What do you think?
Updated•10 years ago
|
Attachment #805282 -
Flags: review?(aki) → review+
Comment 186•10 years ago
|
||
This isn't a symbols issue, this dump looks completely broken. If you can grab one of these minidumps off of a slave and attach it here or somewhere else we can poke at it.
Flags: needinfo?(ted)
Reporter | |
Updated•10 years ago
|
Attachment #805281 -
Flags: review?(gbrown) → review+
Reporter | |
Comment 187•10 years ago
|
||
(In reply to Armen Zambrano [:armenzg] (Release Engineering) (EDT/UTC-4) from comment #185) > I know that I've said that we should wait until everything is green on > Cedar, however, after seeing the bug filed for reftests I wonder if we > should go out the door with whatever is green and leave reftests only > running on Cedar and Ash. I would prefer that. If the reftest issue takes a while to sort out, we risk losing today's greens over time.
Assignee | ||
Comment 188•10 years ago
|
||
Attachment #805282 -
Attachment is obsolete: true
Attachment #805288 -
Attachment is obsolete: true
Attachment #805514 -
Flags: review?(aki)
Assignee | ||
Comment 189•10 years ago
|
||
Comment on attachment 805281 [details] [diff] [review] chunk jsreftest into 6 chunks and crashtest into 3 https://hg.mozilla.org/build/mozharness/rev/5eca80d07e33
Attachment #805281 -
Flags: checked-in+
Assignee | ||
Updated•10 years ago
|
Attachment #804696 -
Flags: review?(aki)
Updated•10 years ago
|
Attachment #804696 -
Flags: review?(aki) → review+
Comment 190•10 years ago
|
||
Comment on attachment 805514 [details] [diff] [review] enable sets 1 & 2 - move failing suites to sets 3 to 8 on Ash and Cedar Thanks for fixing the dict spacing! That was bugging me.
Attachment #805514 -
Flags: review?(aki) → review+
Assignee | ||
Comment 191•10 years ago
|
||
Comment on attachment 805514 [details] [diff] [review] enable sets 1 & 2 - move failing suites to sets 3 to 8 on Ash and Cedar https://hg.mozilla.org/build/buildbot-configs/rev/e5177c27ce46
Attachment #805514 -
Flags: checked-in+
Assignee | ||
Comment 192•10 years ago
|
||
Comment on attachment 804696 [details] [diff] [review] androidx86.minidump.diff https://hg.mozilla.org/build/mozharness/rev/6ca289a39407
Attachment #804696 -
Flags: checked-in+
Assignee | ||
Comment 193•10 years ago
|
||
Summary (these results are from Cedar):
Coming up:
* we're enabling sets 1 and 2 across the board (whenever we have a reconfig)
** m-{1,2,4,5,6,7,8} and xpcshell (not m-3)
* the remaining suites will run on Cedar and Ash
** as suites get fixed on Cedar we will move them to all the other branches
* landed a fix to download symbols correctly
Running well:
* m-1, m-2 and m-4 are green; m-3 crashes [1]
* m-[4-8] are green [2]
* xpcshell is running green [4]
Suites needing attention:
* mochitest-gl is crashing [4]
** gbrown to look into it
* robocop-{1,2} are running but 2 tests fail [4]
* reftests, crashtest and jsreftests can run but FAIL/timeout [3][5]
** more crashtest and jsreftest chunking will happen today/tomorrow
** reftests will need more investigation - bug 916657
Others:
* crash symbols are not giving source code lines
** ted and gbrown investigating in bug 916923
* bug 915870 - adjust trychooser to handle x86 "sets" of suites approach
[1] https://tbpl.mozilla.org/php/getParsedLog.php?id=27851558&tree=Cedar&full=1
[2] https://tbpl.mozilla.org/php/getParsedLog.php?id=27848601&tree=Cedar&full=1
[3] https://tbpl.mozilla.org/php/getParsedLog.php?id=27850566&tree=Cedar&full=1
[4] https://tbpl.mozilla.org/php/getParsedLog.php?id=27849762&tree=Cedar&full=1
[5] https://tbpl.mozilla.org/php/getParsedLog.php?id=27850902&tree=Cedar&full=1
Whiteboard: [reit-x86] summary in comment 184 → [reit-x86] summary in comment 193
Comment 194•10 years ago
|
||
This should be in production.
Assignee | ||
Comment 195•10 years ago
|
||
We now have sets 1 & 2 running everywhere: https://tbpl.mozilla.org/?jobname=Android%204.2%20x86&rev=1d27c4c9871f It seems like splitting crashtest into 3 chunks made them run green. Should we get those out to other branches? The same with jsreftests, splitting them into 6 made them run green (except jsreftest-5). Would you want to fix #5 before moving them to other branches? robocop-2 is not failing any tests. robocop-1 is only failing 2 tests.
Comment 196•10 years ago
|
||
We don't have proper crash stacks (bug ), e.g.:
https://tbpl.mozilla.org/php/getParsedLog.php?id=27973359&tree=B2g-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=27973163&tree=Mozilla-Inbound
So Android x86 tests are hidden everywhere for now.
Comment 197•10 years ago
|
||
(In reply to Ed Morley [:edmorley UTC+1] from comment #196) > We don't have proper crash stacks (bug ) Bug 916923
Reporter | |
Comment 198•10 years ago
|
||
(In reply to Armen Zambrano [:armenzg] (Release Engineering) (EDT/UTC-4) from comment #195) > We now have sets 1 & 2 running everywhere: Awesome! > It seems like splitting crashtest into 3 chunks made them run green. > Should we get those out to other branches? I think so. > The same with jsreftests, splitting them into 6 made them run green (except > jsreftest-5). > Would you want to fix #5 before moving them to other branches? I think so -- I'll be looking at jsreftest-5 more closely today. > robocop-2 is not failing any tests. robocop-1 is only failing 2 tests. There are some non-x86 robocop patches landing today that may help. There is a new patch landing in bug 913627 which I hope will green up M3. I hope to have a patch up for review in bug 917053 today to fix M-gl.
Assignee | ||
Comment 199•10 years ago
|
||
It's rather cumbersome to add green test suites to tbpl if they have to be hidden right away (due to tbpl's per-branch nature as well as waiting for them to be scheduled first).
Could we try fixing them on Cedar for this week and see what is ready for next week?
Could we use bug 891959 for further status updates as well as adding more test suites to tbpl?
FYI, my biggest focus will be bug 915870. I will not be able to look at bug 891959 before next week. Is there anyone besides gbrown who would be interested in giving a hand with it?
FTR, I need a day or two to meet some Summit Preparation deadlines that are happening this week.
Flags: needinfo?(gbrown)
Flags: needinfo?(blassey.bugs)
Assignee | ||
Comment 200•10 years ago
|
||
(In reply to Armen Zambrano [:armenzg] (Release Engineering) (EDT/UTC-4) from comment #199) > FYI, my biggest focus will be bug 915870. > I will not be able to look at bug 891959 before next week. Is there anyone > besides gbrown that would be interested to give a hand with it? > I meant bug 917361 (make it easy for a dev to run the Android x86 test jobs).
Reporter | |
Comment 201•10 years ago
|
||
(In reply to Armen Zambrano [:armenzg] (Release Engineering) (EDT/UTC-4) from comment #199) > Could we try fixing them on Cedar for this week and see what is ready for > next week? That seems reasonable. With patches on the go, I expect we can turn everything green on Cedar this week, except for plain reftests. > Could we use bug 891959 for further status updates as well as adding more > tests suites to tbpl? OK.
Flags: needinfo?(gbrown)
Assignee | ||
Comment 203•10 years ago
|
||
I have triggered a new set of builds on Cedar where some changes that gbrown landed on m-i will be integrated. A new summary will be given in bug 891959 once those builds complete. This week the focus will be on bug 915870 for the try support. Next week we will add whatever we green out this week.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Updated•5 years ago
|
Component: General Automation → General