Closed Bug 492589 Opened 15 years ago Closed 15 years ago

manually run unittest on two old non-SSE2 boxes

Categories

(Release Engineering :: General, defect)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ted, Assigned: ted)

References

Details

(Keywords: fixed1.9.1)

Attachments

(4 files)

There were some non-SSE VMs created in bug 462190. We'd like to get unittests running on them regularly, but as a stopgap I'm going to try one-off unittest runs on them.
Flags: blocking1.9.1+
Status: NEW → ASSIGNED
I'd like to get a sanity-check that these VMs do in fact have SSE disabled, but I need help. /proc/cpuinfo is not inspiring confidence, certainly:

[cltbld@moz2-linuxnonsse-slave01 builds]$ grep sse /proc/cpuinfo
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss constant_tsc up
Mrz might know for sure re: comment#1
Yeah, I really don't believe these VMs are non-SSE. test.c:
#include <stdio.h>

int main(int argc, char **argv)
{
/* Query CPUID leaf `func`; the feature flags come back in ECX/EDX. */
#define cpuid(func,ax,bx,cx,dx)\
    __asm__ __volatile__ ("cpuid":\
    "=a" (ax), "=b" (bx), "=c" (cx), "=d" (dx) : "a" (func));

    int a, b, c, d;
    cpuid(0x1, a, b, c, d);

    /* Leaf 1: EDX bit 25 = SSE, EDX bit 26 = SSE2, ECX bit 0 = SSE3. */
    if (d & (1 << 25)) { printf("sse enabled\n"); }
    if (d & (1 << 26)) { printf("sse2 enabled\n"); }
    if (c & (1 << 0))  { printf("sse3 enabled\n"); }

    return 0;
}
[cltbld@moz2-linuxnonsse-slave01 builds]$ gcc -o testsse test.c
[cltbld@moz2-linuxnonsse-slave01 builds]$ ./testsse
sse enabled
sse2 enabled
sse3 enabled
You should be able to check the VM's config - Phong, is that right?
I don't think this will work at all, per VMware (quoted here):
http://www.novosco.com/articles/2008/08/19/vmware-esx-and-enhanced-vmotion-compatibility/

I don't know if we're using Enhanced VMotion Compatibility or not, but if not:
        * SSE features can be used by user-level code (applications).
        * Mask does not work for user-level code (i.e. applications).
        * In user-level code, CPUID is executed directly on hardware and is not intercepted by VMware.
        * Thus, VM cannot reliably hide SSE from an application

Even if we are:
EVC utilizes hardware support to modify the semantics of the CPUID instruction only. It does not disable the feature itself. For example, if an attempt to disable SSE4.1 is made by applying the appropriate masks to a CPU that has these features, this feature bit indicates SSE4.1 is not available to the guest or the application, but the feature and the SSE4.1 instructions themselves (such as PTEST and PMULLD) are still available for use. This implies applications that do not use the CPUID instruction to determine the list of supported features, but use try/catch of undefined instructions (#UD) instead, can still detect the existence of this feature.

This won't let us test what we're trying to test.
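
For illustration, a minimal sketch (not from the bug) of the #UD-based probing the quoted article describes: instead of trusting CPUID, execute an SSE2 instruction and catch the resulting SIGILL. On a VM that only masks CPUID, the instruction still executes, which is why masking alone can't simulate a non-SSE2 machine. Assumes x86 Linux with GCC.

#include <setjmp.h>
#include <signal.h>
#include <stdio.h>

static sigjmp_buf probe_env;

static void on_sigill(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);   /* jump back out of the faulting instruction */
}

int main(void)
{
    signal(SIGILL, on_sigill);
    if (sigsetjmp(probe_env, 1) == 0) {
        /* paddq on xmm registers is an SSE2 instruction; on a CPU that
           genuinely lacks SSE2 it raises #UD, delivered by Linux as SIGILL. */
        __asm__ __volatile__ ("paddq %xmm0, %xmm0");
        printf("sse2 instructions execute\n");
    } else {
        printf("sse2 instructions trap (SIGILL)\n");
    }
    return 0;
}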
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → WONTFIX
I'm trolling for community help; there's probably someone out there with older hardware that we can get to do this:
http://forums.mozillazine.org/viewtopic.php?f=23&t=1247655
http://www.nongnu.org/qemu/qemu-tech.html#SEC3

QEMU does not have SSE support. It also loads VMDK images, if I am not mistaken, and it runs on Windows, OS X, and Linux.
Found an old P3 machine, Egg; going to do a unit test run. Maybe I need to find a spare disk that I can format.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Assignee: ted.mielczarek → jford
Status: REOPENED → ASSIGNED
Summary: try a one-off unittest run on non-SSE VMs → try a one-off unittest run on some random old box that reed found
Summary: try a one-off unittest run on some random old box that reed found → try a one-off unittest run on some random old box that reed found (non-SSE)
I have two old HP servers (BTek, Spider) that I have been given the OK to use by Reed. I am just about done installing Ubuntu on one, and I am waiting on an XP license for the other.
(In reply to comment #8)
> found an old P3 machine Egg,  going to do a unit test run.  Maybe I need to
> find a spare disk that i can format

Actually, there were 3 machines: btek, spider and egg. 

egg turned out to be way older, so we cannibalized parts from egg to increase
the RAM and replace the useless video card in btek.

btek now has Ubuntu 9.04 installed, with a cltbld account on it.
However, it still needs network configuration, DNS configs, etc.

spider now has WinXP installed, with a license key and a cltbld account on it.
It also needs network configuration, DNS configs, etc. Both machines will also
need VNC (or RDP) installed for Ted to be able to remotely connect and use them
for running tests.


Per discussion with shaver and damons this morning about other priorities,
these machines are being handed back to IT to finish the OS setup. Once both
are ready, please reassign back, so Ted can try a manual unittest run on them.
(In reply to comment #3)
> Yeah, I really don't believe these VMs are non-SSE. test.c:
> [snip test.c and its output]

Also, jhford ran ted's diagnostic program on btek and only got "sse enabled", as expected: these machines are dual P3 CPUs running at 500MHz and do not have SSE2 or SSE3.
Summary: try a one-off unittest run on some random old box that reed found (non-SSE) → manually run unittest on two old non-SSE2 boxes
Assignee: jford → server-ops
Component: Release Engineering → Server Operations
Flags: blocking1.9.1+
OS: Linux → All
QA Contact: release → mrz
Hardware: x86 → All
> 
> Per discussion with shaver and damons this morning about other priorities,
> these machines are being handed back to IT to finish the o.s. setup. Once both
> are ready, please reassign back, so Ted can try a manual unittest run on them.

I know we talked about this on the phone, but what IT steps are left? Are the boxes up and running?
(In reply to comment #12)
> > 
> > Per discussion with shaver and damons this morning about other priorities,
> > these machines are being handed back to IT to finish the o.s. setup. Once both
> > are ready, please reassign back, so Ted can try a manual unittest run on them.
> 
> i know we talked about this on the phone but what IT steps are left?  Boxes are
> up and running?

The boxes are now reassembled and at reed's desk. They need to be racked somewhere (downstairs in K?), and then also need the following from comment #10:

"...network configuration, DNS configs, etc. These will also need
VNC (or RDP) ..."
Reed gets this because they're sitting next to his desk :)
Assignee: server-ops → reed
spider is racked and cabled... 5/19 on the switch just needs its VLAN changed from 200 to 500, and it'll be ready to go. I just turned RDP on for now. If you need VNC, you're welcome to install it. It'll be accessible at spider.office.mozilla.org within the MV Office VPN once the VLAN has been changed and the networking restarted.

btek, on the other hand, is dead. When it was plugged in, its power supply instantly died and made smelly smoke, as it was set for 115V instead of 230V. We can either try to replace the power supply or just get another box. Thoughts?
I'd go with whatever you think is fastest.
I've got a mochitest run started on spider (WinXP). I downloaded the latest 1.9.1 unittest build that was available, which was this one:
http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-1.9.1-win32-unittest/1242749541/
btek is broken.  I have balsa here and I can move the good hardware from btek into balsa when there is some spare time.
I have rebuilt balsa's hardware to have 2x 500MHz P3 CPUs that are identical to the ones in spider. I have also installed the SCSI card and drives from btek, but it isn't booting properly and the hard drives are not being picked up by the SCSI BIOS. If the SCSI card cannot be coerced into working, there are some ATA drives left over from egg which can be used, but they would require a reinstall of Linux. I have the rebuilt balsa and the remnants of Egg and Btek by my desk. What do I do with them? Egg is totally broken, but btek could be useful for spares.
The mochitest run on spider finished without crashing. I'll run through the rest of the test suites today.
I ran through all of our test suites (mochitest, mochitest chrome, mochitest browser-chrome, mochitest a11y, reftest, crashtest, xpcshell tests) on spider. There were some test failures (that I didn't look into very deeply, but most look like the same kind of intermittent failures as on tinderbox), but no crashes.
I have rebuilt balsa and it does not work at all. The options I can think of are running a dual boot on spider, which effectively makes automation impossible, or finding new hardware.
Assignee: reed → nobody
Component: Server Operations → Release Engineering
QA Contact: mrz → release
Moving to releng.
Flags: blocking1.9.1+
And marking blocking1.9.1+. We need to run this after all JS bugs are in, before the first RC, and before each subsequent RC.
(In reply to comment #21)
> I ran through all of our test suites (mochitest, mochitest chrome, mochitest
> browser-chrome, mochitest a11y, reftest, crashtest, xpcshell tests) on spider.
> There were some test failures (that I didn't look into very deeply, but most
> look like the same kind of intermittent failures as on tinderbox), but no
> crashes.

We need to look at the test failures: only one failure mode (generating SSE2 code on a non-SSE2 machine, and calling it) will result in a SIGILL crash.  We also need to know that the x87/non-SSE2 code that we generate is correct!
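
For context, a minimal sketch (hypothetical, not Mozilla's actual JIT code) of the kind of runtime dispatch being tested here: detect SSE2 once via CPUID and route to either an SSE2 path or an x87 fallback. Taking the SSE2 path on a non-SSE2 CPU is the SIGILL failure mode described above; taking the x87 path but generating wrong x87 code is the silent-miscompilation case.

#include <stdbool.h>
#include <stdio.h>

/* CPUID leaf 1, EDX bit 26 = SSE2 (same check as the test.c above). */
static bool cpu_has_sse2(void)
{
    unsigned int a, b, c, d;
    __asm__ __volatile__("cpuid"
                         : "=a"(a), "=b"(b), "=c"(c), "=d"(d)
                         : "a"(1));
    return (d >> 26) & 1;
}

/* Stand-ins for the two code paths; real generated code would use
   cvtsi2sd (SSE2) vs. fild/fstp (x87) here. */
static double convert_sse2(int x) { return (double)x; }
static double convert_x87(int x)  { return (double)x; }

int main(void)
{
    double (*convert)(int) = cpu_has_sse2() ? convert_sse2 : convert_x87;
    printf("sse2=%d, convert(42)=%f\n", cpu_has_sse2(), convert(42));
    return 0;
}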
Ok, I'll collate and attach them to the bug in a bit.
Assignee: nobody → ted.mielczarek
First, the simple:
crashtest, mochitest-a11y, xpcshell: 0 failures
mochitest-chrome: 1 failure
mochitest-browser-chrome: 6 failures
2 of these are because bug 475383 hasn't landed on branch. The others may just be fallout from that failure; I didn't investigate fully.
reftest had a bunch of failures, but then I noticed that the first one was colordepth.html and realized that my RDP connection was using 16-bit color, so those failures are probably all a result of that.
Attached file mochitest failures
mochitest failures: 21
9 of these are known: 2 are from bug 475383 again, 7 are from the geolocation tests (bug 489817). I didn't investigate the rest.
I'll fire off another run today as well (on the same build).
What happened with that run, please?
Do we have updated info here?
Sorry, lost track of this over the weekend. Summary from the second run:

mochitest-plain: somewhat different results, will attach log in a minute
mochitest-chrome: exact same result as previous run
mochitest-browser-chrome: one additional failure:
TEST-UNEXPECTED-FAIL | chrome://mochikit/content/browser/browser/components/preferences/tests/browser_privacypane_1.js | Timed out

Anything not mentioned still had zero failures.
Ok, there are 23 failures in this log, of which 9 are known (as before, the plugin tests and geolocation tests).
So there are 14 bugs to file, I guess. :-(

If we're seeing consistent fails on mochitest-chrome, doesn't that mean that they're probably not just the usual sometimes-orange randoms?
The mochitest-chrome failure was a known random that didn't get a fix backported to branch, bug 468189. (Although interestingly on this machine it sure seems repeatable!)
I think the browser-chrome failures are all fallout from the plugin test failing. It opens a tab, and then doesn't clean it up if it doesn't finish successfully. We should file a bug on making that test clean up after itself better.
In the mochitest failures, I looked at:
31268 ERROR TEST-UNEXPECTED-FAIL | /tests/dom/tests/mochitest/ajax/offline/test_fallback.html | Fallback page displayed for top level document
I think this test is broken; it has a 3-second timeout internally:
http://mxr.mozilla.org/mozilla-central/source/dom/tests/mochitest/ajax/offline/test_fallback.html?force=1#71

This machine is *really* slow, so it wouldn't surprise me if we hit that.
(In reply to comment #39)
> I think the browser-chrome failures are all fallout from the plugin test
> failing. It opens a tab, and then doesn't clean it up if it doesn't finish
> successfully. Should file a bug on making that test cleanup after itself
> better.

I re-ran browser-chrome with the plugin test moved out of the way, and got just one failure:
TEST-UNEXPECTED-FAIL | chrome://mochikit/content/browser/browser/components/places/tests/perf/browser_ui_history_sidebar.js | Timed out

Suspiciously, this is in a "tests/perf" directory, and it looks like the test does a lot of work. The browser-chrome harness has a 30-second timeout, so it seems likely that this test just can't finish in time.
How many times have we looped through these test runs so far?
Just two runs through the full test suite, on the same build (mentioned in comment 17). Happy to do more runs, or on a newer build, whatever floats your boat.
Ted, be ready to run these on notice. I'm guessing we'll want to run this before we ship the RC.
Will do, I was planning on grabbing a build from this morning and giving it another run.
Might be time to run this again?
Yeah, can use the b99 builds when they're out.
I re-ran this on a build from Thursday(?) and got extremely similar results, although I didn't finish the analysis. I think this box is currently MIA due to the office move, so hopefully someone can plug it back in on Monday.
(In reply to comment #48)
> I think this box is currently MIA due to the
> office move, so hopefully someone can plug it back in on monday.

Both nonsse machines are AWOL. They didn't show up in the new server room, or at any of the RelEng desks in the new office. I already went back to the Building K server lab this morning, and they are not there.

I'll go back and search a few other rooms in Building K later today.
One non-SSE machine was in the server room but was powered off. It is now connected using a DHCP address of 10.250.6.227, but I am working on getting a DNS hostname for it in bug 496946.

This machine is a 500MHz P3 with 384MB of RAM. SSH is working, and I will email the username and password to Ted.
(In reply to comment #49)
> (In reply to comment #48)
> Both nonsse machines are AWOL. They didnt show up in new server room, or any of
> RelEng desks in new office. I already went back to Building K server lab this
> morning, and they are not there. 
> 
> I'll go back and search a few other rooms in Building K later today.

John Ford and I went dumpster-diving in the old Buildings K and S. We found the nonsse machine, as well as a few other nonsse and PPC machines, and brought them all back to the new office.

We should have the pre-existing nonsse machine back online sometime today, and will find out how many of the other machines work at all. Very happy with the additional nonsse and PPC machines found; quite a productive afternoon's scavenging!
I've got a Mochitest run started on the Linux machine.
Looks like we're done with all blockers for RC.  Need to run everything again?
(In reply to comment #51)
> (In reply to comment #49)
> > (In reply to comment #48)
[snip] 
> We should have the pre-existing nonsse machine back online today sometime...

Forgot to update this bug earlier. jhford got the nonsse win32 machine up and running again on Tuesday. DNS is still a bit unsettled in the new office, but these IPs work:

linux: 10.250.6.227
win32: 10.250.5.20
(In reply to comment #54)
> DNS is still a bit unsettled in new office, but these IPs
> work:
> 
> linux: 10.250.6.227
> win32: 10.250.5.20

Are there bugs on file to get these assigned static IPs?
There is one for goat, the Linux machine (bug 496946). The Windows one (spider) had a working hostname before, but I guess it was removed when it was moved to the junk pile. I can file a separate bug or expand the Linux one; either works for me.
Ted: can you run a set of unit tests on RC3 using these boxes so we can close this out?
I'm OOTO today, and traveling this weekend, so I can't get to it until Monday. If you want it sooner than that you'll have to find someone else, sorry.
Adding Joel Maher as he will be running the tests this afternoon.
I am seeing a LOT more errors in the runs I did on Linux/Windows this weekend.

For example, the Linux mochitests have 331 failures (I ran twice to verify)! Also, the Linux browser-chrome tests did not finish (verified twice), as they were hung on sessionrestore tests!


# of failures
test             linux     windows
xpcshell         0         0
reftest          3         123
crashtest        0         0
mochitest        331       13
chrome           9         0
browser-chrome   20        10
a11y             0         0
I don't believe I ever ran the unittests on that Linux box, as it didn't exist when I started this testing.

The Windows reftest results may be completely wrong, as you have to be careful to connect using 24-bit color with Remote Desktop. The mochitest/browser-chrome results look to be in line with what I saw, and were all harmless failures (tests relying on the test plugin, which is a known failure on branch packaged tests currently, or tests that are intermittent failures/timeouts on slow hardware).
Let me try the reftests again on Windows. Thanks for the data, Ted.
This is the failure log after re-running browser-chrome tests after removing:
mochitest/browser/browser/base/content/test/browser_pluginnotification.js


cltbld@SPIDER /c/ff35_unittest/mochitest
$ grep UNEXPECTED-FAIL bchrome.log
TEST-UNEXPECTED-FAIL | chrome://mochikit/content/browser/browser/components/places/tests/browser/browser_410196_paste_into_tags.js | Timed out
TEST-UNEXPECTED-FAIL | chrome://mochikit/content/browser/browser/components/places/tests/perf/browser_ui_history_sidebar.js | Timed out
TEST-UNEXPECTED-FAIL | chrome://mochikit/content/browser/browser/components/preferences/tests/browser_privacypane_1.js | Timed out
TEST-UNEXPECTED-FAIL | chrome://mochikit/content/browser/browser/components/preferences/tests/browser_privacypane_2.js | Timed out
TEST-UNEXPECTED-FAIL | chrome://mochikit/content/browser/browser/components/preferences/tests/browser_privacypane_3.js | Timed out
TEST-UNEXPECTED-FAIL | chrome://mochikit/content/browser/browser/components/preferences/tests/browser_privacypane_4.js | Timed out
TEST-UNEXPECTED-FAIL | chrome://mochikit/content/browser/browser/components/preferences/tests/browser_privacypane_5.js | Timed out
TEST-UNEXPECTED-FAIL | chrome://mochikit/content/browser/browser/components/preferences/tests/browser_privacypane_6.js | Timed out
TEST-UNEXPECTED-FAIL | chrome://mochikit/content/browser/browser/components/preferences/tests/browser_privacypane_7.js | Timed out

cltbld@SPIDER /c/ff35_unittest/mochitest
$
Can we get an assessment here of whether or not we are good to go?
I'm pretty sure we're good, based on that log.
Yeah, those are just timeouts from tests that take too long because this machine is so godawful slow. If we're going to get automated builds on this machine, we should file a bug to track the test issues we'll need to resolve to get green tests on this machine, but I don't see anything that's an actual problem with running the builds here.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
If the work here is finished, could you please mark status1.9.1 accordingly, or at least add the fixed1.9.1 keyword? I'm querying Bugzilla for unfinished 1.9.1 bugs, and this one is still marked as unfinished. Or use any other way of marking that we are done here that I can query on.
Product: mozilla.org → Release Engineering