Open Bug 1209932 Opened 9 years ago Updated 2 years ago

Disable all 32-bit linux testing

Categories

(Testing :: General, defect)

Tracking

(firefox47 fixed)

REOPENED
mozilla47
Tracking Status
firefox47 --- fixed

People

(Reporter: coop, Unassigned)

References

(Depends on 1 open bug)

Details

(Whiteboard: [capacity][linux][test])

Attachments

(7 files, 10 obsolete files)

4.80 KB, patch (jmaher: review+)
929 bytes, patch (kmoir: review+)
172.03 KB, text/plain
6.79 KB, text/plain
5.86 KB, patch (kmoir: review+)
21.99 KB, patch
457 bytes, patch (kmoir: review+)

In bug 1204920, we recently decided to re-purpose the talos-linux32 machines because the performance data we were getting from them was not sufficiently different from that on linux64.

How far should we take this? (I want to take it much farther)

How different are the correctness test results (unittests, etc) we get from the tst-linux32-spot instances when compared to the results from tst-linux64? 

Turning off 900 tst-linux32 instances would put a significant dent in our AWS bills and simplify our configs a lot.

If we turn off the correctness tests for linux32, how useful is it to keep generating the 32-bit linux builds? 

In short, I'd love to get us down to a single linux platform (64bit) for both build and test. We fought hard to get linux64 turned on a few years ago. I hope we can now turn linux32 off.
roc and ehsan were both in favor of dropping Talos support in the dev.platform thread. If we drop testing from linux32 we'd have to make it Tier-2 at the very least. If we drop builds it's effectively going to become unsupported or Tier-3 if someone from the community wants to maintain it. (Maybe that's OK.)

We are still doing 32-bit Windows builds and 32-bit Mac builds (for now), so the likelihood of actually breaking 32-bit Linux is fairly low even without builds, but if we turn them off it will happen eventually.
Can we make it Tier 2, and run them only periodically?
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #1)
> We are still doing 32-bit Windows builds and 32-bit Mac builds (for now), so
> the likelihood of actually breaking 32-bit Linux is fairly low even without
> builds, but if we turn them off it will happen eventually.

That's the crux here, really: how important are 32-bit linux builds to Mozilla, and is 32-bit coverage on Mac/Windows good enough?

kmoir was also going to look into whether we had other classes of tests running on the tst-linux32-spot machines that weren't coming directly from 32-bit linux builds. I think the tst-emulator64 platform is the only one overloaded that way, but I'm not sure.

(In reply to Chris AtLee [:catlee] from comment #2)
> Can we make it Tier 2, and run them only periodically?

That would be an acceptable compromise if we can't kill them completely.
I looked and I believe all tests running on tst-linux32-spot instances are coming from 32-bit linux builds
What consensus is needed to move forward with this?  Do we want to run the builds periodically without tests as a Tier 2 platform or remove them entirely?
FWIW, https://groups.google.com/d/msg/mozilla.dev.planning/wfHf_qPjoB0/nHtlnhrJeIsJ was what stopped us dead in our tracks the last time we wanted to do this, so rather than looking at "there's only one test which is enabled on linux32 but disabled on linux64" (true, unless there are ways of doing it that I haven't thought of), you have to look at how many tests are disabled on b2g but enabled on linux32, and get b2g management buy-in that it's okay to drop their testing proxy.
Another suggestion from the dev-platform thread was:

>>Pending more data about how many people use 32-bit fx on linux distros, we could conceivably do both (1) and turn off 32-bit *debug* testing completely, as nobody really uses those as daily builds anyway, and most any issue should get caught by the opt tests or the x64 debug ones?
Adding sheriffs; if we opt just to reduce the frequency of linux32 tests, we should get sheriff buy-in first. It won't save anything but our AWS bill, and it may introduce bisection woes that could e.g., increase the duration of tree closures.
Sheriffs, do you have any feedback on this? 

https://groups.google.com/forum/#!topic/mozilla.dev.planning/wBgLRXCTlaw
Flags: needinfo?(sheriffs)
Flags: needinfo?(sheriffs) → needinfo?(wkocher)
With automatic backfilling hopefully working properly now, I think reducing the frequency should still be okay.

Tomcat/Ryan, what do you think?
Flags: needinfo?(wkocher)
Flags: needinfo?(ryanvm)
Flags: needinfo?(cbook)
Was comment 6 ever addressed?
Flags: needinfo?(ryanvm)
Comment #6 refers to a post from 2013.  Does this still reflect the current state?  Are there still b2g tests that are missing and are verified on the linux32 builds?  Who would have this information?
http://mxr.mozilla.org/mozilla-central/search?string=skip-if.*b2g&regexp=on&find=&findi=&filter=^[^\0]*%24&hitlimit=&tree=mozilla-central is the first 1000 of the unknowable total number of tests which are skipped on b2g, some of them because they test something which is disabled or not built on b2g, some of them because the test fails on b2g though it shouldn't, some of them because the test is too intermittent on b2g, some of them because nobody is responsible for reenabling tests after the initial triage disables huge swaths.
Flags: needinfo?(cbook)
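For anyone who wants a full count rather than the first 1000 MXR hits, here is a minimal sketch that applies the same `skip-if.*b2g` pattern to a local checkout. The function name `count_b2g_skips` and the assumption that the annotations live in `*.ini` manifests are mine, not something stated in this bug, and a raw count still says nothing about why each test is skipped.

```python
import re
from pathlib import Path

# Same pattern as the MXR query above: a skip-if annotation that mentions b2g.
SKIP_IF_B2G = re.compile(r"skip-if.*b2g")


def count_b2g_skips(root):
    """Return {manifest path: matching line count} for *.ini files under root.

    Hypothetical helper: assumes skip-if annotations live in *.ini manifests.
    """
    counts = {}
    for manifest in sorted(Path(root).rglob("*.ini")):
        text = manifest.read_text(encoding="utf-8", errors="replace")
        hits = sum(1 for line in text.splitlines() if SKIP_IF_B2G.search(line))
        if hits:
            counts[str(manifest)] = hits
    return counts
```

Summing the values of the returned dictionary gives the total number of matching lines.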
Depends on: 1236835
Depends on: 1239082
Attached is the patch that can be used to disable linux32.
Attachment #8717946 - Flags: review?(kmoir)
Attached file differences (obsolete) —
Attached you can find the difference
Assignee: nobody → vlad.ciobancai
Here's a fun thing nobody thought about:

What entire test suites are run on Linux32, but are not run anywhere else?
This bug is dependent on removing the b2g desktop builds.  Earlier the issue was raised that b2g desktop builds are dependent on linux32 test results, but with them going away the thinking was that this was no longer a blocker.
It wasn't so much "b2g desktop" as "b2g the entire project," that the Linux32 tests were the best proxy they had for all the tests that weren't run on any sort of b2g build at all, but we've thrown them to the wolves anyway.

But having just redone the "(whatever-platform only)" labels on http://trychooser.pub.build.mozilla.org/ this weekend, I'm freshly aware that we actually only run Marionette-e10s on Linux32 opt, nowhere else.
(In reply to Phil Ringnalda (:philor) from comment #18)
> But having just redone the "(whatever-platform only)" labels on
> http://trychooser.pub.build.mozilla.org/ this weekend, I'm freshly aware
> that we actually only run Marionette-e10s on Linux32 opt, nowhere else.

Oh good catch!  As far as I’m aware, there’s nothing preventing us from running Mn-e10s on Linux x64.
vladC can you open a bug to run Mn-e10s on linux64 and write a patch to add the builders? No hurry (bug 1218589 is a higher priority)
:armenzg, can you add mn-e10s to taskcluster configs?
Flags: needinfo?(armenzg)
Absolutely. Would we really need to add it for Buildbot if it works on TaskCluster?
Flags: needinfo?(armenzg)
FTR, I would be enabling Mn e10s for Linux*64* *debug* jobs.

I assume Buildbot can go ahead for other platforms and architectures.
(In reply to Armen Zambrano [:armenzg] - Engineering productivity from comment #22)
> Absolutely. Would we really need to add it for Buildbot if it works on
> TaskCluster?

I have no strong feelings about this.  I’m happy with going just with TC.
(In reply to Armen Zambrano [:armenzg] - Engineering productivity from comment #24)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=a8ae58d5bb87

From what I can see, your test ran successfully. Do you want us to add MARIONETTE_E10S for linux64?
Should I be concerned that the full log of that Try test run doesn't contain a single e10s reference?
(In reply to Vlad Ciobancai [:vladC] from comment #26)
> (In reply to Armen Zambrano [:armenzg] - Engineering productivity from
> comment #24)
> > https://treeherder.mozilla.org/#/jobs?repo=try&revision=a8ae58d5bb87
> 
> From what can I see your test run successfully, do you want us to add
> MARIONETTE_E10S for linux64 ?

Let's skip Marionette L64 debug for Buildbot.

(In reply to Ryan VanderMeulen [:RyanVM] from comment #27)
> Should I be concerned that the full log of that Try test run doesn't contain
> a single e10s reference?

Fixed in the next push.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b0bb103ac7b5
correct, let's focus on taskcluster, we are moving forward.
right, not buildbot, taskcluster is the future :-)  thanks
Comment on attachment 8717946 [details] [diff] [review]
bug1209932_buildbot-configs.patch

Looks good!  I have a few questions

Is ubuntu32_vm going to be removed too as a worker platform or is it still in use somewhere else? If not there will be puppet changes required.

When I land patches like this I usually land the changes to mozilla-tests/BuildSlaves.py.template after the first patch so existing jobs can complete and connect to masters.

This patch can't land until the existing dependencies have been fixed.

Did you run test-masters.sh on these changes - were there any issues?

Not sure if this change needs to ride the trains or not, will have to ask coop or someone about that.
Attachment #8717946 - Flags: feedback+
Armen, have the Mn and Mn-e10s tests been enabled in taskcluster for Linux64 debug?  Looking at treeherder I didn't see them in the list of tier2 jobs in a recent push.  We'd like to move forward with disabling linux32 so this large patch doesn't become too bitrotten.
Flags: needinfo?(armenzg)
It's looking green to me:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=341cce0738fc

I also see the --e10s flag.
Flags: needinfo?(armenzg)
Attachment #8720904 - Flags: review?(jmaher)
Comment on attachment 8720904 [details] [diff] [review]
Add Marionette plain and e10s to TaskCluster jobs

Review of attachment 8720904 [details] [diff] [review]:
-----------------------------------------------------------------

thanks!
Attachment #8720904 - Flags: review?(jmaher) → review+
Thanks Armen and Joel - much appreciated!
https://hg.mozilla.org/mozilla-central/rev/d2446cb49fe8
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
I will reopen the bug in order to attach the patches to disable linux32
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attached is the patch to remove ubuntu32 and ubuntu32_vm-b2gdt from mozilla-tests/BuildSlaves.py.template.
Attachment #8721931 - Flags: review?
Attachment #8721931 - Flags: review? → review?(kmoir)
Attached you can find the patch to disable ubuntu32 and ubuntu32_vm-b2gdt from puppet
Attachment #8721932 - Flags: review?(kmoir)
Updated the patch
Attachment #8717946 - Attachment is obsolete: true
Attachment #8717946 - Flags: review?(kmoir)
Attachment #8721963 - Flags: review?(kmoir)
Attached file bug1209932_test-masters.sh.output (obsolete) —
Attached is the output from test-masters.sh. I ran the script with the --unittests-only option.
Attachment #8721931 - Flags: review?(kmoir) → review+
Attachment #8721932 - Flags: review?(kmoir) → review+
Comment on attachment 8721964 [details]
bug1209932_test-masters.sh.output

can you run it with just test-masters.sh too?
Attached file bug1209932_test-masters.sh - output_v2 (obsolete) —
Attached is the output from the test-masters.sh command.

From what I can see, the master/config_seta.py file is not taken from my buildbot master (in dev-master2:/builds/buildbot/vlad.ciobancai/test-linux) but from a new clone created in a temporary directory, where I updated the following line: platform_exclusions = ['Ubuntu VM 12.04']

:kmoir do you have any suggestions ?
Flags: needinfo?(kmoir)
if seta is not being used, do you have DISABLE_SETA set in your env?  

from buildbot-configs/mozilla-tests/config_seta.py

    if os.environ.get('DISABLE_SETA'):
        return []
Flags: needinfo?(kmoir)
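To make the behaviour of that guard concrete, here is a minimal sketch. The function `seta_skipped_builders` and its argument are hypothetical stand-ins, not the real contents of config_seta.py; only the `os.environ.get('DISABLE_SETA')` check is taken from the snippet above.

```python
import os


def seta_skipped_builders(low_value_builders):
    """Hypothetical stand-in for the lookup guarded in config_seta.py."""
    # os.environ.get returns a string (or None), so any non-empty value
    # disables SETA here, including "0" or "false". Only an unset or
    # empty variable leaves SETA enabled.
    if os.environ.get('DISABLE_SETA'):
        return []
    return list(low_value_builders)
```

Under this reading, `DISABLE_SETA=DISABLE_SETA` and `DISABLE_SETA=1` behave identically, so the value chosen for the variable should not matter as long as it is non-empty and reaches the process that imports the config.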
Comment on attachment 8721963 [details] [diff] [review]
bug1209932_buildbot-configs.patch

This looks good, but I would advise asking in #release-drivers if it's okay to disable  Linux 32 full scale or if this is something that should ride the trains.  Also, is the builder diff for this patch the same as the one for the last patch?
(In reply to Kim Moir [:kmoir] from comment #49)
> Also, is the builder diff for this patch the same as the one for
> the last patch?

The builder diff did not change; it is the same, so I thought there was no need to upload it again.
(In reply to Kim Moir [:kmoir] from comment #48)
> if seta is not being used, do you have DISABLE_SETA set in your env?  
> 
> from buildbot-configs/mozilla-tests/config_seta.py
> 
>  if os.environ.get('DISABLE_SETA'):
>         return []

Even if I set DISABLE_SETA in the environment, I'm still receiving errors like those in this attachment: https://bugzilla.mozilla.org/attachment.cgi?id=8722067

The commands that I tested were:
1. env DISABLE_SETA=DISABLE_SETA ./test-masters.sh
2. export DISABLE_SETA=DISABLE_SETA ; ./test-masters.sh
When I run the test-masters script, it does not use my master.cfg file; instead it uses a new master.cfg file that was cloned from the repository.

The difference between those files is this:


> diff --git a/mozilla-tests/universal_master_sqlite.cfg b/mozilla-tests/universal_master_sqlite.cfg
> --- a/mozilla-tests/universal_master_sqlite.cfg
> +++ b/mozilla-tests/universal_master_sqlite.cfg
> @@ -67,6 +67,8 @@ from buildbot.buildslave import BuildSla
>  # Handle active platforms - Firefox
>  all_slave_platforms = []
>  for p in ACTIVE_PLATFORMS.keys():
> +    if p not in PLATFORMS:
> +        continue
>      ACTIVE_PLATFORMS[p] = deepcopy(PLATFORMS[p])
>      # Handle active slave platforms
>      if p in ACTIVE_FX_SLAVE_PLATFORMS:
So to find the problem

I commented out your changes on your master in test-masters.sh

I also exported the python path and ran source ./bin/activate as described here

https://wiki.mozilla.org/ReleaseEngineering:TestingTechniques#test-masters.sh

The result is this.  I think you may have to update some of the master definitions to exclude linux.  As an aside, it might be easier to implement this in three steps.  1) disable linux tests 2) disable linux builds 3) disable platform definitions in puppet etc.

INFO  - finished printing log file '/builds/buildbot/vlad.ciobancai/test-linux/buildbot-configs/test-output/bm01-tests1-linux32-ryyoQb-checkconfig.log'
ERROR - TEST-FAIL bm01-tests1-linux32 failed to run checkconfig
INFO  - log for "bm01-tests1-linux32" is "/builds/buildbot/vlad.ciobancai/test-linux/buildbot-configs/test-output/bm01-tests1-linux32-ryyoQb-checkconfig.log"
INFO  - TEST-SUMMARY: 20 tested, 1 failed
INFO  - FAILED-MASTER bm01-tests1-linux32, log: 'test-output/bm01-tests1-linux32-ryyoQb-checkconfig.log', dir: 'test-output/bm01-tests1-linux32-ryyoQb'
INFO  - creating "bm103-tests1-linux" master
INFO  - created  "bm103-tests1-linux" master, running checkconfig
INFO  - starting to print log file '/builds/buildbot/vlad.ciobancai/test-linux/buildbot-configs/test-output/bm103-tests1-linux-oEUbqp-checkconfig.log'
INFO  - /builds/buildbot/vlad.ciobancai/test-linux/lib/python2.6/site-packages/twisted/mail/smtp.py:10: DeprecationWarning: the MimeWriter module is deprecated; use the email package instead
INFO  -   import MimeWriter, tempfile, rfc822
INFO  - Traceback (most recent call last):
INFO  -   File "/builds/buildbot/vlad.ciobancai/test-linux/lib/python2.6/site-packages/buildbot-0.8.2_hg_864ae17f7dda_production_0.8-py2.6.egg/buildbot/scripts/runner.py", line 1042, in doCheckConfig
INFO  -     ConfigLoader(configFileName=configFileName)
INFO  -   File "/builds/buildbot/vlad.ciobancai/test-linux/lib/python2.6/site-packages/buildbot-0.8.2_hg_864ae17f7dda_production_0.8-py2.6.egg/buildbot/scripts/checkconfig.py", line 31, in __init__
INFO  -     self.loadConfig(configFile, check_synchronously_only=True)
INFO  -   File "/builds/buildbot/vlad.ciobancai/test-linux/lib/python2.6/site-packages/buildbot-0.8.2_hg_864ae17f7dda_production_0.8-py2.6.egg/buildbot/master.py", line 652, in loadConfig
INFO  -     exec f in localDict
INFO  -   File "/builds/buildbot/vlad.ciobancai/test-linux/buildbot-configs/test-output/bm103-tests1-linux-oEUbqp/master.cfg", line 70, in <module>
INFO  -     ACTIVE_PLATFORMS[p] = deepcopy(PLATFORMS[p])
INFO  - KeyError: u'linux'
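When the output runs to many screens, it can help to pull out just the summary and the failing masters. The sketch below assumes only the `TEST-SUMMARY` and `FAILED-MASTER` line formats visible in the log above; the function name `summarize` is hypothetical and this is not part of test-masters.sh.

```python
import re

# Line formats copied from the test-masters.sh output shown above.
SUMMARY = re.compile(r"TEST-SUMMARY: (\d+) tested, (\d+) failed")
FAILED = re.compile(r"FAILED-MASTER (\S+), log: '([^']+)'")


def summarize(log_text):
    """Return (tested, failed, [(master, log_path), ...]) from the output."""
    tested = failed = None
    failures = []
    for line in log_text.splitlines():
        match = SUMMARY.search(line)
        if match:
            tested, failed = int(match.group(1)), int(match.group(2))
            continue
        match = FAILED.search(line)
        if match:
            failures.append((match.group(1), match.group(2)))
    return tested, failed, failures
```

If the output contains more than one summary line, this keeps the last one seen, while the list of failing masters accumulates across the whole text.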
:kmoir I created patches to disable linux tests for buildbot-configs repository, I applied them on dev-master2:/builds/buildbot/vlad.ciobancai/test-linux2 but I'm receiving the following error:

(test-linux2)[vlad.ciobancai@dev-master2.bb.releng.use1.mozilla.com test-linux2]$ make checkconfig
cd master && /builds/buildbot/vlad.ciobancai/test-linux2/bin/buildbot checkconfig
/builds/buildbot/vlad.ciobancai/test-linux2/lib/python2.6/site-packages/twisted/mail/smtp.py:10: DeprecationWarning: the MimeWriter module is deprecated; use the email package instead
  import MimeWriter, tempfile, rfc822
Traceback (most recent call last):
  File "/builds/buildbot/vlad.ciobancai/test-linux2/lib/python2.6/site-packages/buildbot-0.8.2_hg_0e8314b01a9e_production_0.8-py2.6.egg/buildbot/scripts/runner.py", line 1042, in doCheckConfig
    ConfigLoader(configFileName=configFileName)
  File "/builds/buildbot/vlad.ciobancai/test-linux2/lib/python2.6/site-packages/buildbot-0.8.2_hg_0e8314b01a9e_production_0.8-py2.6.egg/buildbot/scripts/checkconfig.py", line 31, in __init__
    self.loadConfig(configFile, check_synchronously_only=True)
  File "/builds/buildbot/vlad.ciobancai/test-linux2/lib/python2.6/site-packages/buildbot-0.8.2_hg_0e8314b01a9e_production_0.8-py2.6.egg/buildbot/master.py", line 652, in loadConfig
    exec f in localDict
  File "/builds/buildbot/vlad.ciobancai/test-linux2/master/master.cfg", line 154, in <module>
    BRANCH_UNITTEST_VARS['platforms'])
  File "/builds/buildbot/vlad.ciobancai/test-linux2/lib/python2.6/site-packages/buildbotcustom/misc.py", line 2075, in generateTalosBranchObjects
    if platform_config.get('is_mobile', False):
AttributeError: 'NoneType' object has no attribute 'get'
make: *** [checkconfig] Error 1
Flags: needinfo?(kmoir)
Perhaps linux is defined as a talos platform somewhere; perhaps check project_branches.py.  Strange that it is complaining about mobile when you only changed desktop.
Flags: needinfo?(kmoir)
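The AttributeError above is what a dictionary lookup via `.get()` produces when a platform name is still listed somewhere but its configuration has been removed. A minimal sketch of a consistency check follows; the helper name and the sample data are illustrative assumptions, not the real buildbot-configs structures.

```python
def missing_platform_configs(branch_platforms, platform_configs):
    """Return names listed for a branch whose config is absent or None."""
    return sorted(
        name for name in branch_platforms
        if platform_configs.get(name) is None
    )


# Illustrative data only: 'linux' is still listed but has no config entry.
platform_configs = {'linux64': {'is_mobile': False}}
branch_platforms = ['linux', 'linux64']
print(missing_platform_configs(branch_platforms, platform_configs))
# prints ['linux']
```

Running a check like this against each branch's platform list before checkconfig should point at the file that still mentions the removed platform, rather than failing later on the first attribute access.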
I created the following patch in order to disable linux tests. I ran make checkconfig manually and everything seems OK.

I created a master_config.json file where I added my master, as in production-masters.json (path dev-master2:/home/vlad.ciobancai/master_config.json.prod)

I updated test-masters.sh to use my json file and not to download production-masters.json. I ran source ./bin/activate in order to use the buildbot env, and after that I ran test-masters.sh and I'm receiving the same error "KeyError: u'linux'"

From what I can see, the master.cfg file is not taken from the master directory but from a new clone, for example: /builds/buildbot/vlad.ciobancai/test-linux2/buildbot-configs/test-output/sm-vlad.ciobancai-AjWxdN/master.cfg, and this master does not contain the following change that is needed to get past the error

diff --git a/mozilla-tests/universal_master_sqlite.cfg b/mozilla-tests/universal_master_sqlite.cfg
--- a/mozilla-tests/universal_master_sqlite.cfg
+++ b/mozilla-tests/universal_master_sqlite.cfg
@@ -67,6 +67,8 @@
 # Handle active platforms - Firefox
 all_slave_platforms = []
 for p in ACTIVE_PLATFORMS.keys():
+    if p not in PLATFORMS:
+        continue
     ACTIVE_PLATFORMS[p] = deepcopy(PLATFORMS[p])
     # Handle active slave platforms
     if p in ACTIVE_FX_SLAVE_PLATFORMS:

I think that, in order to test, a new patch needs to be made that includes only the above change
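To show what the two added lines in the diff change, here is a minimal sketch of the loop with and without the guard. The dictionaries are placeholder data and the `activate` function is hypothetical; only the loop body mirrors universal_master_sqlite.cfg.

```python
from copy import deepcopy

# Placeholder data: 'linux' has been removed from PLATFORMS but is still
# listed in ACTIVE_PLATFORMS, as in the failing master.
PLATFORMS = {'linux64': {'placeholder': True}}
ACTIVE_PLATFORMS = {'linux': None, 'linux64': None}


def activate(active, platforms, guard):
    """Copy platform configs into a copy of `active`, optionally guarded."""
    active = dict(active)
    for p in list(active.keys()):
        if guard and p not in platforms:
            continue  # the two lines added by the diff
        active[p] = deepcopy(platforms[p])
    return active


try:
    activate(ACTIVE_PLATFORMS, PLATFORMS, guard=False)
except KeyError as exc:
    print("without guard: KeyError", exc)

print("with guard:", activate(ACTIVE_PLATFORMS, PLATFORMS, guard=True))
```

Note that in this sketch the skipped key stays in the dictionary with its previous value; the guard avoids the KeyError but does not remove the stale entry, so later code still has to cope with it.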
You don't need to modify production-masters.json to include your local master. You also don't need to modify the script to stop it downloading production-masters.json.  The file /builds/buildbot/vlad.ciobancai/test-linux2/buildbot-configs/test-output/sm-vlad.ciobancai-AjWxdN/master.cfg is just created when running the test-masters.sh script.  The real one that is used when you run checkconfig is /builds/buildbot/vlad.ciobancai/test-linux2/master/master.cfg.
After discussing with kmoir, I created the patch to exclude a platform if it's going to be disabled.
Attached is the output from test-masters.sh for the following patch: https://bugzilla.mozilla.org/attachment.cgi?id=8724736
I noticed that the linux platform is removed from the project_branches.py file in several cases.

Your patch is to remove linux as a testing platform first, however, removing it from the project_branches.py will remove it as a build platform since this file is shared between both builds and tests

/builds/buildbot/kmoir/test8/buildbot-configs/mozilla-tests
rwxrwxrwx 1 kmoir kmoir 30 Oct 22 17:39 project_branches.py -> ../mozilla/project_branches.py
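A quick way to confirm that a config file is shared like this is to check whether it is a symlink and where it resolves. This is a minimal sketch using only the standard library; the helper name `describe_link` is hypothetical.

```python
import os


def describe_link(path):
    """Return the resolved target if path is a symlink, else None."""
    if os.path.islink(path):
        return os.path.realpath(path)
    return None
```

Calling it on mozilla-tests/project_branches.py in a checkout laid out like the listing above would be expected to return the path of mozilla/project_branches.py, which is why an edit made for tests also affects builds.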
Uploaded the patch to disable the linux32 tests.
Attachment #8721931 - Attachment is obsolete: true
Attachment #8721963 - Attachment is obsolete: true
Attachment #8724736 - Attachment is obsolete: true
Attachment #8724738 - Attachment is obsolete: true
Attachment #8721963 - Flags: review?(kmoir)
Attachment #8728981 - Flags: review?(kmoir)
Attached file tests-diff.txt
Tests diff file.
Attached file test-masters.sh output
Used a custom production-masters.json where I removed all the "linux"-related entries from the master definitions.
Attachment #8717948 - Attachment is obsolete: true
Attachment #8721964 - Attachment is obsolete: true
Attachment #8722067 - Attachment is obsolete: true
Attachment #8723126 - Attachment is obsolete: true
Comment on attachment 8728985 [details]
test-masters.sh output

If you removed all the linux entries in the production masters.json on your master, won't you need patches to the production one to remove linux so the tests pass in production?
Comment on attachment 8728981 [details] [diff] [review]
[buildbot-configs]disable_tests.patch

This looks good, with two caveats:

1) catlee asked me yesterday to ask again on the dev.planning list for further feedback on implementing this change which I have just done.  So please don't land until we get the go from that
2) The mozilla-tests/BuildSlaves.py.template change will need to be landed as a separate patch after we land the changes for mozilla-tests/config.py etc and reconfig.
This is to allow the existing builders to complete and connect to the masters.
After the builders are gone, we can land the puppet patch.
Attachment #8728981 - Flags: feedback+
(In reply to Kim Moir [:kmoir] from comment #64)
> Comment on attachment 8728985 [details]
> test-masters.sh output
> 
> If you removed all the linux entries in the production masters.json on your
> master, won't you need patches to the production one to remove linux so the
> tests pass in production?

Yes, we will need to update the master definitions too. Below is the patch.
I split the previous patch to not include the changes for BuildSlaves.py.template and ran the tests again (all passed).

Note: since the patch includes lots of changes, there is a good chance that it will become bitrotten pretty soon, so when we land it we will need to make sure that all the intended changes are there.
Attachment #8728981 - Attachment is obsolete: true
Attachment #8728981 - Flags: review?(kmoir)
Here's the patch for BuildSlaves.py.template. Sorry for not noticing that it needs to be landed separately, and for asking for review again as a result.
Attachment #8729531 - Flags: review?(kmoir)
Comment on attachment 8729526 [details] [diff] [review]
[tools]update_master_definitions.patch

lgtm, again, please don't land yet as we are still having discussions about this on dev.planning
Attachment #8729526 - Flags: review?(kmoir) → review+
Attachment #8729531 - Flags: review?(kmoir) → review+
Depends on: 1255890
Depends on: 1257518
Assignee: vciobancai → nobody
Blocks: 1217931
Depends on: 1336042
Depends on: 1430027
Severity: normal → S3