Closed Bug 975216 Opened 12 years ago Closed 11 years ago

Please update libxcb on the linux test slaves to a more recent release

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

All
Linux
task
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: RyanVM, Assigned: rail)

Attachments

(6 files, 1 obsolete file)

We've been hitting bug 889869 in our IPC reftests for a long time as a low-frequency crash. However, we are hitting it very frequently since we turned on B2G Desktop reftests. The current theory is that we are hitting a bug in libxcb that is causing a deadlock. Going through the release notes, karlt noticed that there was a fix in one of the 1.8 releases for a problem that looks very similar to ours. From what I can tell, we are currently running version 1.5-1.el6. The current release is version 1.10. Can we please get this package updated so we can see if it fixes these crashes?
Component: General Automation → Platform Support
QA Contact: catlee → coop
Blocks: 889869
No longer blocks: 886869
As was mentioned in bug 889869, the 1.5.1 version is from the CentOS build machines. This is what we have on Ubuntu:

ii  libxcb-dri2-0        1.8.1-1  X C Binding, dri2 extension
ii  libxcb-glx0          1.8.1-1  X C Binding, glx extension
ii  libxcb-glx0:i386     1.8.1-1  X C Binding, glx extension
ii  libxcb-render0       1.8.1-1  X C Binding, render extension
ii  libxcb-render0:i386  1.8.1-1  X C Binding, render extension
ii  libxcb-shape0        1.8.1-1  X C Binding, shape extension
ii  libxcb-shm0          1.8.1-1  X C Binding, shm extension
ii  libxcb-shm0:i386     1.8.1-1  X C Binding, shm extension
ii  libxcb-util0         0.3.8-2  utility libraries for X C Binding -- atom, aux and event
ii  libxcb1              1.8.1-1  X C Binding
ii  libxcb1:i386         1.8.1-1  X C Binding

Karl mentions 1.8.1-1ubuntu0.1 and 0.2, which I guess are slightly newer because of the Ubuntu patches?
(In reply to Ben Hearsum [:bhearsum] from comment #1)
> Karl mentions 1.8.1-1ubuntu0.1 and 0.2, which I guess are slightly newer
> because of the Ubuntu patches?

Yes, and the Ubuntu patches look like they should address bug 889869. I suggest using one of those packages, because there are existing packages for precise.
Blocks: 975860
Linux32 b2g desktop reftests are hitting this over 50% of the time, so they're now hidden on trunk trees.
Blocks: 784681
dustin: I need some clarification on this howto doc: https://wiki.mozilla.org/ReleaseEngineering/PuppetAgain/HowTo/Build_DEBs Can we chat today?
Flags: needinfo?(dustin)
Chatted with dustin on IRC. It seems our documentation doesn't handle this case and the Debian package building process is ridiculously complex. Rail Aliiev may know more. I am going to try and follow http://www.debian.org/doc/manuals/maint-guide/build.en.html#completebuild using http://packages.ubuntu.com/raring/libxcb1 1.8.1-2 which backported a deadlock fix.
Flags: needinfo?(dustin)
We should probably take a newer libxcb1 if possible (1.10 is the latest); as RyanVM pointed out, there are various other stability and security fixes as well.

rail: I've tried the multi-package build instructions[1], but pbuilder still complains that the required dependencies will not be installed. Could you please investigate? I've basically run out of time for this week.

[1] https://wiki.mozilla.org/ReleaseEngineering/PuppetAgain/HowTo/Build_DEBs#Old_Stuff
Flags: needinfo?(rail)
Looks like there are still some missing packages. I'd use the following scenario to figure out which dependencies need upgrading:

1) Log in to a temp image:
sudo pbuilder --login --basetgz /var/cache/pbuilder/base-precise-amd64.tgz

2) For every dependency listed in debian/control as Build-Depends, run:
apt-cache policy $pkg
and compare with the requirement. If the version doesn't meet the requirement, you will need to backport that dependency as well (which may require the same procedure).
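The Build-Depends check described above could be partly scripted. The following is only a sketch: list_versioned_build_deps is a hypothetical helper name, and it assumes a conventional debian/control layout where continuation lines of the Build-Depends field start with whitespace. It prints one "package operator version" line for each versioned entry, which can then be compared by hand against apt-cache policy inside the pbuilder chroot.

```shell
# Hypothetical helper (not part of any Debian tool): list the versioned
# entries of the Build-Depends field in a debian/control file.
list_versioned_build_deps() {
    control_file=$1
    # Collect the Build-Depends field, including its continuation lines.
    awk '
        /^Build-Depends:/ { infield = 1; sub(/^Build-Depends:/, ""); print; next }
        infield && /^[ \t]/ { print; next }
        { infield = 0 }
    ' "$control_file" |
    # One dependency per line.
    tr ',' '\n' |
    # Keep only entries of the form "name (op version)" and print "name op version".
    sed -n 's/^[[:space:]]*\([^[:space:](]*\)[[:space:]]*(\([<>=]*\)[[:space:]]*\([^)]*\)).*$/\1 \2 \3/p'
}

# Example use inside the chroot (commented out; needs apt):
# list_versioned_build_deps debian/control | while read -r pkg op ver; do
#     echo "$pkg needs $op $ver"
#     apt-cache policy "$pkg" | grep Candidate
# done
```

Unversioned entries are skipped on purpose, since any available version satisfies them.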
Flags: needinfo?(rail)
I investigated this a little bit. It turns out that there is no need to compile any deps; the issue is the format used in Build-Depends. The version we use doesn't understand "python:any", so we should use "python" instead.

Steps to compile:

# download the sources
dget http://archive.ubuntu.com/ubuntu/pool/main/libx/libxcb/libxcb_1.8.1-2ubuntu2.1.dsc
# extract
dpkg-source -x libxcb_1.8.1-2ubuntu2.1.dsc
cd libxcb-1.8.1
# fix python:any
sed -i 's/python:any/python/g' debian/control
# bump the version
DEBFULLNAME="Rail Aliiev" DEBEMAIL="rail@mozilla.com" dch -l mozilla --distribution precise-updates "Backport from raring"
# regenerate the sources
dpkg-source -b .
# build 64-bit + sources
sudo pbuilder --build --distribution precise --architecture amd64 --basetgz /var/cache/pbuilder/base-precise-amd64.tgz --buildresult out --debbuildopts "-sa" libxcb_1.8.1-2ubuntu2.1mozilla1.dsc
# build 32-bit version
sudo pbuilder --build --distribution precise --architecture i386 --basetgz /var/cache/pbuilder/base-precise-i386.tgz --buildresult out32 libxcb_1.8.1-2ubuntu2.1mozilla1.dsc

You can find the packages in the out and out32 directories. Then you need to upload them to the releng-updates repo, see https://wiki.mozilla.org/ReleaseEngineering/PuppetAgain/Packages#Ubuntu:_Landing_Custom_Repository_Changes

Hope it helps :)
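The "fix python:any" step can be wrapped so that it reports whether the control file actually needed patching. This is a minimal sketch, not part of the original instructions: fix_python_any is a hypothetical helper name, and it writes through a temporary file instead of using sed -i so that it does not depend on GNU sed.

```shell
# Hypothetical helper: replace the "python:any" multiarch qualifier with
# plain "python" in a control file and say whether anything changed.
fix_python_any() {
    control_file=$1
    if grep -q 'python:any' "$control_file"; then
        # Same substitution as in the steps above, via a temp file.
        sed 's/python:any/python/g' "$control_file" > "$control_file.tmp" &&
            mv "$control_file.tmp" "$control_file"
        echo "patched"
    else
        echo "unchanged"
    fi
}

# Example (commented out): fix_python_any debian/control
```

Running it twice on the same file should print "patched" and then "unchanged", which makes it safe to rerun when a build is retried.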
Depends on: 978804
Attached file package build failure
Rail: I followed the last in-bug instructions you provided (except to use 1.10 of libxcb) but failed with this log output.
Flags: needinfo?(rail)
This package requires xcb-proto (>= 1.10) while

$ apt-cache policy xcb-proto
xcb-proto:
  Installed: 1.7-1
  Candidate: 1.7-1
  Version table:
 *** 1.7-1 0
        500 http://archive.ubuntu.com/ubuntu/ precise/main amd64 Packages
        100 /var/lib/dpkg/status

Do you really need 1.10? I thought libxcb_1.8.1-2ubuntu2.1 was enough per comment #2...
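Note that 1.7 is older than 1.10 even though it looks larger when read as a decimal number, so the comparison has to be version-aware. On a Debian or Ubuntu host, dpkg --compare-versions is the tool to use (for example, dpkg --compare-versions 1.7-1 ge 1.10 exits non-zero). The sketch below is only an approximation for hosts without dpkg: version_ge is a hypothetical helper that relies on GNU sort -V, which does not implement every Debian version rule (epochs and "~" suffixes in particular).

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal to
# version $2, using GNU sort's version ordering as an approximation.
version_ge() {
    # The smaller of the two versions sorts first; $1 >= $2 exactly when
    # that smallest value is $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example (commented out):
# if version_ge 1.7-1 1.10; then echo "satisfied"; else echo "too old"; fi
```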
Flags: needinfo?(rail)
(In reply to Rail Aliiev [:rail] from comment #10)
...
> Do you really need 1.10? I thought libxcb_1.8.1-2ubuntu2.1 was enough per
> comment #2...

No, 1.10 isn't strictly needed. I've built the packages using libxcb_1.8.1-2ubuntu2.1.dsc and the instructions as-is from comment 8. The packages are uploaded here for now:
http://people.mozilla.org/~jhopkins/bug975216/out-32/
http://people.mozilla.org/~jhopkins/bug975216/out-64/

We'll need to test that the upgrade succeeds and then roll them out to our internal Ubuntu package repo.
Comment 8 should say:

DEBFULLNAME="Rail Aliiev" DEBEMAIL="rail@mozilla.com" dch -l mozilla --distribution precise "Backport from raring"

Note distribution 'precise' instead of 'precise-updates'. I'll need to rebuild the packages.
Packages rebuilt and upgraded ok from http://people.mozilla.org/~jhopkins/aptrepo/
Attachment #8386855 - Flags: review?(dustin)
Comment on attachment 8386855 [details] [diff] [review]
[puppet] upgrade libxcb1 on Ubuntu

Review of attachment 8386855 [details] [diff] [review]:
-----------------------------------------------------------------

::: modules/packages/manifests/libxcb1.pp
@@ +9,5 @@
> + ensure => latest;
> + }
> + }
> + default: {
> + # N/A

This should be fail(..)

::: modules/toplevel/manifests/slave/test.pp
@@ +6,5 @@
> include vnc
> include users::builder::autologin
> include ntp::atboot
> include packages::fonts
> + include packages::libxcb1

And this won't work on OS X test slaves. It should be conditionalized on the OS. modules/talos/manifests/init.pp probably makes more sense for the include, then -- there are a bunch of Ubuntu-specific packages there already.
Attachment #8386855 - Flags: review?(dustin) → review-
Attachment #8386855 - Attachment is obsolete: true
Attachment #8386930 - Flags: review?(dustin)
Attachment #8386930 - Flags: review?(dustin) → review+
Comment on attachment 8386930 [details] [diff] [review] [puppet] upgrade libxcb1 on ubuntu Changed to ensure => "1.8.1-1" to pin to current version first (recommended by rail), ok'd by dustin, and landed in https://hg.mozilla.org/build/puppet/rev/0edcfe6798bc
Attachment #8386930 - Flags: checked-in+
Attachment #8387055 - Flags: review?(dustin) → review?(rail)
Attachment #8387055 - Flags: review?(rail) → review+
Comment on attachment 8387055 [details] [diff] [review] [puppet] upgrade libxcb1 to 1.8.1-2ubuntu2.1mozilla1 Landed in http://hg.mozilla.org/build/puppet/rev/cfee8be1d903 and merged to production.
Attachment #8387055 - Flags: checked-in+
RyanVM: package upgrades are looking good. Can you please verify that it had the intended effect?
Flags: needinfo?(ryanvm)
Will keep an eye on things over the next day or so. If this worked, we should see a reduction in Desktop B2G reftest orange (and actually, I think mochitest too - lots of timeouts there have libxcb on the stack too).
Flags: needinfo?(ryanvm)
It did have the desired effect (in fact, far far more than the expected effect), thanks!
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Reopening due to bug 991274
Blocks: 991274
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attached patch libxcb0.diff (Splinter Review)
This is a temporary workaround to make the package list match... :(
Attachment #8400889 - Flags: review?(dustin)
Attachment #8400889 - Flags: review?(dustin) → review+
Attachment #8400889 - Flags: checked-in+ → checked-in-
I'm going to disable the spot AMI generation script to prevent possible tree closures tomorrow.
Attachment #8400909 - Flags: review?(dustin)
Attachment #8400909 - Flags: review?(dustin) → review+
Blocks: 934827
Assignee: nobody → rail
Puppet cannot handle multiple packages in a single transaction, and these packages need to be installed in a single command. Otherwise apt removes some dependent packages (ialib32 and some other ones). It would be great to have something like multipkg (http://probablyfine.co.uk/2014/03/21/faster-puppet-package-installation-with-multipkg/) but with the full set of "package" options (at least ensure => version).

I tested the patch, and the difference between spot (old) and on-demand (new) looks like:

--- /tmp/pkg_spot_new 2014-04-07 13:00:06.748873194 -0400
+++ /tmp/pkg_ondemand_new 2014-04-07 13:02:23.900876859 -0400
@@ -326,7 +326,7 @@
 ii gzip 1.4-1ubuntu2 GNU compression utilities
 ii hdparm 9.37-0ubuntu3 tune hard disk parameters for high performance
 ii hicolor-icon-theme 0.12-1ubuntu2 default fallback theme for FreeDesktop.org icon themes
-ii hiera 1.2.1-1puppetlabs1 A simple pluggable Hierarchical Database.
+ii hiera 1.3.2-1puppetlabs1 A simple pluggable Hierarchical Database.
 ii hostname 3.06ubuntu1 utility to set/show the host name or domain name
 ii hplip 3.12.2-1ubuntu3 HP Linux Printing and Imaging System (HPLIP)
 ii hplip-data 3.12.2-1ubuntu3 HP Linux Printing and Imaging - data files
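A comparison like the one above can also be reduced to just the packages whose versions differ. This is a sketch under assumptions: diff_pkg_versions is a hypothetical helper, and it expects two files that each hold saved dpkg -l output (status in column 1, package name in column 2, version in column 3), with the first file non-empty.

```shell
# Hypothetical helper: print "package old_version new_version" for every
# installed package whose version differs between two saved dpkg -l listings.
diff_pkg_versions() {
    old_list=$1
    new_list=$2
    awk '
        $1 != "ii" { next }                # skip headers and non-installed entries
        NR == FNR { old[$2] = $3; next }   # first file: remember each version
        ($2 in old) && old[$2] != $3 { print $2, old[$2], $3 }
    ' "$old_list" "$new_list"
}

# Example (commented out), using the file names from the diff above:
# diff_pkg_versions /tmp/pkg_spot_new /tmp/pkg_ondemand_new
```

Packages that exist in only one of the two listings are not reported; a plain diff of the listings, as above, still shows those.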
Attachment #8402761 - Flags: review?(dustin)
Comment on attachment 8402761 [details] [diff] [review]
libxcb-puppet-1.diff

Once this is in place on all affected hosts, can we revert this change?
Attachment #8402761 - Flags: review?(dustin) → review+
(In reply to Dustin J. Mitchell [:dustin] from comment #30)
> Comment on attachment 8402761 [details] [diff] [review]
> libxcb-puppet-1.diff
>
> Once this is in place on all affected hosts, can we revert this change?

What if we decide to add more on-demand instances or recreate them? We will hit the same issue again. :/
Oh, so they're still using an incorrect base image? Once *that* is fixed, we could revert this?
They do use the same base image. The problem is the difference between applying the current puppet manifests against a bare base image and against a previously puppetized image.
I'm going to revert the patch since downgrading the packages didn't really help. See https://bugzilla.mozilla.org/show_bug.cgi?id=991274#c27
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
No longer blocks: 934827
No longer blocks: 889869
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard