Closed Bug 975216 Opened 8 years ago Closed 8 years ago

Please update libxcb on the linux test slaves to a more recent release

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

All
Linux
task
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: RyanVM, Assigned: rail)

Attachments

(6 files, 1 obsolete file)

We've been hitting bug 889869 in our IPC reftests for a long time as a low-frequency crash. However, we are hitting it very frequently since we turned on B2G Desktop reftests.

The current theory is that we are hitting a bug in libxcb that is causing a deadlock. Going through the release notes, karlt noticed that there was a fix in one of the 1.8 releases for a problem that looks very similar to ours.

From what I can tell, we are currently running version 1.5-1.el6. The current release is version 1.10. Can we please get this package updated so we can see if it fixes these crashes?
Component: General Automation → Platform Support
QA Contact: catlee → coop
Blocks: 889869
No longer blocks: 886869
As was mentioned in bug 889869, the 1.5-1 version is from the CentOS build machines. This is what we have on Ubuntu:
ii  libxcb-dri2-0                          1.8.1-1                                 X C Binding, dri2 extension
ii  libxcb-glx0                            1.8.1-1                                 X C Binding, glx extension
ii  libxcb-glx0:i386                       1.8.1-1                                 X C Binding, glx extension
ii  libxcb-render0                         1.8.1-1                                 X C Binding, render extension
ii  libxcb-render0:i386                    1.8.1-1                                 X C Binding, render extension
ii  libxcb-shape0                          1.8.1-1                                 X C Binding, shape extension
ii  libxcb-shm0                            1.8.1-1                                 X C Binding, shm extension
ii  libxcb-shm0:i386                       1.8.1-1                                 X C Binding, shm extension
ii  libxcb-util0                           0.3.8-2                                 utility libraries for X C Binding -- atom, aux and event
ii  libxcb1                                1.8.1-1                                 X C Binding
ii  libxcb1:i386                           1.8.1-1                                 X C Binding


Karl mentions 1.8.1-1ubuntu0.1 and 0.2, which I guess are slightly newer because of the Ubuntu patches?
(In reply to Ben Hearsum [:bhearsum] from comment #1)
> Karl mentions 1.8.1-1ubuntu0.1 and 0.2, which I guess are slightly newer
> because of the Ubuntu patches?

Yes, and the Ubuntu patches look like they should address bug 889869.  I suggest using one of those packages, because there are existing packages for precise.
Blocks: 975860
Linux32 b2g desktop reftests are hitting this over 50% of the time, so they're now hidden on trunk trees.
Blocks: 784681
dustin: I need some clarification on this howto doc: https://wiki.mozilla.org/ReleaseEngineering/PuppetAgain/HowTo/Build_DEBs   Can we chat today?
Flags: needinfo?(dustin)
Chatted with dustin on IRC.  It seems our documentation doesn't handle this case and the Debian package building process is ridiculously complex.  Rail Aliiev may know more.  I am going to try and follow http://www.debian.org/doc/manuals/maint-guide/build.en.html#completebuild using http://packages.ubuntu.com/raring/libxcb1 1.8.1-2 which backported a deadlock fix.
Flags: needinfo?(dustin)
We should probably take a newer libxcb1 if possible (1.10 is the latest) as RyanVM pointed out there are various other stability and security fixes as well.

rail: I've tried the multi-package build instructions[1] but pbuilder still complains that the required dependencies will not be installed.  Could you please investigate?  I've basically run out of time for this week.


[1] https://wiki.mozilla.org/ReleaseEngineering/PuppetAgain/HowTo/Build_DEBs#Old_Stuff
Flags: needinfo?(rail)
Looks like there are still some missing packages.

I'd use the following procedure to figure out which dependencies need to be upgraded:

1) login to a temp image:
 sudo pbuilder --login --basetgz /var/cache/pbuilder/base-precise-amd64.tgz

2) for every dependency listed in debian/control as Build-Depends:
 apt-cache policy $pkg

and compare with the requirement. If the available version doesn't meet it, you will need to backport that dependency as well (which may require the same procedure).
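The per-dependency check described above could be scripted roughly as follows. This is only a sketch: `list_build_deps` is a hypothetical helper name, and it assumes a conventionally formatted debian/control where Build-Depends continuation lines start with whitespace.

```shell
# Hypothetical helper: print one "pkg (constraint)" entry per line from the
# Build-Depends field of a debian/control file.
list_build_deps() {
    # Collect the Build-Depends field plus its continuation lines,
    # then split on commas, trim whitespace and drop empty entries.
    awk '
        /^Build-Depends:/ { infield = 1; sub(/^Build-Depends:/, ""); print; next }
        infield && /^[ \t]/ { print; next }
        { infield = 0 }
    ' "$1" | tr ',' '\n' \
      | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Inside the pbuilder chroot, each entry can then be compared by hand
# with what apt offers, for example:
#   list_build_deps debian/control | while read -r pkg _; do
#       apt-cache policy "$pkg"
#   done
```

The comparison itself is left manual here, since alternatives (`a | b`) and architecture qualifiers in the field need a human eye.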
Flags: needinfo?(rail)
I investigated this a little bit. It turns out that there is no need to compile any deps; the issue is the format used in Build-Depends. The version we use doesn't understand "python:any", so we should use "python" instead.

Steps to compile:

# download the sources
dget http://archive.ubuntu.com/ubuntu/pool/main/libx/libxcb/libxcb_1.8.1-2ubuntu2.1.dsc

# extract
dpkg-source -x libxcb_1.8.1-2ubuntu2.1.dsc 

cd libxcb-1.8.1

# fix python:any
sed -i 's/python:any/python/g' debian/control

# bump the version
DEBFULLNAME="Rail Aliiev" DEBEMAIL="rail@mozilla.com" dch -l mozilla --distribution precise-updates "Backport from raring"

# regenerate the sources
dpkg-source -b .

# build 64-bit + sources
sudo pbuilder --build --distribution precise --architecture amd64 --basetgz /var/cache/pbuilder/base-precise-amd64.tgz --buildresult out --debbuildopts "-sa" libxcb_1.8.1-2ubuntu2.1mozilla1.dsc

# build 32-bit version
sudo pbuilder --build --distribution precise --architecture i386 --basetgz /var/cache/pbuilder/base-precise-i386.tgz  --buildresult out32 libxcb_1.8.1-2ubuntu2.1mozilla1.dsc

You can find the packages in the out and out32 directories. Then you need to upload them to the releng-updates repo, see https://wiki.mozilla.org/ReleaseEngineering/PuppetAgain/Packages#Ubuntu:_Landing_Custom_Repository_Changes

Hope it helps :)
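A quick way to see what the "fix python:any" step does is to run it against a throwaway file. The sketch below uses a made-up Build-Depends line (not the real libxcb control file) and then checks that no ":any" qualifier is left before the sources are regenerated.

```shell
# Hypothetical sample: a Build-Depends line using the multiarch qualifier.
workdir=$(mktemp -d)
cat > "$workdir/control" <<'EOF'
Source: libxcb
Build-Depends: debhelper (>= 8.1.3), python:any, xcb-proto (>= 1.7)
EOF

# Same substitution as in the steps above.
sed -i 's/python:any/python/g' "$workdir/control"

# Confirm no ":any" qualifier is left before running dpkg-source -b.
if grep -q ':any' "$workdir/control"; then
    echo "still contains :any qualifiers"
else
    echo "control file is clean"
fi
```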
Depends on: 978804
Attached file package build failure
Rail: I followed the last in-bug instructions you provided (except to use 1.10 of libxcb) but failed with this log output.
Flags: needinfo?(rail)
This package requires xcb-proto (>= 1.10), while precise only provides 1.7-1:

$ apt-cache policy xcb-proto
xcb-proto:
  Installed: 1.7-1
  Candidate: 1.7-1
  Version table:
 *** 1.7-1 0
        500 http://archive.ubuntu.com/ubuntu/ precise/main amd64 Packages
        100 /var/lib/dpkg/status

Do you really need 1.10? I thought libxcb_1.8.1-2ubuntu2.1 was enough per comment #2...
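The mismatch above is a plain version comparison. On a Debian or Ubuntu host the authoritative check is `dpkg --compare-versions 1.7-1 ge 1.10`; the sketch below approximates it with GNU `sort -V` so it runs anywhere coreutils is available. `version_ge` is a hypothetical helper, and `sort -V` does not implement every Debian ordering rule (epochs, `~`), so treat it as an illustration only.

```shell
# Hypothetical helper: succeed if version $1 is >= version $2.
# Approximates Debian ordering with GNU sort -V; use
# dpkg --compare-versions "$1" ge "$2" where dpkg is available.
version_ge() {
    newest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)
    [ "$newest" = "$1" ]
}

if version_ge 1.7-1 1.10; then
    echo "xcb-proto 1.7-1 satisfies >= 1.10"
else
    echo "xcb-proto 1.7-1 is too old for >= 1.10"
fi
```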
Flags: needinfo?(rail)
(In reply to Rail Aliiev [:rail] from comment #10)
...
> Do you really need 1.10? I thought libxcb_1.8.1-2ubuntu2.1 was enough per
> comment #2...

No, 1.10 isn't strictly needed.

I've built the packages using libxcb_1.8.1-2ubuntu2.1.dsc and instructions as-is from comment 8.
The packages are uploaded here for now:

 http://people.mozilla.org/~jhopkins/bug975216/out-32/
 http://people.mozilla.org/~jhopkins/bug975216/out-64/

We'll need to test that the upgrade succeeds and then roll them out to our internal Ubuntu package repo.
comment 8 should say:

DEBFULLNAME="Rail Aliiev" DEBEMAIL="rail@mozilla.com" dch -l mozilla --distribution precise "Backport from raring"

Note distribution 'precise' instead of 'precise-updates'.  I'll need to rebuild the packages.
Packages rebuilt and upgraded ok from http://people.mozilla.org/~jhopkins/aptrepo/
Attachment #8386855 - Flags: review?(dustin)
Comment on attachment 8386855 [details] [diff] [review]
[puppet] upgrade libxcb1 on Ubuntu

Review of attachment 8386855 [details] [diff] [review]:
-----------------------------------------------------------------

::: modules/packages/manifests/libxcb1.pp
@@ +9,5 @@
> +                    ensure => latest;
> +            }
> +        }
> +        default: {
> +            # N/A

This should be fail(..)

::: modules/toplevel/manifests/slave/test.pp
@@ +6,5 @@
>      include vnc
>      include users::builder::autologin
>      include ntp::atboot
>      include packages::fonts
> +    include packages::libxcb1

And this won't work on OS X test slaves.  It should be conditionalized on the OS.  modules/talos/manifests/init.pp probably makes more sense for the include, then -- there are a bunch of Ubuntu-specific packages there already.
Attachment #8386855 - Flags: review?(dustin) → review-
Attachment #8386855 - Attachment is obsolete: true
Attachment #8386930 - Flags: review?(dustin)
Attachment #8386930 - Flags: review?(dustin) → review+
Comment on attachment 8386930 [details] [diff] [review]
[puppet] upgrade libxcb1 on ubuntu

Changed to ensure => "1.8.1-1" to pin to current version first (recommended by rail), ok'd by dustin, and landed in https://hg.mozilla.org/build/puppet/rev/0edcfe6798bc
Attachment #8386930 - Flags: checked-in+
Attachment #8387055 - Flags: review?(dustin) → review?(rail)
Attachment #8387055 - Flags: review?(rail) → review+
Comment on attachment 8387055 [details] [diff] [review]
[puppet] upgrade libxcb1 to 1.8.1-2ubuntu2.1mozilla1

Landed in http://hg.mozilla.org/build/puppet/rev/cfee8be1d903 and merged to production.
Attachment #8387055 - Flags: checked-in+
RyanVM: package upgrades are looking good.  Can you please verify that it had the intended effect?
Flags: needinfo?(ryanvm)
Will keep an eye on things over the next day or so. If this worked, we should see a reduction in Desktop B2G reftest orange (and actually, I think mochitest too - lots of timeouts there have libxcb on the stack too).
Flags: needinfo?(ryanvm)
It did have the desired effect (in fact, far far more than the expected effect), thanks!
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Reopening due to bug 991274
Blocks: 991274
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attached patch libxcb0.diff
This is a temporary workaround to make the package list match... :(
Attachment #8400889 - Flags: review?(dustin)
Attachment #8400889 - Flags: review?(dustin) → review+
Comment on attachment 8400889 [details] [diff] [review]
libxcb0.diff

remote:   https://hg.mozilla.org/build/puppet/rev/f519a9f4be88
remote:   https://hg.mozilla.org/build/puppet/rev/d73538458440

this tries to remove ia32-libs :/
Attachment #8400889 - Flags: checked-in+ → checked-in-
I'm going to disable the spot AMI generation script to prevent possible tree closures tomorrow.
Attachment #8400909 - Flags: review?(dustin)
Attachment #8400909 - Flags: review?(dustin) → review+
Blocks: 934827
Assignee: nobody → rail
Puppet cannot handle multiple packages in a single transaction, and these packages need to be installed in a single command. Otherwise apt removes some dependent packages (ia32-libs and some others).

It would be great to have something like multipkg (http://probablyfine.co.uk/2014/03/21/faster-puppet-package-installation-with-multipkg/) but with the full set of "package" options (at least ensure => version).
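For illustration, "a single command" here means handing apt every libxcb package, pinned to the same version, in one invocation. The sketch below only prints such a command rather than running it; `single_install_cmd` is a hypothetical helper and the package list is a subset of the dpkg listing earlier in this bug.

```shell
# Hypothetical helper: build one apt-get command that installs every listed
# package at the same version, so apt resolves the set in one transaction.
single_install_cmd() {
    version=$1
    shift
    cmd="apt-get install -y"
    for pkg in "$@"; do
        # apt accepts name=version to request an exact version.
        cmd="$cmd $pkg=$version"
    done
    printf '%s\n' "$cmd"
}

single_install_cmd 1.8.1-2ubuntu2.1mozilla1 \
    libxcb1 libxcb1:i386 libxcb-glx0 libxcb-glx0:i386
```

Printing instead of executing keeps the sketch safe to try on any machine; the real package set would need to cover every libxcb binary package installed on the slaves.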

I tested the patch, and the difference between spot (old) and on-demand (new) looks like:

--- /tmp/pkg_spot_new	2014-04-07 13:00:06.748873194 -0400
+++ /tmp/pkg_ondemand_new	2014-04-07 13:02:23.900876859 -0400
@@ -326,7 +326,7 @@
 ii  gzip                                   1.4-1ubuntu2                            GNU compression utilities
 ii  hdparm                                 9.37-0ubuntu3                           tune hard disk parameters for high performance
 ii  hicolor-icon-theme                     0.12-1ubuntu2                           default fallback theme for FreeDesktop.org icon themes
-ii  hiera                                  1.2.1-1puppetlabs1                      A simple pluggable Hierarchical Database.
+ii  hiera                                  1.3.2-1puppetlabs1                      A simple pluggable Hierarchical Database.
 ii  hostname                               3.06ubuntu1                             utility to set/show the host name or domain name
 ii  hplip                                  3.12.2-1ubuntu3                         HP Linux Printing and Imaging System (HPLIP)
 ii  hplip-data                             3.12.2-1ubuntu3                         HP Linux Printing and Imaging - data files
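A comparison like the one above can be reduced to just the changed package lines. This is a sketch under assumptions: it supposes the two files are saved package listings (for example `dpkg -l` output captured on each host), and `changed_packages` is a hypothetical helper name.

```shell
# Hypothetical helper: show only the package lines that differ between two
# saved package listings.
changed_packages() {
    # diff exits non-zero when the files differ and grep exits non-zero when
    # nothing matches, so neither status is treated as an error here.
    diff -u "$1" "$2" | grep '^[-+]ii' || true
}

# Example, using the file names from the diff above:
#   changed_packages /tmp/pkg_spot_new /tmp/pkg_ondemand_new
```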
Attachment #8402761 - Flags: review?(dustin)
Comment on attachment 8402761 [details] [diff] [review]
libxcb-puppet-1.diff

Once this is in place on all affected hosts, can we revert this change?
Attachment #8402761 - Flags: review?(dustin) → review+
(In reply to Dustin J. Mitchell [:dustin] from comment #30)
> Comment on attachment 8402761 [details] [diff] [review]
> libxcb-puppet-1.diff
> 
> Once this is in place on all affected hosts, can we revert this change?

What if we decide to add more on-demand instances or recreate them? We will hit the same issue again. :/
Oh, so they're still using an incorrect base image?  Once *that* is fixed, we could revert this?
They do use the same base image. The problem is the difference between applying the current puppet manifests to a bare base image and applying them to a previously puppetized image.
I'm going to revert the patch since downgrading the packages didn't really help.

See https://bugzilla.mozilla.org/show_bug.cgi?id=991274#c27
Status: REOPENED → RESOLVED
Closed: 8 years ago → 8 years ago
Resolution: --- → FIXED
No longer blocks: 934827
Duplicate of this bug: 934827
No longer blocks: 889869
Duplicate of this bug: 889869
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard