[amo] Upgrade to Python 2.6.4 or 2.6.5

RESOLVED FIXED

Status

--
blocker
RESOLVED FIXED
8 years ago
5 years ago

People

(Reporter: jbalogh, Assigned: fox2mike)

(Reporter)

Description

8 years ago
We'll have to create new virtualenvs when we upgrade the python version.

I have 2.6.4 locally.  I installed 2.6.5 on khan.  I cannot reproduce bug 560381 on either.

Khan has 2.6.2 system-wide.  I'm guessing (hoping) that preview has the same.

This may affect bug 557941 and bug 559085.

We want to get this installed way before 5.10 rolls out on 5/4.  It hurts our load testing when Python is crashing in the middle of a request.

I couldn't find anything in http://python.org/download/releases/2.6.5/NEWS.txt suggesting what the problem was.
(Reporter)

Updated

8 years ago
Blocks: 560381
(Reporter)

Updated

8 years ago
Blocks: 557941, 559085
Upping this to critical.  If preview is not on 2.6.2 feel free to reduce.
Severity: normal → critical
(Assignee)

Comment 2

8 years ago
Preview is on 2.6.2

[root@pm-app-amo24 ~]# rpm -qi python26
Name        : python26                     Relocations: (not relocatable)
Version     : 2.6.2                             Vendor: (none)
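
Where rpm is not the whole story (a virtualenv, or a source build that is not in the package database), the interpreter can report its own version. A minimal sketch: the helper name `meets_minimum` is hypothetical, and the 2.6.4 floor is an assumption taken from this bug's title rather than from any existing script.

```python
import sys


def meets_minimum(version_info, minimum=(2, 6, 4)):
    """Return True if the (major, minor, micro) prefix is at least `minimum`.

    Hypothetical helper; the default floor of 2.6.4 is an assumption based
    on this bug's title.
    """
    return tuple(version_info[:3]) >= tuple(minimum)


if __name__ == "__main__":
    # Report on whichever interpreter is running this file.
    print("%s ok=%s" % (sys.version.split()[0], meets_minimum(sys.version_info)))
```

Running the file with the virtualenv's own `python` shows which interpreter that environment is actually bound to.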
(Assignee)

Comment 3

8 years ago
Over to Jeremy, guess he built these the last time.
Assignee: server-ops → jeremy.orem+bugs
I don't think these page after they're assigned so upping this to blocker is mostly for show, but this is blocking finishing load testing, which is blocking our 5.9.1 push to next.amo, and blocking QA on preview with 500 errors.  This is important.
Severity: critical → blocker
Blocks: 559374
(Assignee)

Comment 5

8 years ago
Grabbing this back, I did the rpms the last time it seems, think I've managed to update them successfully.
Assignee: jeremy.orem+bugs → shyam
(Assignee)

Comment 6

8 years ago
Updated:
  python26.i386 0:2.6.5-geekymedia1.1.rhel5

Dependency Updated:
  python26-devel.i386 0:2.6.5-geekymedia1.1.rhel5
  python26-libs.i386 0:2.6.5-geekymedia1.1.rhel5
  python26-tools.i386 0:2.6.5-geekymedia1.1.rhel5
  tkinter26.i386 0:2.6.5-geekymedia1.1.rhel5

Complete!

[root@pm-app-amo24 ~]# python26
Python 2.6.5 (r265:79063, Apr 20 2010, 23:45:36) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-46)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

Preview is now on 2.6.5 and you should be unblocked.

I'm having some issues building the x86_64 rpm, looking into that before I close the bug.
Status: NEW → ASSIGNED
(Assignee)

Comment 7

8 years ago
Figured that out, thanks to Dave, and we now have x86_64 and i686 rpms for python-2.6.5.

Jeff, how do I rebuild the virtualenvs? Also, how did you install 2.6.5 on khan? from source? 

Leaving this open till the virtualenvs are rebuilt.
(In reply to comment #7)
> Jeff, how do I rebuild the virtualenvs? Also, how did you install 2.6.5 on
> khan? from source? 

2.6.5 isn't on khan yet; we used your RPMs for its current version and should do the same this time.  Please upgrade it too.


And thanks for your help!
(Assignee)

Comment 9

8 years ago
Done.

[root@khan ~]# python26 
Python 2.6.5 (r265:79063, Apr 20 2010, 23:45:36) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-46)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
Will you put the src rpm in dm-nagios01:/mnt/packages/mrepo-src/5Server-SRPMS/.
(Reporter)

Comment 11

8 years ago
To rebuild virtualenvs:

1. virtualenv --python=/path/to/python26 /path/to/new/virtualenv
2. Run the `pip install` command from the update script to get this started.
3. rm -rf /path/to/old/virtualenv
4. ln -s /path/to/new/virtualenv /path/to/old/virtualenv

Now we'll have a new virtualenv on the same path as the old one.  The site will be broken between steps 3 and 4, so any ideas on how to avoid that are welcome.  But it should be ok to take the site down for a couple of minutes when we run this on prod.
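
On the window between steps 3 and 4: one possible way to shrink it, sketched here in Python and not taken from the update script, is to create the new symlink under a temporary name and rename it over the live path. On POSIX, rename replaces the destination in a single step. The catch is that this only helps once the live path is already a symlink; the first switch away from a real directory still needs that directory moved out of the way. The name `swap_symlink` and the temporary-name suffix are made up for illustration.

```python
import os


def swap_symlink(new_target, link_path):
    """Repoint the symlink at link_path to new_target with one rename.

    Hypothetical helper. Assumes POSIX rename semantics and that link_path
    is already a symlink (or does not exist), not a real directory.
    """
    # Keep the temporary link next to the live one so the rename never
    # crosses a filesystem boundary.
    tmp_link = "%s.swap-%d" % (link_path, os.getpid())
    os.symlink(new_target, tmp_link)
    try:
        # rename() replaces an existing symlink without following it.
        os.rename(tmp_link, link_path)
    except OSError:
        # Leave the old virtualenv in place and clean up the temporary link.
        os.unlink(tmp_link)
        raise
```

With a layout like that, the old virtualenv directory can be removed after the swap instead of before it.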
(Assignee)

Comment 12

8 years ago
(In reply to comment #10)
> Will you put the src rpm in dm-nagios01:/mnt/packages/mrepo-src/5Server-SRPMS/.

Done.
This didn't fix the bug we were hoping it would fix.  Are you doing anything special when you are building the package?

If I use your new khan package I still get the crashes.

If I use a package of 2.6.5 built on khan (just a ./configure; make) I no longer get the crashes.
(Assignee)

Comment 14

8 years ago
Well, I honestly don't know. Left to my choice, I wouldn't run RHEL for bleeding edge stuff. But we need support on hardware and bleeding edge stuff, which is kind of like having the cake and eating it too. 

What is it crashing on? Maybe that'll give us a clue? I'm really shooting in the dark here.
Heh, get a load of this:

python: Modules/gcmodule.c:262: update_refs: Assertion `gc->gc.gc_refs != 0' failed.

The source for that has a huge scary comment above it.  oremj straced it and it was crashing all over the code, nowhere in particular.
(Assignee)

Comment 16

8 years ago
Sigh. :|

Jeremy, any ideas?

This rpm build of Python for RHEL is a hack in itself, since it allows multiple versions of Python to coexist. RHEL 6 beta ships Python 2.6.2 by default; not sure if that might be worth trying, but then we won't really take the beta to production.
Hudson started failing with this today, so now we can't run tests
(Assignee)

Comment 18

8 years ago
Failing with what? these random python crashes? What do you propose we do to solve this?

IMHO, this RHEL + Python is a hackery. It always was, always will be :)

If you want everything to be built from source, I'd defer to Jeremy and Dave, they were the ones for the rpm in the first place, they have their reasons and I'd agree with them if this rpm wasn't this much of a hack.
(In reply to comment #18)
> Failing with what? these random python crashes? What do you propose we do to
> solve this?
Yes, with the crash in comment 15.

> IMHO, this RHEL + Python is a hackery. It always was, always will be :)
> 
> If you want everything to be built from source, I'd defer to Jeremy and Dave,
> they were the ones for the rpm in the first place, they have their reasons and
> I'd agree with them if this rpm wasn't this much of a hack.

I don't know enough about packaging to comment here.  Building it from source on khan works.  Could we build it from source somewhere, mush that into a package and distribute it to our boxes?
(Reporter)

Comment 20

8 years ago
If you guys want to `./configure && make install` our own Python, I think this problem will go away.  I've only seen this bug with the RHEL package, and I couldn't reproduce it with the Python I built on khan.  This is really not something I want to dive into.

If it comes to that, I can reproduce our bug at will, but the current Python package is a black box.

Seeing the patches that are being applied to Python might help.

A debug build of Python installed with this package should let us look at what's going on inside Python when this crashes.

Someone filed a Red Hat bug about a similar problem: https://bugzilla.redhat.com/show_bug.cgi?id=573156
(Assignee)

Comment 21

8 years ago
(In reply to comment #20)
 
> Seeing the patches that are being applied to Python might help.

How can I get these to you? homedir on khan? 
 
> A debug build of Python installed with this package should let us look at
> what's going on inside Python when this crashes.

Updated:
  python26-debuginfo.i386 0:2.6.5-geekymedia1.1.rhel5

This is on preview. Let me know if that helps you find out what's going on.
(In reply to comment #13)
> If I use a package of 2.6.5 built on khan (just a ./configure; make) I no
> longer get the crashes.

So I just tried this, and it complains that autoconf is too old.  Which is probably why the rpm has a patch changing the autoconf requirement.  Did you install a newer autoconf on khan or something?
(In reply to comment #22)
> (In reply to comment #13)
> > If I use a package of 2.6.5 built on khan (just a ./configure; make) I no
> > longer get the crashes.
> 
> So I just tried this, and it complains that autoconf is too old.  Which is
> probably why the rpm has a patch changing the autoconf requirement.  Did you
> install a newer autoconf on khan or something?

nevermind, the answer to that is don't run autoconf first. :)
ok, I rebuilt the rpm with the majority of the RH patches removed, and installed on khan.  Care to give that a try?
(In reply to comment #24)
> ok, I rebuilt the rpm with the majority of the RH patches removed, and
> installed on khan.  Care to give that a try?

Is this /usr/bin/python26 ?
(Reporter)

Comment 26

8 years ago
(In reply to comment #21)
> (In reply to comment #20)
> 
> > Seeing the patches that are being applied to Python might help.
> 
> How can I get these to you? homedir on khan? 

Sure.  Just let me know where you drop it on khan and I'll find it.
(In reply to comment #25)
> (In reply to comment #24)
> > ok, I rebuilt the rpm with the majority of the RH patches removed, and
> > installed on khan.  Care to give that a try?
> 
> Is this /usr/bin/python26 ?

/usr/bin/python26 was touched last night so I think that's the one.  I made a new virtualenv using it and still have the same crashes.
(Assignee)

Comment 28

8 years ago
(In reply to comment #26)

> Sure.  Just let me know where you drop it on khan and I'll find it.

Kind of moot, since dave rebuilt everything without most of the patches? Dave, what were the patches you left in?
(Reporter)

Comment 29

8 years ago
Thanks for the help guys.  I figured out what was going on in bug 560381 comment 2.  Having the rpm sources to build and edit Python did the trick.

I don't know if the crash is happening due to something that Red Hat is changing, but right now I'm content with working around the problem in our code.  Eventually I'll work out a smaller test case and figure out if this belongs in the Red Hat or Python bug tracker.

We can use whatever Python 2.6.x version you're comfortable with, though I tested and fixed the bug on justdave's new 2.6.5 sources.
Status: ASSIGNED → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED
Excellent. What was the fix?
(Reporter)

Comment 31

8 years ago
<oremj> jbalogh: what was the fix for the python crash?
<jbalogh> an exception was being raised inside a lambda
<jbalogh> and the traceback for the exception wasn't getting refcounted properly
<jbalogh> so the fix is to avoid raising that exception
<oremj> interesting
<jbalogh> we're going to monkeypatch jinja to get around it
<oremj> so you can reproduce reliably by raising an exception in a lambda?
<jbalogh> I haven't tried that yet
<oremj> or is it still somewhat random?
<jbalogh> http://pastie.org/936661
<jbalogh> that one does fine
<jbalogh> it's something more intricate, I guess
<jbalogh> switching the bad jinja code to use def __getitem__ instead of __getitem__ = lambda: fixed the problem though
<jbalogh> but it only happened when interacting with cache-machine
<jbalogh> so I'm still a bit stumped
<oremj> very strange
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
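
For readers who have not seen the pattern, the change jbalogh describes in comment 31 (replacing a lambda-assigned `__getitem__` with a `def`) has roughly this shape. This is an illustration only: the class names and the dict-backed lookup are invented here and are not Jinja's code. As with the pastie in that comment, a small case like this is not expected to reproduce the crash.

```python
class LambdaLookup(object):
    """Hypothetical 'before' shape: __getitem__ assigned from a lambda."""

    def __init__(self, data):
        self._data = data

    # A missing key raises KeyError from inside the lambda's frame.
    __getitem__ = lambda self, key: self._data[key]


class DefLookup(object):
    """Hypothetical 'after' shape: the same lookup as a regular method."""

    def __init__(self, data):
        self._data = data

    def __getitem__(self, key):
        # Same behaviour; the exception now comes from a named function.
        return self._data[key]
```

Both classes behave the same from the caller's side, which fits the remark in the log that the trigger was "something more intricate" than a lambda raising an exception.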