Closed Bug 620115 Opened 14 years ago Closed 14 years ago

No SeaMonkey nightly builds generated any more due to sqlite/python threading issue.

Categories

(Release Engineering :: General, defect)

defect
Not set
major

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: rsx11m.pub, Assigned: dustin)

References

Details

Attachments

(3 files, 1 obsolete file)

The last regular nightly branch builds were 20101215 for Mac and Linux, the last one for Windows 20101216. This seems to be unrelated to the buildbot 0.8 upgrade time frame, but coincides with Thunderbird switching off their 3.0.x tinderboxes on December 15. There was a push on mozilla-1.9.1, triggering the build/test boxes, but there are no corresponding 20101218 nightly builds and no indication that they were actually started.
It coincides with turning on the new nightly build code for SeaMonkey, which uses the same build ID and revision for nightlies of all platforms. I'm not sure yet why we don't build any nightlies at all right now with this, but it may be connected with the fact that we don't have succeeding normal builds on all platforms right now.
This caught my attention because there was no log for any nightly at all. If the build didn't succeed, I wouldn't expect a nightly, but I would still expect a build log. :-\
Depends on: 619917
Hrm, the problem actually seems to lie deeper (we now seem to see it on trunk as well) and go into threading problems with SQLite:
2010-12-19 00:00:01-0800 [-] lastGoodRev: took 0.62 seconds to run; returned None
2010-12-19 00:00:01-0800 [-] Unhandled error in Deferred:
2010-12-19 00:00:01-0800 [-] Unhandled Error
Traceback (most recent call last):
  File "/tools/buildbot-0.8.0/lib/python2.6/site-packages/Twisted-9.0.0-py2.6-linux-i686.egg/twisted/enterprise/adbapi.py", line 429, in _runInteraction
    result = interaction(trans, *args, **kw)
  File "/tools/buildbot-0.8.0/lib/python2.6/site-packages/buildbot-0.8.2_hg_161d0a80925c_default-py2.6.egg/buildbot/schedulers/timed.py", line 249, in _check_timer
    self._maybe_start_build(t)
  File "/tools/buildbot-0.8.0/lib/python2.6/site-packages/buildbot-0.8.2_hg_161d0a80925c_default-py2.6.egg/buildbot/schedulers/timed.py", line 277, in _maybe_start_build
    self.start_HEAD_build(t)
  File "/tools/buildbotcustom/buildbotcustom/scheduler.py", line 65, in start_HEAD_build
    d = defer.maybeDeferred(self.ssFunc, self, t)
--- <exception caught here> ---
  File "/tools/buildbot-0.8.0/lib/python2.6/site-packages/Twisted-9.0.0-py2.6-linux-i686.egg/twisted/internet/defer.py", line 102, in maybeDeferred
    result = f(*args, **kw)
  File "/tools/buildbotcustom/buildbotcustom/misc_scheduler.py", line 249, in ssFunc
    rev = lastChangeset(db, branch)
  File "/tools/buildbotcustom/buildbotcustom/misc_scheduler.py", line 111, in lastChangeset
    for c in db.changeEventGenerator(branches=[branch]):
  File "/tools/buildbot-0.8.0/lib/python2.6/site-packages/buildbot-0.8.2_hg_161d0a80925c_default-py2.6.egg/buildbot/db/connector.py", line 364, in changeEventGenerator
    rows = self.runQueryNow(q, tuple(args))
  File "/tools/buildbot-0.8.0/lib/python2.6/site-packages/buildbot-0.8.2_hg_161d0a80925c_default-py2.6.egg/buildbot/db/connector.py", line 182, in runQueryNow
    return self.runInteractionNow(self._runQuery, *args, **kwargs)
  File "/tools/buildbot-0.8.0/lib/python2.6/site-packages/buildbot-0.8.2_hg_161d0a80925c_default-py2.6.egg/buildbot/db/connector.py", line 212, in runInteractionNow
    return self._runInteractionNow(interaction, *args, **kwargs)
  File "/tools/buildbot-0.8.0/lib/python2.6/site-packages/buildbot-0.8.2_hg_161d0a80925c_default-py2.6.egg/buildbot/db/connector.py", line 235, in _runInteractionNow
    c = conn.cursor()
  File "/tools/buildbot-0.8.0/lib/python2.6/site-packages/buildbot-0.8.2_hg_161d0a80925c_default-py2.6.egg/buildbot/db/dbspec.py", line 131, in cursor
    return RetryingCursor(self.dbapi, self.conn.cursor())
sqlite3.ProgrammingError: SQLite objects created in a thread can only be used in that same thread.The object was created in thread id -1208527168 and this is thread id -1212355696
Blocks: 619072
I'm a bit at a loss here: I have no clue how to fix this, nor how to switch us to a different DB than sqlite to perhaps work around it. Chris or Ben, can you help me with either of those two sides here?
Blocks: 620606
No longer blocks: 620606
Severity: major → blocker
Summary: No SeaMonkey 2.0.12pre nightly builds since December 15/16 → No SeaMonkey nightly builds generated any more due to sqlite/python threading issue.
Version: SeaMonkey 2.0 Branch → unspecified
It looks like something is going wrong in how the custom schedulers are written. There's a lot of trickiness there to make sure that appropriate transactions run in appropriate threads. In particular, start_HEAD_build is called in a thread with a transaction object, so using maybeDeferred there is a mistake. I think the error here is calling db.changeEventGenerator, which assumes it's run in the main thread and thus uses the main thread's sqlite connection. We haven't seen this in RelEng production because MySQL does not have this one-object-per-thread limitation. I'll see if I can work up a patch to the custom nightly stuff.
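For readers unfamiliar with the restriction described here, the following is a minimal sketch of the sqlite3 behaviour only, not the buildbot code. It assumes the standard-library sqlite3 module with its default check_same_thread=True; the function name use_connection_in_other_thread is made up for illustration. A connection created in one thread refuses to hand out a cursor in another, which is the same ProgrammingError seen in the traceback.

```python
import sqlite3
import threading


def use_connection_in_other_thread():
    """Create a connection here, then try to use it from a worker thread.

    Hypothetical helper for illustration; returns the exception raised in
    the worker, or None if no exception was raised.
    """
    # check_same_thread defaults to True, so the connection is bound to
    # the thread that created it.
    conn = sqlite3.connect(":memory:")
    outcome = {"error": None}

    def worker():
        try:
            conn.cursor()  # used from a different thread than the creator
        except sqlite3.ProgrammingError as exc:
            outcome["error"] = exc

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    conn.close()
    return outcome["error"]


error = use_connection_in_other_thread()
print(type(error).__name__)
```

A driver without this per-thread check would not raise here, which is consistent with the observation that the problem did not show up on MySQL.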
Assignee: nobody → dustin
Attached patch m620115-buildbotcustom-r1.patch (obsolete)
This patch attempts to thread the 't' parameter (the transaction) through all of the methods that start_HEAD_build invokes. The sticking point was changeEventGenerator (which is going to die in 0.8.4, btw), so I just copied that function into buildbotcustom and modified it. I don't have tuxedo on my system to be able to test this, and in general I'm too new to this stuff to have a quick way to stage changes like this, so I've flagged it for feedback rather than review. I can try (later) to stage it in the releng environment, but since things work in our environment *without* this patch, I'm not sure that would be conclusive. Let me know what you think.
Attachment #499042 - Flags: feedback?
Dustin, thanks a lot for looking into this. The patch has been locally applied to the SeaMonkey buildmaster, checkconfig liked it, let's see what the next nightly cycle tells us.
Severity: blocker → major
Version: unspecified → SeaMonkey 2.0 Branch
Dustin, it looks like there's still a problem in there:
2010-12-22 00:00:53-0800 [-] lastGoodRev: took 0.04 seconds to run; returned None
2010-12-22 00:00:53-0800 [-] Unhandled Error
Traceback (most recent call last):
  File "/tools/python-2.6.5/lib/python2.6/threading.py", line 504, in __bootstrap
    self.__bootstrap_inner()
  File "/tools/python-2.6.5/lib/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/tools/python-2.6.5/lib/python2.6/threading.py", line 484, in run
    self.__target(*self.__args, **self.__kwargs)
--- <exception caught here> ---
  File "/tools/buildbotcustom/buildbotcustom/scheduler.py", line 69, in start_HEAD_build
    if ss is None:
  File "/tools/buildbotcustom/buildbotcustom/misc_scheduler.py", line 277, in ssFunc
  File "/tools/buildbotcustom/buildbotcustom/misc_scheduler.py", line 138, in lastChangeset
    sourcestamps.revision IS NOT NULL AND
exceptions.NameError: global name 'changeEventGeneratorWithTransaction' is not defined
Looks to me like just a typo, as you actually named it changeEventGeneratorInTransaction - am I right in that should fix this?
(In reply to comment #9)
> Looks to me like just a typo, as you actually named it
> changeEventGeneratorInTransaction - am I right in that should fix this?
I guess it somehow kept retrying, because I saw that error repeated, and when I fixed that line and did a reconfig, it apparently triggered nightlies correctly.
I figured there was a typo lurking there somewhere. I'd like to get catlee's review, too, before committing this, since he wrote the original implementation. Do you mind letting this "bake" until he's back on Jan 4?
Attachment #499042 - Attachment is obsolete: true
Attachment #499293 - Flags: superreview?
Attachment #499293 - Flags: review?(catlee)
Attachment #499042 - Flags: feedback?
Comment on attachment 499293 m620115-buildbotcustom-r2.patch [applied SM]
Ben, if you have time and can take a look before catlee's back, that would be great, too.
Attachment #499293 - Flags: superreview? → review?(bhearsum)
(In reply to comment #8)
> Dustin, thanks a lot for looking into this. The patch has been locally applied
> to the SeaMonkey buildmaster, checkconfig liked it, let's see what the next
> nightly cycle tells us.
Nightly builds have appeared today for all platforms in both the "latest-comm-central-trunk" (SeaMonkey 2.1b2pre) and "latest-comm-1.9.1" (SeaMonkey 2.0.12pre) ftp directories.
oh, and also in the corresponding l10n directories, for all languages except comm-1.9.1 Polish Linux i686 and comm-central Win32. These exceptions don't affect the langpack.xpi addons, which were generated on some other platform.
(In reply to comment #11)
> Do you mind letting this "bake" until he's back
> on Jan 4?
Sure, no problem with that, I'm happy we have that patch running there - again, thanks a ton for that!
Comment on attachment 499293 m620115-buildbotcustom-r2.patch [applied SM]
>diff --git a/scheduler.py b/scheduler.py
>--- a/scheduler.py
>+++ b/scheduler.py
>@@ -58,23 +58,25 @@ class SpecificNightly(Nightly):
>         Nightly.__init__(self, *args, **kwargs)
>
>     def start_HEAD_build(self, t):
>-        """Slightly mis-named, but this function is called when it's time to start a build. We call our ssFunc to get a sourcestamp to build.
>+        """
>+        Slightly mis-named, but this function is called when it's time to start
>+        a build. We call our ssFunc to get a sourcestamp to build.
>
>-        ssFunc is called in a thread with an active database transaction running.
>+        ssFunc is called in a thread with an active database transaction
>+        running. It cannot use Deferreds, nor any db.*Now methods.
>         """
>+        #### NOTE: called in a thread!
>+        ss = self.ssFunc(self, t)
>         d = defer.maybeDeferred(self.ssFunc, self, t)
I think you want to delete this line now, right? Looks good otherwise.
Attachment #499293 - Flags: review?(catlee) → review+
In particular, you mean this line, right?
> d = defer.maybeDeferred(self.ssFunc, self, t)
(In reply to comment #17)
> In particular, you mean this line, right?
>
> > d = defer.maybeDeferred(self.ssFunc, self, t)
yes
landed in build/buildbotcustom branch 'default': 6d5e6ca97a95
Attachment #499293 - Attachment is obsolete: true
Attachment #499293 - Flags: review?(bhearsum)
moving this into the firefox releng product so we can flag it for deployment
Product: SeaMonkey → mozilla.org
Version: SeaMonkey 2.0 Branch → other
No longer blocks: 619072
Needs a buildmaster restart. This isn't blocking anything, so this can happen during the next otherwise-scheduled downtime.
Flags: needs-treeclosure?
Well, this got landed as a ridealong with other changes, but unfortunately:
130     q += " ORDER BY changeid DESC"
131     rows = t.execute(q, tuple(args))
132     for (changeid,) in rows:
133         yield dbconn._txn_getChangeNumberedNow(t, changeid)
t.execute does not return a list of rows.
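As a hedged illustration of the point about t.execute (a sketch using the standard-library sqlite3 module, with a made-up table and a hypothetical change_ids helper, not the buildbotcustom code): the DB-API leaves the return value of execute undefined, so the portable pattern is to call execute and then fetch the rows from the cursor. With sqlite3 specifically, execute happens to return the cursor itself, which is iterable; other drivers are not required to do the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
t = conn.cursor()
t.execute("CREATE TABLE changes (changeid INTEGER PRIMARY KEY, branch TEXT)")
t.executemany(
    "INSERT INTO changes (branch) VALUES (?)",
    [("comm-central",), ("comm-1.9.1",), ("comm-central",)],
)


def change_ids(t, branch):
    """Hypothetical helper: run the query, then fetch rows from the cursor."""
    q = "SELECT changeid FROM changes WHERE branch = ?"
    q += " ORDER BY changeid DESC"
    t.execute(q, (branch,))  # do not rely on the return value
    return [changeid for (changeid,) in t.fetchall()]


ids = change_ids(t, "comm-central")

# sqlite3-specific behaviour: execute() returns the same cursor object.
returned = t.execute("SELECT 1")
print(ids, returned is t)
```

Fetching through `t.fetchall()` keeps the loop independent of whatever a given driver's execute returns.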
Flags: needs-treeclosure?
untested fix
Attachment #502402 - Flags: review?(nrthomas)
Comment on attachment 502402 m620115b-buildbotcustom-r1.patch
Sorry, you need a catlee-like object for this review.
Attachment #502402 - Flags: review?(nrthomas)
apparently the world of staging masters has moved on without me since I last staged a change, as I can't figure out how to run this in staging. I'm getting
Traceback (most recent call last):
  File "/tools/buildbot-0.8.2/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-i686.egg/twisted/scripts/_twistd_unix.py", line 317, in startApplication
    app.startApplication(application, not self.config['no_save'])
  File "/tools/buildbot-0.8.2/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-i686.egg/twisted/application/app.py", line 648, in startApplication
    service.IService(application).startService()
  File "/tools/buildbot-0.8.2/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-i686.egg/twisted/application/service.py", line 278, in startService
    service.startService()
  File "/tools/buildbot-0.8.2/lib/python2.6/site-packages/buildbot-0.8.2_hg_a63f22816750_production_0.8-py2.6.egg/buildbot/master.py", line 567, in startService
    self.loadTheConfigFile()
--- <exception caught here> ---
  File "/tools/buildbot-0.8.2/lib/python2.6/site-packages/buildbot-0.8.2_hg_a63f22816750_production_0.8-py2.6.egg/buildbot/master.py", line 600, in loadTheConfigFile
    d = self.loadConfig(f)
  File "/tools/buildbot-0.8.2/lib/python2.6/site-packages/buildbot-0.8.2_hg_a63f22816750_production_0.8-py2.6.egg/buildbot/master.py", line 624, in loadConfig
    exec f in localDict
  File "/builds/buildbot/builder_master1/master.cfg", line 46, in <module>
    branchObjects = generateBranchObjects(BRANCHES[branch], branch)
  File "/tools/buildbotcustom/buildbotcustom/misc.py", line 994, in generateBranchObjects
    **extra_args
  File "/tools/buildbotcustom/buildbotcustom/process/factory.py", line 1771, in __init__
    MercurialBuildFactory.__init__(self, **kwargs)
  File "/tools/buildbotcustom/buildbotcustom/process/factory.py", line 621, in __init__
    assert self.platform in getSupportedPlatforms()
exceptions.AssertionError:
from sm01, which as far as I can tell has nothing to do with my change.
Comment on attachment 502402 m620115b-buildbotcustom-r1.patch
Ignore this patch - re-uploaded as bug 624298 attachment 502484. Would that I could mark it obsolete without uploading a new patch..
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → INCOMPLETE
This should be resolved either as FIXED by another bug or WFM if the reason for filing it disappeared, but not as INCOMPLETE as in fact the bug was complete and acknowledged as it was filed.
Resolution: INCOMPLETE → FIXED
Thanks, from SeaMonkey's point of view this was apparently already fixed per comment #12 after applying the initial patch to its local build system.
I think it works because this particular, incorrect syntax happens to work on sqlite.
Comment on attachment 499293 m620115-buildbotcustom-r2.patch [applied SM]
Let's unobsolete this patch then, as it appears to be the only one which was actually checked in somewhere (though just locally for SeaMonkey). Otherwise it's a bit unclear what was done in this bug.
Attachment #499293 - Attachment description: m620115-buildbotcustom-r2.patch → m620115-buildbotcustom-r2.patch [applied SM]
Attachment #499293 - Attachment is obsolete: false
FYI, since today (i.e. starting from tomorrow's nightlies), the SeaMonkey buildmaster has been updated to buildbotcustom default branch without changes, so now runs what has been checked into the official repo.
(In reply to comment #31)
> FYI, since today (i.e. starting from tomorrow's nightlies), the SeaMonkey
> buildmaster has been updated to buildbotcustom default branch without changes,
> so now runs what has been checked into the official repo.
...and the SeaMonkey nightlies are still appearing dutifully on schedule, on both comm-central and comm-1.9.1. I think baking for a week is long enough => VERIFIED on behalf of SeaMonkey QA.
Status: RESOLVED → VERIFIED
Product: mozilla.org → Release Engineering