Closed Bug 652191 Opened 14 years ago Closed 14 years ago

Clobberer on MySQL

Categories

(Release Engineering :: General, defect, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

Details

(Whiteboard: [deploying june 22 @ 8am pdt])

Attachments

(1 file)

We need a new clobberer:
- a real database backend (MySQL would be OK too)
- more friendly UI that doesn't require loading a huge page
- more efficient implementation that optimizes number and length of queries and doesn't lock database tables for too long
- deployed in a maintainable way (a la slavealloc)
I hate PHP with a burning passion, but aside from that I don't see a good reason not to use PHP.
By the way, I figure this is a week's work or so, if devoted to it. It will need some additional resources from IT (a database). This is architecturally very similar to slavealloc: a web service that slaves hit, plus a UI. I don't like using build.mozilla.org for production web services, but I think that's OK for now. The other option is to set up a vhost that the slaves hit directly, and proxy to that vhost from build.mozilla.org. Zandr, any other thoughts on how to properly deploy this?
Feels like it should live on its own VM that we can move to production by clone or fiat when it's ready.
OK, let's prioritize this on Monday and see if we want to make it happen.
Philor: would you be interested in helping with the frontend, since you use it quite often? We're not so good at frontends, although I'm sure we've hidden that weakness well!
Designing things from the ground up isn't really my skill-set - pointing at broken things and saying "that there, right there, that there's broken!" is :) One thing that might let us do Clobberer 1.01 without dropping sqlite is WAL - http://www.sqlite.org/draft/wal.html apparently did wonders for Fx's problems with reads and writes blocking each other, so if our problem is really that while you're loading the web page and reading the whole database, at the same time a hundred slaves are all yelling at once, trying to read big chunks of it and write single lines to it, it might help here too. Would also help if we had a clobberer-staging, or a test file that spawns off a couple dozen processes that try to read and write while you're trying to load the web page: I've always thought my excuse for not working on fixing it was that I didn't have a copy of the db and didn't want to bother creating it by hand, not that even if I did, without the slaves I wouldn't be seeing any problems.
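A rough sketch of the kind of test harness described above, assuming Python's bundled sqlite3 module (which needs SQLite 3.7.0 or later for WAL). The table name demo_checkins and the functions enable_wal and stress are hypothetical stand-ins, not part of clobberer: several writer threads play the slaves writing single rows while one reader plays the web page reading the whole table.

```python
import sqlite3
import threading


def enable_wal(db_path):
    """Ask SQLite to use write-ahead logging; return the journal mode in effect."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    finally:
        conn.close()


def stress(db_path, writers=8, writes_each=25):
    """Run concurrent writers plus one reader; return (row_count, errors)."""
    setup = sqlite3.connect(db_path)
    with setup:
        # Hypothetical table standing in for the real clobberer schema.
        setup.execute(
            "CREATE TABLE IF NOT EXISTS demo_checkins (slave TEXT, seq INTEGER)"
        )
    setup.close()

    errors = []
    done = threading.Event()

    def writer(name):
        # Each simulated slave uses its own connection and writes single rows.
        try:
            conn = sqlite3.connect(db_path, timeout=30)
            try:
                for seq in range(writes_each):
                    with conn:  # commits after each insert
                        conn.execute(
                            "INSERT INTO demo_checkins VALUES (?, ?)", (name, seq)
                        )
            finally:
                conn.close()
        except sqlite3.Error as exc:
            errors.append(exc)

    def reader():
        # Stands in for the web page repeatedly reading the whole table.
        try:
            conn = sqlite3.connect(db_path, timeout=30)
            try:
                while not done.is_set():
                    conn.execute("SELECT * FROM demo_checkins").fetchall()
            finally:
                conn.close()
        except sqlite3.Error as exc:
            errors.append(exc)

    threads = [
        threading.Thread(target=writer, args=("slave%d" % i,))
        for i in range(writers)
    ]
    read_thread = threading.Thread(target=reader)
    read_thread.start()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    done.set()
    read_thread.join()

    conn = sqlite3.connect(db_path)
    try:
        count = conn.execute("SELECT COUNT(*) FROM demo_checkins").fetchone()[0]
    finally:
        conn.close()
    return count, errors
```

Running stress() once with and once without enable_wal() on a scratch database would give a first impression of whether reads and writes block each other, though threads on one machine are only an approximation of a hundred real slaves.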
I'd be happy to answer questions regarding SQLite usage. Additionally, we have a support contract with SQLite that we don't use that much, so I can put you in touch with a developer if need be.
Marking as major to take care of short-term solutions. Once that's in place, I'll reduce importance and work on a longer-term solution.
Assignee: nobody → dustin
Severity: normal → major
Priority: -- → P2
WAL mode is not a possibility:

  [dmitchell@dm-wwwbuild01 db]$ sqlite3 -version
  3.3.6

while WAL mode requires 3.7.0 (http://www.sqlite.org/draft/wal.html).

In our meeting, catlee suggested migrating the clobberer db to MySQL, which I think is the better solution anyway. I'll request a database (well, two) now, and start moving on that as soon as it's set up.
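The version comparison behind that conclusion can be sketched as follows. The helper names parse_version and supports_wal are hypothetical; the only facts taken from the thread are the installed version (3.3.6) and the 3.7.0 minimum for WAL.

```python
import sqlite3

# Minimum SQLite release with WAL support, per the page linked above.
WAL_MIN_VERSION = (3, 7, 0)


def parse_version(text):
    """Turn '3.3.6' (or '3.x.y <date> <hash>') into a tuple of three ints."""
    first_token = text.split()[0]
    return tuple(int(part) for part in first_token.split(".")[:3])


def supports_wal(version_text):
    """Return True if the given SQLite version is new enough for WAL mode."""
    return parse_version(version_text) >= WAL_MIN_VERSION


if __name__ == "__main__":
    # Checks the SQLite library linked into this Python, not the sqlite3 CLI.
    print(sqlite3.sqlite_version, supports_wal(sqlite3.sqlite_version))
```

Comparing tuples of integers avoids the string-comparison trap where "3.10.0" would sort before "3.7.0".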
(In reply to comment #8)
> WAL mode is not a possibility:
>
> [dmitchell@dm-wwwbuild01 db]$ sqlite3 -version
> 3.3.6
>
> while WAL mode requires 3.7.0 (http://www.sqlite.org/draft/wal.html).

I presume that means that upgrading SQLite is not an option? That's sad since a lot of query plan optimizations have made it in since 3.3.6 (and SQLite is generally much faster).
Database setup requested in bug 652691.
It's a shared host so who knows what other mayhem would be wrought by upgrading sqlite. At any rate, IMHO sqlite isn't an appropriate choice for a production application like this.
MySQL access is almost complete (bug 652691 is with netops), and MySQL will soon be installed on dm-wwwbuild01 (bug 663279).
This is running in staging-clobberer, against the MySQL database. It ain't fast, but hopefully it won't exhibit the same blocking behavior, either?
Attachment #539433 - Flags: review?(bhearsum)
Comment on attachment 539433 [details] [diff] [review] m652191-tools-p1-r1.patch Looks good to me, and I was successfully using staging-clobberer after you put it on MySQL yesterday.
Attachment #539433 - Flags: review?(bhearsum) → review+
What is the process to upgrade the production clobberer? Treeclosure?
How long will it be out for? The MozillaClobberer step is flunkOnFailure=False, so if the outage will be quick, I say just go for it.
Good, then. I'll mark for needs-treeclosure in case we do a downtime next week, but I'll reserve the right to Just Do It if that doesn't pan out.
Flags: needs-treeclosure?
The long part of the process here is the copy from sqlite to mysql. I'll script up that copy and dry-run it a few times to get an idea of the time. Assuming the time is <10m, I'll then plan a downtime window, and during that window disable index.php, do the copy, and enable the new index.php. I'll probably just announce that window here and in #build, and make sure jhford (buildduty) is aware, rather than making a newsgroups post, unless someone suggests otherwise. The backout procedure is simple (replace index.php with the old version).
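The swap-and-backout plan above could be sketched roughly like this. It is an illustration only: the thread does not say how index.php is disabled, so this assumes it is moved aside, and the function name deploy, the backup name index.php.old, and the migrate callable are all hypothetical.

```python
import os
import shutil


def deploy(webroot, new_index, migrate):
    """Move the old index.php aside, run the migration, install the new page.

    If the migration or the install fails, the old index.php is restored,
    which mirrors the simple backout described above.
    """
    live = os.path.join(webroot, "index.php")
    backup = os.path.join(webroot, "index.php.old")  # hypothetical backup name

    # "Disable" the old page (assumed here to mean moving it out of the way).
    os.replace(live, backup)
    try:
        migrate()  # e.g. the sqlite-to-MySQL copy
        shutil.copyfile(new_index, live)
    except Exception:
        # Backout: put the old version back in place, then re-raise.
        os.replace(backup, live)
        raise
```

Keeping the old file around after a successful run leaves the manual backout (copying index.php.old back) available later as well.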
14 seconds. That ain't bad!

[dmitchell@dm-wwwbuild01 ~]$ time ./clobberer-migrate.sh
dumping
loading
Enter password:
COUNT(*)
3995
COUNT(*)
9050

real    0m14.833s
user    0m0.395s
sys     0m0.306s
[dmitchell@dm-wwwbuild01 ~]$ cat clobberer-migrate.sh
#! /bin/sh

SQLITE_DB=/<redacted>/clobberer.db
MYSQL_HOST=<redacted>
MYSQL_DB=<redacted>
MYSQL_USER=<redacted>

cd /tmp

echo "dumping"
sqlite3 "$SQLITE_DB" .dump \
    | grep -v TRANSACTION \
    | grep -v sqlite \
    | sed -e's/"//g' \
    | sed -e 's/AUTOINCREMENT/AUTO_INCREMENT/g' > clobberer_dump.sql

echo "loading"
(
    echo 'DROP TABLE if exists clobber_times;'
    echo 'DROP TABLE if exists builds;'
    cat clobberer_dump.sql
    echo 'SELECT COUNT(*) from clobber_times;'
    echo 'SELECT COUNT(*) from builds;'
) | mysql -u "$MYSQL_USER" -h "$MYSQL_HOST" -p -D "$MYSQL_DB"

I'll keep with the plan of riding along on a downtime next week, since it's happening anyway.
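To make the effect of the grep/sed filters in that script concrete, here is a small Python re-statement of the same four steps applied to a made-up dump. The function name sqlite_dump_to_mysql and the sample columns are hypothetical; only the filter rules themselves come from the script.

```python
def sqlite_dump_to_mysql(lines):
    """Apply the same text filters as the grep/sed pipeline in the script."""
    converted = []
    for line in lines:
        # grep -v TRANSACTION / grep -v sqlite: drop matching lines entirely.
        if "TRANSACTION" in line or "sqlite" in line:
            continue
        # sed 's/"//g': strip every double quote (identifier quoting).
        line = line.replace('"', "")
        # sed 's/AUTOINCREMENT/AUTO_INCREMENT/g': MySQL spelling of the keyword.
        line = line.replace("AUTOINCREMENT", "AUTO_INCREMENT")
        converted.append(line)
    return converted


if __name__ == "__main__":
    # Made-up dump lines; the real table columns are not shown in this bug.
    sample = [
        "BEGIN TRANSACTION;",
        'CREATE TABLE "builds" (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);',
        "INSERT INTO \"builds\" VALUES(1,'linux');",
        "DELETE FROM sqlite_sequence;",
        "COMMIT;",
    ]
    for out_line in sqlite_dump_to_mysql(sample):
        print(out_line)
```

One caveat that follows from the rules themselves: the filters work on whole lines of text, so a data row that happens to contain a double quote or the word "sqlite" would be altered or dropped too. Comparing row counts before and after, as the script's COUNT(*) queries allow, is a cheap check against that.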
Summary: Clobberer 2.0 → Clobberer on MySQL
Scheduled for tomorrow:
----
We will be closing all Firefox trees on Wednesday June 22, 2011 at 8:00 AM PDT for three hours for maintenance. The work items are:
- Bug 662071 - Changes to the SCL <-> SJC link to improve networking between Releng machines
- Bug 652191 - Convert clobberer to use MySQL instead of Sqlite as a data store. This should help make the clobberer less prone to failure.
If you know of a reason why this tree closure should not occur at the time above please let me know as soon as possible. We will confirm that all releases are in good shape before commencing this work. While we aim to minimize interruption, there is a possibility that builds may be interrupted. We will attempt to retrigger failing builds as soon as possible.
Thanks,
John Ford
Flags: needs-treeclosure? → needs-treeclosure+
Whiteboard: [deploying june 22 @ 8am pdt]
Comment on attachment 539433 [details] [diff] [review] m652191-tools-p1-r1.patch Landed and deployed using the migration script above. Clobberer seems to be loading just fine again.
I successfully clobbered a few builds. Assuming the slaves are OK with it, things look good!
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering