Bug 850644 (Closed)
Opened 13 years ago · Closed 12 years ago
Create daily cron job to export sumo production db on (new) sumotools

Categories: Data & BI Services Team :: DB: MySQL, task
Tracking: (Not tracked)
Status: RESOLVED FIXED
People: Reporter: rrosario; Assigned: scabral
Whiteboard: [2013q2] April
Every so often, we (sumodevs) refresh our local development database with production data. Historically, we have grabbed a copy that is always available on the old sumotools (which is going away soon!).
As soon as we copy and import the database locally, we run an "anonymizer" script to remove email addresses, session data, etc. It would be great if our dump excluded this data from the start. This is our current anonymizer script:
https://github.com/mozilla/kitsune/blob/master/scripts/anonymize.sql
Comment 1•13 years ago (Assignee)
Making this a q2 goal, so it happens sooner rather than later.
In the meantime, if you need a production export, we can get you a full one just by copying a file, so do not hesitate to ask for that whenever you need it.
Whiteboard: [2013q2]
Comment 2•13 years ago
Sheeri, I tried my hand at this since I needed a db dump today. It works, but it's probably horrible; feel free to throw it away and do it properly.
Since there is over 4GB of unnecessary session data in the db, I decided to dump the schema first and then dump the data for all tables except the ones specified in the anonymize script. That reduced the dump time to 5 minutes and the zipped file size to ~800MB.
I couldn't find a way to execute simple SQL queries like the ones in the anonymize script against a dumped db file. At first glance it looks fairly simple to convert them to regexes, though I didn't attempt that.
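As a sketch of that regex idea (hypothetical; not something the bug actually implemented), a sed pass over the dump file could blank out email-looking strings. This is far cruder than running the SQL anonymizer after import, since it rewrites any email-shaped text it finds, not just the columns the SQL script targets:

```shell
# Hypothetical sketch: scrub anything that looks like an email address
# from a SQL dump file. Crude -- it can also rewrite addresses that
# appear in legitimate content, which is why the SQL anonymizer run
# after import is the safer approach.
sed -E 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/user@example.com/g' \
    support_mozilla_com.sql > support_mozilla_com.anon.sql
```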
I added the following script to the crontab, executing every day at 0:15, plus a purge of all dumps older than 30 days.
dump_support_mozilla.com.sh:
#structure only
mysqldump -u sumo -pfoxkeh --single-transaction --no-data support_mozilla_com > /data/db_dumps/support_mozilla_com/support_mozilla_com-$(date +%Y%m%d).sql
#data too, but ignoring some tables
mysqldump -u sumo -pfoxkeh --single-transaction support_mozilla_com --ignore-table=support_mozilla_com.tidings_watchfilter --ignore-table=support_mozilla_com.tidings_watch --ignore-table=support_mozilla_com.django_session --ignore-table=support_mozilla_com.messages_inboxmessage --ignore-table=support_mozilla_com.messages_outboxmessage_to --ignore-table=support_mozilla_com.messages_outboxmessage >> /data/db_dumps/support_mozilla_com/support_mozilla_com-$(date +%Y%m%d).sql
#zip
cd /data/db_dumps/support_mozilla_com/
tar -czf support_mozilla_com-$(date +%Y%m%d).sql.tar.gz support_mozilla_com-$(date +%Y%m%d).sql
#remove raw file
rm /data/db_dumps/support_mozilla_com/support_mozilla_com-$(date +%Y%m%d).sql
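The crontab entry and 30-day purge described above aren't shown in the bug, but they might look like this (a sketch; schedule, paths, and filenames beyond what the script above uses are assumptions):

```shell
# Hypothetical crontab entries (the comment describes but doesn't show them):
# run the dump script daily at 0:15, then purge archives older than 30 days.
#   15 0 * * * /root/dump_support_mozilla.com.sh
#   30 0 * * * find /data/db_dumps/support_mozilla_com -name '*.sql.tar.gz' -mtime +30 -delete

# The purge itself, runnable standalone: delete any dump archive whose
# modification time is more than 30 days old.
find /data/db_dumps/support_mozilla_com -name '*.sql.tar.gz' -mtime +30 -delete
```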
Comment 3•13 years ago (Assignee)
Oh, we already have the backups. We also have a way to ignore the tables you're ignoring, if that's what you'd like.
The problem is copying the file over, making sure
Whiteboard: [2013q2] → [2013q2] April
Comment 4•13 years ago (Assignee)
whoops....
...making sure the firewall is open, getting the timing right so first the export happens, then the copy happens, then the import happens. :D
Updated•13 years ago
Assignee: server-ops-database → scabral
Comment 5•13 years ago (Assignee)
Hrm, your list looks very similar to what we do in:
https://bugzilla.mozilla.org/show_bug.cgi?id=816972#c12
Can we just give you a backup of stage, then? (or can you use stage? not sure what purposes these things are used for)
Comment 6•13 years ago (Reporter)
(In reply to Sheeri Cabral [:sheeri] from comment #5)
> Can we just give you a backup of stage, then? (or can you use stage? not
> sure what purposes these things are used for)
No, because stage has lots of QA data and will be up to 30 days old. We like our data fresh :)
Comment 7•13 years ago (Assignee)
Right! Because that's only on the 1st of the month. OK, I know what I have to work with now.
Comment 8•12 years ago
Sheeri, since the production db is already replicated on sumotools, we don't need to do any copying from other machines. We just need a dump of the one on sumotools. I wanted to check whether everything is okay with the dumps the cron job is creating right now, but the server seems to be down :(
Comment 9•12 years ago (Assignee)
sumotools is up, what's the problem you're having?
Comment 10•12 years ago (Assignee)
Also - my apologies. I have been working on several bugs which call for an import of a production backup, which is why I was focusing on that. Yes, certainly we can do a daily mysqldump export of that stuff, that's much easier than what I was thinking.
Comment 11•12 years ago (Assignee)
I worked on this script yesterday, and it ran perfectly overnight:
[root@sumotools1.webapp.phx1 support_mozilla_com]# ls -rlth
total 1.8G
-rw-r--r-- 1 root root 884M Apr 25 13:29 support_mozilla_com.2013.04.25.sql.tar.gz
-rw-r--r-- 1 root root 98 Apr 26 01:52 export.log
-rw-r--r-- 1 root root 885M Apr 26 01:56 support_mozilla_com.2013.04.26.sql.tar.gz
The 4/25 dump is one I did manually; the 4/26 one was done via cron. I have set it to delete old backups after 7 days.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment 12•12 years ago (Assignee)
The exports are in:
/data/exports/support_mozilla_com
Updated•11 years ago
Product: mozilla.org → Data & BI Services Team