Closed
Bug 700855
Opened 13 years ago
Closed 13 years ago
[database] hack created cruft Action data in the live db, can we clean that up?
Categories
(Webtools Graveyard :: Elmo, defect)
Webtools Graveyard
Elmo
Tracking
(Not tracked)
RESOLVED
FIXED
2.1
People
(Reporter: Pike, Assigned: Pike)
Details
Attachments
(3 files)
6.83 KB,
text/x-python-script
|
Details | |
4.13 KB,
patch
|
peterbe
:
review-
|
Details | Diff | Splinter Review |
998 bytes,
patch
|
peterbe
:
review+
|
Details | Diff | Splinter Review |
My local (unreviewed) code to migrate from one release to the next is a piece of shite. Notably, the output looks like:
migrating 46 40
action count 736
migrating 37 14
action count 918
migrating 21 13
action count 994
migrating 14 10
action count 1042
... meaning that, once I understood why the code said what it said, that I'm probably having thousands of Action entries in the db that have absolutely no right to exist. The fix in the attached file for reference is likely
somap = {}
for app_code, app in app4code.iteritems():
vs
for app_code, app in app4code.iteritems():
somap = {}
I don't deserve any help in cleaning this up, but I'd take it anyway. How god-freakin aweful, and how shameful that I haven't looked at what this code does everytime I ran it so far. I did complain to myself about it taking long, of course. Bad Axel, corner, stupid hat.
It'd be really good to have the data cleaned up before I try to do the merging of all that bad data in data1.1, too. Like, duplicate data is one thing, but up to 3 silly dupes, bad.
Comment 1•13 years ago
|
||
I'm still not grasping what's going on due to lack of goals and context.
However, clearly whatever code we're talking about it needs to be folded in nicely and in a structured way so we can document it and unit test it.
I don't know how to actually execute this but I'd love to see it as a management command where the meat of the management command is distinctly separated from the handling of argument options. Then we can write tests that solidify what's going on and we can assert that dupes aren't created.
Honestly don't know where to begin in terms of helping out here.
Comment 2•13 years ago
|
||
For my own help mainly: here's the snippet you sent me to try to explain something about a potentially better way of doing something
http://pastebin.mozilla.org/1378550
Assignee | ||
Comment 3•13 years ago
|
||
This command goes through all Actions, on a 100-ish chunk base, and deletes all obsolete/duplicate actions.
I tested this locally against a db dump.
A word on the chunking:
The cloning does copy over the 'when', so one can be sure that all duplicate actions have the same time. So, in order to not have duplicate actions in different chunks, I chunk by when. To have about the same amount of actions in each chunk, I first run a query to find a good timestamp, and then take all actions within that timeframe.
Comment 4•13 years ago
|
||
Comment on attachment 587290 [details] [diff] [review]
add a command to go through the duplicate actions, and delete all but one
Review of attachment 587290 [details] [diff] [review]:
-----------------------------------------------------------------
r- due to the broken 'quiet' option. Other things are just pyflake nits. Feel free to land once you fix the quiet option.
::: apps/shipping/management/commands/cleanup_actions.py
@@ +37,5 @@
> +'''Update all local clones to the revisions that are shipped with a milestone.
> +'''
> +
> +from collections import defaultdict
> +import datetime
not used
@@ +39,5 @@
> +
> +from collections import defaultdict
> +import datetime
> +from optparse import make_option
> +import os.path
not used
@@ +41,5 @@
> +import datetime
> +from optparse import make_option
> +import os.path
> +
> +from django.core.management.base import BaseCommand, CommandError
CommandError not used
@@ +42,5 @@
> +from optparse import make_option
> +import os.path
> +
> +from django.core.management.base import BaseCommand, CommandError
> +from django.db.models import Max
not used
@@ +46,5 @@
> +from django.db.models import Max
> +from shipping.models import Action
> +
> +class Command(BaseCommand):
> + option_list = BaseCommand.option_list + (
The 'quiet' option disappears when the `option_list` is redefined. I think it should be this::
option_list = BaseCommand.option_list + (
make_option('-q', '--quiet', dest = 'quiet', action = 'store_true',
help = 'Run quietly'),
make_option('-n', '--dry-run', dest = 'dry', action = 'store_true',
help = 'Run quietly'),
)
Attachment #587290 -
Flags: review?(peterbe) → review-
Assignee | ||
Comment 5•13 years ago
|
||
https://github.com/mozilla/elmo/commit/f3495a4bb6aae6da0139ded2f21cfd9289bbe44e, FIXED.
I'll check if it's useful to run remotely, if not, it might want to wait 'til this is on production so that we can run it on bm-l10n-dashboard01
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → 2.1
Assignee | ||
Comment 6•13 years ago
|
||
Takes a good while locally, even without trying to actually remove anything, aka --dry-run.
Might be a good exercise to do this on a fresh port of the db on -dev-, from bm-l10n-dashboard01, too.
Assignee | ||
Comment 7•13 years ago
|
||
Gotta reopen. there's actually a race condition in the code, but that's OK-easy to fix. Basically, if there's more than a 100 actions on the cut time, things run in loops. And on the db state on dev, that triggers. That didn't happen with an earlier clone for me.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 8•13 years ago
|
||
Bustage fix, this patch does two things:
The slice that we're walking over takes both start and end time. That makes us iterate over some actions more than once, but that's OK. It does make sure we're not stalling, though.
The next_cut is moved out a tad further to the past, by excluding the current cutoff.
Both together make the chunk of actions a good deal larger than 100, I've seen as much as 300 in a dry-run.
Gonna request review once I've done a smoke-test on dev on this.
Assignee | ||
Comment 9•13 years ago
|
||
Comment on attachment 589025 [details] [diff] [review]
make next_cut be earlier, and slice more inclusive
This worked on -dev-, cutting us down from 20k to 13k actions, too.
Attachment #589025 -
Flags: review?(peterbe)
Comment 10•13 years ago
|
||
Comment on attachment 589025 [details] [diff] [review]
make next_cut be earlier, and slice more inclusive
Code looks sane. Admittedly, I haven't really grokked what's going on.
Attachment #589025 -
Flags: review?(peterbe) → review+
Assignee | ||
Comment 11•13 years ago
|
||
https://github.com/mozilla/elmo/commit/13cd73be134535993d6380570e36b55b195acc0d, marking FIXED again.
Status: REOPENED → RESOLVED
Closed: 13 years ago → 13 years ago
Resolution: --- → FIXED
Updated•4 years ago
|
Product: Webtools → Webtools Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•