Closed Bug 650816 (data1.1) Opened 13 years ago Closed 12 years ago

[shipping] Make signoffs work for the rapid release cycle

Categories

(Webtools Graveyard :: Elmo, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Pike, Assigned: Pike)

References

Details

(Whiteboard: [data1.1])

Attachments

(5 files, 2 obsolete files)

Proposal:

- make Signoff link to Tree instead of AppVersion
- make mapping between Tree and AppVersion time-dependent
- Actions need to know which Tree they're acting on
-- for a sign-off before branch point, we need to take actions independently for each of the two post-point branches
-- aka obsoleting a signoff on aurora shouldn't affect beta
- de-normalize Signoff/Action by keeping latest status in an extra model?

- amend Tree and AppVersion with a "earliest good signoff" to reduce the query span for signoffs?
Ascii art scribbles:

----------#--------------#--------#-------------- (aurora)
 \           \           \           \           \
   -------*-   ----?-----  --*-------  -------*-- (beta)

Event (merge aurora->beta)
AppVersion (5.0, 6.0)
Tree (mozilla-central, mozilla-aurora)
Signoff -> AppVersion (current)
Signoff -> Tree (proposed)

          |-------|
-----#--#-#--------------#--------#-------------- (aurora)
     +  - +- o    -      +-       +               (Actions)
 \           \           \           \           \
   -------*-   ----------  --*-------  -------*-- (beta)
          +-+                +                -+  (Actions)
Whiteboard: [data1.1]
Target Milestone: --- → 1.0
Target Milestone: 1.0 → ---
Target Milestone: --- → 1.2
Blocks: 530808
Blocks: 542489
I'm on this, it's on https://github.com/Pike/elmo/tree/feature/data1.1.

I'll attach the first two patches for feedback, as I'm starting with, oops, writing tests for what we have.
Assignee: nobody → l10n
Simple patch to just add some natural keys stuff. This makes writing (and reading) fixtures so much easier. I didn't do this across the board, but just focused on the models that were relevant to the tests that are up next.
Attachment #529359 - Flags: review?(peterbe)
Can't hurt to have tests to start with, I guess :-)

Also, I wanted to create some more-or-less simple dataset that does something useful, to hack the migration code on top.
Attachment #529360 - Flags: review?(peterbe)
Attachment #529360 - Flags: review?(gandalf)
Comment on attachment 529360 [details] [diff] [review]
simple tests for signoffs status quo

Review of attachment 529360 [details] [diff] [review]:

Perhaps I'm not seeing the bigger picture but if the fixtures are only used by a single test (or even a single test CASE) then I would gently suggest we add the fixture as Python code. 

JSON is great but Python is better. All data and code in one place, easier to comment stuff in and out and just as fast.

::: apps/shipping/tests.py
@@ +40,5 @@
+
+import test_utils
+import os
+import datetime
+from django.test import TestCase

redundant import

@@ +54,5 @@
+    def setUp(self):
+        self.av = AppVersion.objects.get(code="fx1.0")
+    def test_count(self):
+        """Test that we have the right amount of Signoffs and Actions"""
+        self.assertEqual(Signoff.objects.count(), 2)

Since test_utils is based on nose we can use the more succinct nose utils like:

from nose.tools import eq_, ok_
eq_(Signoff.objects.count(), 2)
Attachment #529360 - Flags: review?(peterbe) → review+
There's just one test so far, but I was hoping we'd write more, which is why I factored the fixtures into separate chunks in terms of changesets, pushes, and signoffs.

I assume that you mean that we'd have a host of python code inside the setUp() method of the test instead of adding a fixtures property?
RFC: I want to use the code in https://github.com/Pike/django-durationRel to factor the time-dependent mappings, in this step, the one between Tree and AppVersion. What do you think about that code? More tests set aside.
(In reply to comment #6)
> There's just one test so far, but I was hoping we'd write more, which is why I
> factored the fixtures into separate chunks in terms of changesets, pushes, and
> signoffs.
> 
I see.

> I assume that you mean that we'd have a host of python code inside the setUp()
> method of the test instead of adding a fixtures property?

Yes. If that's appropriate. It's not appropriate if the fixture is shared amongst multiple test cases.

I noticed you did that here for example: https://github.com/Pike/django-durationRel/blob/master/django_durationRel/tests/tests.py
Comment on attachment 529360 [details] [diff] [review]
simple tests for signoffs status quo

r+
Attachment #529360 - Flags: review?(gandalf) → review+
I'd like to see moar tests around this. We should especially verify all the combinations of rejected/accepted/pending
Blocks: 655943
Blocks: 658656
Alias: data1.1
This is an updated patch, tests and model changes in one.

This is also a good deal different, as there are existing tests to merge with now.

I'd like to land this one separately from the rest, too.
Attachment #529359 - Attachment is obsolete: true
Attachment #529360 - Attachment is obsolete: true
Attachment #529359 - Flags: review?(peterbe)
Attachment #535016 - Flags: review?(peterbe)
Comment on attachment 535016 [details] [diff] [review]
one patch, unbitrotted (landed)

Review of attachment 535016 [details] [diff] [review]:
-----------------------------------------------------------------

You might want to have a check of the coverage. Something like this might work:

./manage.py test --with-coverage --cover-package=shipping --cover-html --cover-erase --cover-html-dir=coverage shipping

Otherwise, I quite like the natural keys. Great work!

::: apps/shipping/tests.py
@@ +245,5 @@
>          self.assertEqual(milestone.status, 2)
> +
> +
> +class SignOffTest(test_utils.TestCase):
> +    """Test Signoffs.

I'd personally rather see a "real" doc string or just leave it blank which is fine with me. The name of the test case tells me what it does.

@@ +250,5 @@
> +    """
> +    fixtures = ["test_repos.json", "test_pushes.json", "signoffs.json"]
> +    def setUp(self):
> +        self.av = AppVersion.objects.get(code="fx1.0")
> +    def test_count(self):

I'd love to see some PEP8 on this. The lack of newlines between methods makes it very dense.

@@ +252,5 @@
> +    def setUp(self):
> +        self.av = AppVersion.objects.get(code="fx1.0")
> +    def test_count(self):
> +        """Test that we have the right amount of Signoffs and Actions"""
> +        self.assertEqual(Signoff.objects.count(), 2)

As per the "playdoh way" we're using nose and can therefore use the much more succinct ok_ and eq_:

    from nose.tools import eq_, ok_
Attachment #535016 - Flags: review?(peterbe) → review+
Comment on attachment 535016 [details] [diff] [review]
one patch, unbitrotted (landed)

landed with comments addressed on develop, https://github.com/mozilla/elmo/commit/d862591e7de050888d9bae9b84a4114a378e4845.
Attachment #535016 - Attachment description: one patch, unbitrotted → one patch, unbitrotted (landed)
Target Milestone: 1.2 → 1.3
I've started to upload a branch on https://github.com/Pike/elmo/commits/feature%2Fbug-650816-rewrite-api.

There's quite a few steps to do:

- replace existing _signoffs logic (walking forward in Signoffs) with a logic walking back through Actions
-- add api (done)
-- make views use new api (we're here, *)
-- remove old code

- migrate relation of trees-appversion to be time-based
- migrate signoffs to be on tree, and actions to be optionally constrained to appversion

(*) make views use the new api:

I'm going to be bold and fork __init__.py for this. I've copied it over to status.py so that the diffs make sense.

Also, the handling of "appver_or_milestone" has been somewhat spread between the view code and the api code (_signoffs), I'll move that completely into the view code, and make the code reuse of the appver-or-milestone logic more powerful by using a class-based view logic in status.py.

Comments on the existing patches, approach, etc? here or on github notes.
I'd like to land another step in rewriting data1.1, which is moving over l10n-changesets, shipped-locales and the api/signoffs views over to the new api.

This is the all-in-one patch for that change, though I'd actually prefer to land the individual patches as they're on https://github.com/Pike/elmo/commits/feature%2Fbug-650816-rewrite-api right now.
Attachment #546521 - Flags: review?(peterbe)
Comment on attachment 546521 [details] [diff] [review]
all-in-one patch for my feature branch

Review of attachment 546521 [details] [diff] [review]:
-----------------------------------------------------------------

Looks great with the new class based views! That's really exciting. 

I mainly picked on stylistic pythonic issues but the one about mutable keyword arguments is bad since it can lead to memory leaks.

::: apps/shipping/api.py
@@ +44,5 @@
> +from life.models import Locale
> +from shipping.models import AppVersion, Action
> +
> +
> +def signoff_actions(locales={}, appversions={}, related=[], chunks=100):

Having arguments being mutables is a bad idea. It can leak memory. If you like I can try to dig up some literature to "prove" this. The "correct" pattern when you need something like this is to either use `None` or a marker. The "None pattern":

    def foo(names=None, versions=None):
        if names is None:
            names = []
        if versions is None:
            versions = {}

The "marker pattern":

    _marker = object()
    def foo(names=_marker, versions=_marker):
        if names is _marker:
            names = []
        if versions is _marker:
            versions = {}

I prefer the None pattern because I think it's more Pythonic. Also, it stinks if you have calling functions that pass None deliberately (e.g. `run_report(username=None)`). The "marker pattern" is more powerful but looks less pleasing. Especially if you're new to Python and wonder what it is.

@@ +47,5 @@
> +
> +def signoff_actions(locales={}, appversions={}, related=[], chunks=100):
> +    # go over all appversions
> +    appversions = AppVersion.objects.filter(**appversions)
> +    for appversion in appversions.all().select_related("tree"):

why using `.all()` here after we've just done a `.filter()`?

@@ +54,5 @@
> +        # XXX data2: reduce to active locales for the appversion
> +        active_locs = locales_q
> +        # included locales, map locale id to signoff ids we already had
> +        inc_locales = dict((_id, set())
> +                           for _id in active_locs.values_list('id', flat=True))

I would personally prefer the collections.defaultdict for this:

    from collections import defaultdict
    ...

    inc_locales = defaultdict(set)
    ...
    inc_locales[loc_id].add(something)

More "functional looking" :)

@@ +63,5 @@
> +        actions = actions.order_by('-id')
> +        actions = actions.values_list('id',
> +                                      'flag',
> +                                      'signoff_id',
> +                                      'signoff__locale_id')

I would prefer this much more succinct syntax:

    actions = (Action.objects.
               .filter(signoff__appversion=appversion)
               .order_by('-id')
               .values_list('id',
               'flag',
               'signoff_id',
               'signoff__locale_id'))

@@ +67,5 @@
> +                                      'signoff__locale_id')
> +        latest = None
> +        while inc_locales:
> +            this_actions = \
> +                         actions.filter(signoff__locale__in=inc_locales.keys())

Completely nitpicking but this variable should technically be "these_actions" as it's an array-like queryset.

@@ +101,5 @@
> +            if not had_action:
> +                break
> +
> +
> +def flag_lists(locales={}, appversions={}, chunks=100):

Again, try to avoid mutables as keyword arguments.

@@ +111,5 @@
> +    for tree, loc, f in actions.values_list('signoff__appversion__tree__code',
> +                                            'signoff__locale__code',
> +                                            'flag'):
> +        flags[tree, loc].add(f)
> +    return dict((k, list(v)) for k, v in flags.iteritems())

The whole point of this line is to turn the values (values of 'flag') into lists. Why? Cries out for a comment to explain the reasoning.

::: apps/shipping/urls.py
@@ -56,3 +53,5 @@
> >      (r'^\/ship$', 'ship_mstone'),
> >  )
> >  
> > +urlpatterns += patterns('shipping.views.status',
> > +    (r'^\/l10n-changesets$', 'l10n_changesets'),

It's not important but the '/' character doesn't need to be escaped.

::: apps/shipping/views/status.py
@@ +94,5 @@
> +        self.process_request(request)
> +        data = self.get_data()
> +        content = self.content(request, *data)
> +        r = HttpResponse(content, content_type='text/plain; charset=utf-8')
> +        if self.filename is not None:

+   `if self.filename:` because if it was blank it would make for a breaking header.

@@ +95,5 @@
> +        data = self.get_data()
> +        content = self.content(request, *data)
> +        r = HttpResponse(content, content_type='text/plain; charset=utf-8')
> +        if self.filename is not None:
> +            r['Content-Disposition'] = 'inline; filename=%s' % self.filename

+   `'inline; filename="%s"'` in case this filename was ever to have space in it it won't break so horribly.

@@ +111,5 @@
> +        return ('%s %s\n' % (l, revmap[tips[l]][:12])
> +                for l in sorted(tips.keys()))
> +
> +
> +l10n_changesets = cache_control(max_age=60)(Changesets.as_view())

Why save this variable ('l10n_changesets')? It's not used within this file for anything. 
Either don't bother with a variable assignment or document with a comment why it's important.

@@ +172,5 @@
> +        for k, sol in lsd.iteritems():
> +            # ignore tree/locale combos which have no active tree no more
> +            if k[0] is None:
> +                continue
> +            items[k] = [values[so] for so in sol]

Since you're not using the value every time, let Python use its generators itself (if possible). Consider this re-write:

   for k in lsd:
     # ignore tree/locale combos which have no active tree no more
     if not k[0]:
         continue
     items[k] = [values[so] for so in lsd[k]]

@@ +177,5 @@
> +        # get shipped-in data, latest milestone of all appversions for now
> +        shipped_in = defaultdict(list)
> +        for _av in appvers:
> +            try:
> +                _ms = _av.milestone_set.filter(status=2).order_by('-pk')[0]

A bit out of my depth here but if you're only ever going to pick 1 item from the query, would it help to first slice it to the 1st item?
Ie.

    for _ms in _av.milestone_set.filter(status=2).order_by('-pk')[:1]:
        break
    else:
        continue

That would send the `LIMIT 1 OFFSET 0` to the db and python would ultimately only have to convert 1 record into a python model class install.
Attachment #546521 - Flags: review?(peterbe) → review-
(In reply to comment #16)

I'll digest the other comments while addressing them, just for ...

> > +l10n_changesets = cache_control(max_age=60)(Changesets.as_view())
> 
> Why save this variable ('l10n_changesets')? It's not used within this file
> for anything. 
> Either don't bother with a variable assignment or document with a comment
> why it's important.

class-based views and cache decorators don't go well. You either need to put the decorators into urls.py (which I find ugh), or you need to use the function way of using them in views, and store the result to refer to from urls.py.

Thus, urls.py refers to status.l10n_changesets, which is the decorated version of ClassBasedView.as_view(), which is a class method that in itself creates an instance and then dispatches.

Not sure how to mark that up in a constructive way. How about an

# Expose the decorated views
__all__ = ['l10n_changesets', 'shipped_locales', 'signoff_json']

?
Blocks: 672545
(In reply to comment #17)
> (In reply to comment #16)
> 
> I'll digest the other comments while addressing them, just for ...
> 
> > > +l10n_changesets = cache_control(max_age=60)(Changesets.as_view())

per irc, this is going into a follow-up bug to get a class-view decorator, filed bug 672545.
More comments. I'm having three fixup-changesets on my feature branch, which I'll rebase into the api and the status patch after review, just because they're only cool for review, I think.

This chunk is mostly addressed in https://github.com/Pike/elmo/commit/fdbb938123a3d781156a244576f7cc7ba567db47.

(In reply to comment #16)
> Comment on attachment 546521 [details] [diff] [review] [review]
> all-in-one patch for my feature branch
> 
> Review of attachment 546521 [details] [diff] [review] [review]:
> -----------------------------------------------------------------
> 
> Looks great with the new class based views! That's really exciting. 
> 
> I mainly picked on stylistic pythonic issues but the one about mutable
> keyword arguments is bad since it can lead to memory leaks.
> 
> ::: apps/shipping/api.py
> @@ +44,5 @@
> > +from life.models import Locale
> > +from shipping.models import AppVersion, Action
> > +
> > +
> > +def signoff_actions(locales={}, appversions={}, related=[], chunks=100):
> 
> Having arguments being mutables is a bad idea. It can leak memory. If you
> like I can try to dig up some literature to "prove" this. The "correct"
> pattern when you need something like this is to either use `None` or a
> marker. The "None pattern":
> 
>     def foo(names=None, versions=None):
>         if names is None:
>             names = []
>         if versions is None:
>             versions = {}
> 
> The "marker pattern":
> 
>     _marker = object()
>     def foo(names=_marker, versions=_marker):
>         if names is _marker:
>             names = []
>         if versions is _marker:
>             versions = {}
> 
> I prefer the None pattern because I think it's more Pythonic. Also, it
> stinks if you have calling functions that pass None deliberately (e.g.
> `run_report(username=None)`). The "marker pattern" is more powerful but
> looks less pleasing. Especially if you're new to Python and wonder what it
> is.

I went for None both here and below.

> @@ +47,5 @@
> > +
> > +def signoff_actions(locales={}, appversions={}, related=[], chunks=100):
> > +    # go over all appversions
> > +    appversions = AppVersion.objects.filter(**appversions)
> > +    for appversion in appversions.all().select_related("tree"):
> 
> why using `.all()` here after we've just done a `.filter()`?

fixed.

> @@ +54,5 @@
> > +        # XXX data2: reduce to active locales for the appversion
> > +        active_locs = locales_q
> > +        # included locales, map locale id to signoff ids we already had
> > +        inc_locales = dict((_id, set())
> > +                           for _id in active_locs.values_list('id', flat=True))
> 
> I would personally prefer the collections.defaultdict for this:
> 
>     from collections import defaultdict
>     ...
> 
>     inc_locales = defaultdict(set)
>     ...
>     inc_locales[loc_id].add(something)
> 
> More "functional looking" :)

I'm actually using the dict for both keeping track of which locales to look at and for what data I have found for them. So starting off with all locales and an empty dataset is intentional. I've landed a better comment, as the one that was there was really confusing or just wrong. That's in 
https://github.com/Pike/elmo/commit/8a13bf64917cdfa8ac72b328a4be4b9c09e58e3e, also to be folded into the api patch.

> @@ +63,5 @@
> > +        actions = actions.order_by('-id')
> > +        actions = actions.values_list('id',
> > +                                      'flag',
> > +                                      'signoff_id',
> > +                                      'signoff__locale_id')
> 
> I would prefer this much more succinct syntax:
> 
>     actions = (Action.objects.
>                .filter(signoff__appversion=appversion)
>                .order_by('-id')
>                .values_list('id',
>                'flag',
>                'signoff_id',
>                'signoff__locale_id'))

done. I tried something like that and couldn't get it to work, thanks for the syntax help.

> @@ +67,5 @@
> > +                                      'signoff__locale_id')
> > +        latest = None
> > +        while inc_locales:
> > +            this_actions = \
> > +                         actions.filter(signoff__locale__in=inc_locales.keys())
> 
> Completely nitpicking but this variable should technically be
> "these_actions" as it's an array-like queryset.

Renamed that to actions_chunk, which is really what it is. The resulting line is too long either way, no good idea how to make it readable and less than 80 chars, so I went for readable 82 chars.

> @@ +101,5 @@
> > +            if not had_action:
> > +                break
> > +
> > +
> > +def flag_lists(locales={}, appversions={}, chunks=100):
> 
> Again, try to avoid mutables as keyword arguments.

Done.

> @@ +111,5 @@
> > +    for tree, loc, f in actions.values_list('signoff__appversion__tree__code',
> > +                                            'signoff__locale__code',
> > +                                            'flag'):
> > +        flags[tree, loc].add(f)
> > +    return dict((k, list(v)) for k, v in flags.iteritems())
> 
> The whole point of this line is to turn the values (values of 'flag') into
> lists. Why? Cries out for a comment to explain the reasoning.

Lists were backwards compat, the tests wanted them, so I used a list, too. I reverted that code to be closer to the original, using a set() was probably over-engineered anyway, given that the list can really only be 4 items long or so. The new code keeps the order of the old api.
The next chunk is https://github.com/Pike/elmo/commit/8993502c46d16f37c5f7d648304167af3284a8d4.

This one is a tad beefier, as I had to fix some regressions that I hadn't spotted before.

In particular:

https://github.com/Pike/elmo/commit/8993502c46d16f37c5f7d648304167af3284a8d4#L1L145 fixes the appvers being wrong. They're supposed to be all appvers for the apps for apps that match av_or_m.

(In reply to comment #16)
<...>
> ::: apps/shipping/urls.py
> @@ -56,3 +53,5 @@
> > >      (r'^\/ship$', 'ship_mstone'),
> > >  )
> > >  
> > > +urlpatterns += patterns('shipping.views.status',
> > > +    (r'^\/l10n-changesets$', 'l10n_changesets'),
> 
> It's not important but the '/' character doesn't need to be escaped.

Done.

> ::: apps/shipping/views/status.py
> @@ +94,5 @@
> > +        self.process_request(request)
> > +        data = self.get_data()
> > +        content = self.content(request, *data)
> > +        r = HttpResponse(content, content_type='text/plain; charset=utf-8')
> > +        if self.filename is not None:
> 
> +   `if self.filename:` because if it was blank it would make for a breaking
> header.
> 
> @@ +95,5 @@
> > +        data = self.get_data()
> > +        content = self.content(request, *data)
> > +        r = HttpResponse(content, content_type='text/plain; charset=utf-8')
> > +        if self.filename is not None:
> > +            r['Content-Disposition'] = 'inline; filename=%s' % self.filename
> 
> +   `'inline; filename="%s"'` in case this filename was ever to have space
> in it it won't break so horribly.

Both done.

> @@ +111,5 @@
> > +        return ('%s %s\n' % (l, revmap[tips[l]][:12])
> > +                for l in sorted(tips.keys()))
> > +
> > +
> > +l10n_changesets = cache_control(max_age=60)(Changesets.as_view())
> 
> Why save this variable ('l10n_changesets')? It's not used within this file
> for anything. 
> Either don't bother with a variable assignment or document with a comment
> why it's important.

See the bug I just filed as follow-up.

> @@ +172,5 @@
> > +        for k, sol in lsd.iteritems():
> > +            # ignore tree/locale combos which have no active tree no more
> > +            if k[0] is None:
> > +                continue
> > +            items[k] = [values[so] for so in sol]
> 
> Since you're not using the value every time, let Python use its generators
> itself (if possible). Consider this re-write:
> 
>    for k in lsd:
>      # ignore tree/locale combos which have no active tree no more
>      if not k[0]:
>          continue
>      items[k] = [values[so] for so in lsd[k]]
> 
> @@ +177,5 @@
> > +        # get shipped-in data, latest milestone of all appversions for now
> > +        shipped_in = defaultdict(list)
> > +        for _av in appvers:
> > +            try:
> > +                _ms = _av.milestone_set.filter(status=2).order_by('-pk')[0]
> 
> A bit out of my depth here but if you're only ever going to pick 1 item from
> the query, would it help to first slice it to the 1st item?
> Ie.
> 
>     for _ms in _av.milestone_set.filter(status=2).order_by('-pk')[:1]:
>         break
>     else:
>         continue
> 
> That would send the `LIMIT 1 OFFSET 0` to the db and python would ultimately
> only have to convert 1 record into a python model class install.

Oy, is [0] stupid. Thanks for the pointer. I also added a .select_related('app') in this fragment to cut off some lag.
PS: The given_app logic was completely foobar, too. That's also in https://github.com/Pike/elmo/commit/8993502c46d16f37c5f7d648304167af3284a8d4#L1L145. It now checks that it's really just hit for a single app, before it was falsly just taking the first appname to come around. That was breaking for queries that span applications.

Background: given_app is used for legacy appversions like fx5 right now, which don't have a real mapping to a tree, and thus I'm taking a shortcut from appversion to app, instead of going through the tree and back, like I do for active apps. That works because app-spanning queries only hit active apps at this point. Fingers crossed.
(In reply to comment #21)
> PS: The given_app logic was completely foobar, too. That's also in
> https://github.com/Pike/elmo/commit/
> 8993502c46d16f37c5f7d648304167af3284a8d4#L1L145. It now checks that it's
> really just hit for a single app, before it was falsly just taking the first
> appname to come around. That was breaking for queries that span applications.
> 
r+ on that code.
There's room for improvement with turning QuerySet `apps` into a SQL COUNT instead of relying on `len(queryset)`.

> Background: given_app is used for legacy appversions like fx5 right now,
> which don't have a real mapping to a tree, and thus I'm taking a shortcut
> from appversion to app, instead of going through the tree and back, like I
> do for active apps. That works because app-spanning queries only hit active
> apps at this point. Fingers crossed.
next step landed on develop, https://github.com/mozilla/elmo/commit/473b94ccd02740561923094e442d14a5dbe74ab4.

also, the len(apps) is not a len(queryset) but a len(list), so I'm hitting the db only once.

A ton more stuff still to do, so leaving open.
This is the next iteration in this patch sequence, use the api throughout, and remove _signoffs.

I added a bit more testing on the shipping logic.

I decided to remove "clear milestone", that's logic that was way-back-when bound, removing all locales from an appversion. If we'd remove, aka obsolete, something, we'd do that on a per-locale basis today.
I also removed the l10n-changesets-json command, that's replaced with the web interface now.
Attachment #548450 - Flags: review?(peterbe)
Comment on attachment 548450 [details] [diff] [review]
remove _signoffs, use api instead

Review of attachment 548450 [details] [diff] [review]:
-----------------------------------------------------------------

Several things I don't fully understand but nothing but little nits in the code.

::: apps/shipping/management/commands/merge_signoffs.py
@@ +42,4 @@
>  
>  from django.core.management.base import BaseCommand, CommandError
>  from shipping.models import AppVersion
> +from shipping.api import accepted_signoffs

This is much better!

@@ +46,1 @@
>  import pdb

Debugging left in

@@ +59,3 @@
>          target = AppVersion.objects.get(code=args[1])
>          if fork.tree.l10n != target.tree.l10n:
>              raise CommandError, "Fork and target appversion don't share l10n"

Please use the more PEP8 friendly format of:

  raise CommandError("Fork ... ")

@@ +64,2 @@
>          known_push_ids = dict(tsos.values_list('locale__code','push__id'))
>          sos = fsos.exclude(push__id__in=known_push_ids.values())

Perhaps I haven't understood the whole context but here it looks like it ends on a query. Does that do anything with the data?

::: apps/shipping/management/commands/transplant_signoffs.py
@@ +56,3 @@
>          new = AppVersion.objects.get(code=args[1])
>          if old.tree.l10n != new.tree.l10n:
>              raise CommandError, "Old and new appversion don't share l10n"

Again, please use `CommandError("...`

::: apps/shipping/tests.py
@@ +439,4 @@
>          eq_(so['apploc'], 'fx::pl')
>          eq_(so['tree'], 'fx')
>  
> +    def test_ship_milestone(self):

Excellent!

::: apps/shipping/views/milestone.py
@@ +219,5 @@
> +        extra_plats = defaultdict(list)
> +        try:
> +            from mercurial.hg import repository
> +            from mercurial.ui import ui as _ui
> +        except:

Perhaps this isn't part of the patch but why would an ImportError be allowed here?

::: apps/shipping/views/signoff.py
@@ +133,5 @@
> +                                   locales={'id': lang.id}))
> +    actions = list(Action.objects.filter(id__in=actions)
> +                   .select_related('signoff__push', 'author'))
> +    current_so = currentpush = None
> +    actions4push = defaultdict(list)

I like the variable name!

@@ +147,4 @@
>  
>      # get pushes, changesets and signoffs/actions
>      _p = list(pushes_q.values_list('id',flat=True))
>      pcs = Push_Changesets.objects.filter(push__in=_p).order_by('-push__push_date','-changeset__id')

>80 chars. How about just:

    pcs = (Push_Changesets.objects
           .filter(push__in=_p)
           .order_by('-push__push_date','-changeset__id'))
Attachment #548450 - Flags: review?(peterbe) → review+
I've glanced at the output of check.py while doing this patch, and there's no road to happiness there but a distinct bug. Whenever you fix something that's close, the next bad line comes into context. There's so much wrong in the files I touched that I had a hard time finding out if I regressed anything. I filed a tracker bug 674526 to do this, probably per-app is getting decently sized patches.

Thus, I landed the patch as is, https://github.com/mozilla/elmo/commit/dab4ca9a8c714466ddd3cdf1bf04cc8227962a2c.

So much for the patches to prepare for the actual changes, yikes.
Next phase, actually do a model migration:

Moving the one-to-many relation between tree and appversion to a time-based relation, using django-durationRel. This patch goes along with one to elmo-lib that adds https://github.com/Pike/django-durationRel to src and vendor.pth.

appversion.tree becomes either appversion.get_current_tree() or .get_latest_tree(). get_current_tree() is a query set, and can be one or zero elements. In theory, it can be many, but that shouldn't happen in practice. get_latest_tree() returns the last tree object, sorted by startdate. This should really ever be there, and be only one. Sorting by startdate get's None right, aka, trees that are forever bound to one appver (old style).

There are changes to how the diff's in signoff work now, as that's now spanning multiple repos. In our scenarios, the "to" repo is beta, and the "from" repo could be aurora, and picking the beta repo should have the aurora changesets. Thus, that. I'm dropping the tree and url arguments, they don't impact the view anyway.

In the signoff view, the pushes can be on different repos, if the shown list of pushes spans the migration date. Thus, I'm keeping track of the forest in the pushes array. I'm also making a visual change: I introduce a seperator td between pushes. That's been kinda asked for since the beginning, and I make that separator larger when the pushes go from one repo to the other. That's not cool UX, but at least there's some visual indication for the merges.

The queries that used to go through 'tree' quite commonly now start with 

  AppVersion.tree.through.current

which is the custom manager on the created through model, restricting the relation to now. There's no corresponding manager to _latest_, fwiw.

Review requests on both stas and peterbe, stas mostly to check that the business logic change makes sense.
Attachment #551523 - Flags: review?(stas)
Attachment #551523 - Flags: review?(peterbe)
Blocks: 677507
Blocks: 679321
Comment on attachment 551523 [details] [diff] [review]
Use DurationRelField to map Tree to AppVersion

Review of attachment 551523 [details] [diff] [review]:
-----------------------------------------------------------------

Some niggly nits

::: apps/shipping/views/outreach.py
@@ +105,5 @@
> +    avts = (AppVersion.tree.through.current
> +            .filter(appversion__app__in=name4app.keys(),
> +                    tree__l10n__in=[f_aurora, f_beta])
> +            .order_by('appversion__code')
> +            .select_related('appversion__app', 'tree'))

Isn't that select_related going to cause quite a complex and potentially slow SQL query? Complex joins can be ultra slow if any of the tables are huge. Are they? (or can/will they be?)

@@ +136,5 @@
>      matrix = dict()
>      for signoff in old_signoffs:
>          if signoff.locale_id not in matrix:
> +            matrix[signoff.locale_id] = [None] * len(avts)
> +        for av_i, avt in enumerate(avts):

Why enumerate here? I don't think it's a good idea rely on `av_i` being a set variable after and outside this for loop block. If you want to know what index `signoff.appversion_id` was in the list of appversion trees, then use something like `[x.appversion_id for x in avts].index(signoff.appversion_id)`

::: apps/shipping/views/signoff.py
@@ +127,5 @@
>                                      signoff__appversion=av).order_by('-pk')
>      # pushes and runs only matter if there's still a tree associated
> +    # just check the current tree
> +    try:
> +        tree_id = av.get_current_tree().values_list('id', flat=True)[:1][0]

This is not needed. (did some research and testing)
If you do:
 
    item = Model.objects.find(...)[0]

it actually does:

    SELECT .... LIMIT 1

@@ +130,5 @@
> +    try:
> +        tree_id = av.get_current_tree().values_list('id', flat=True)[:1][0]
> +    except IndexError:
> +        tree_id = None
> +    if tree_id is not None:

Just do `if tree_id:`? Less noise.

@@ +167,5 @@
> +    tree4forest = {}
> +    for av_tree in av_trees:
> +        tree4forest[av_tree.tree.l10n_id] = av_tree.tree_id
> +        q_d = {'repository__forest__tree': av_tree.tree_id}
> +        if av_tree.startdate is not None:

`if av_tree.startdate:` will be the same thing and is more succinct.

@@ +254,5 @@
>      for p in pushes:
> +        if (previous_forest is not None
> +            and previous_forest != p['forest_id']):
> +            p['new_forest'] = True
> +            print "new push"

left over print statement?

@@ +263,5 @@
>                  sod['diffbases'] = 1
>          # runs
> +        if p['forest_id'] in c2r and p['tip_id'] in c2r[p['forest_id']]:
> +                # we stored a run for the tip of this push
> +                _r = c2r[p['forest_id']][p['tip_id']]

why this short and odd variable name? Was it used for print debugging before?

@@ +410,4 @@
>          try:
>              lang = Locale.objects.get(code=locale_code)
>              appver = AppVersion.objects.select_related('tree').get(code=app_code)
> +            tree = appver.get_current_tree()[0]

Come to think of it, this is rather "ugly". It's called get but instead it wraps the find() method doesn't it? Perhaps a point to discuss over on github?
Attachment #551523 - Flags: review?(peterbe) → review+
Going back and forth in my head, we may need to do all the tree/av, signoff/av/tree and action/av migrations in one go.

That's because we've got a few crutches in our data: The "Firefox Aurora" appversions and friends, together with the cloned sign-offs from there and the previous real appversion. There's just nothing to migrate the aurora data to, at least not without adding additional hacks.

Or maybe there is.

Recap, all the migrations:

1) find cloned signoffs, remove the older one (just to be safe)(needs data research)
  -- might also be right before 4
2) store the current mappings of tree and appversion
3) migrate signoffs
  a) add column (and constraints) for ForeignKey(Tree) (tbd: value or null)
  b) migrate data from av to tree
  c) drop column and constraints for ForeignKey(AppVersion)
  c) fix up default value or null, see a)
4) add column to Action, ForeignKey(AppVersion, null=True, blank=True)
  a) no migration needed (needs data research, if clones in 2 diverted, we'd need)
5ff) do the av/tree migration we have right now

Alternative plan is to migrate Aurora AVs to the corresponding release AVs, and see where that gets us. I'll look into that.

Changes to the signoff api algorithm:
If we're not good yet, recursively call into the sign-off api for the previous appversion.
Blocks: 690689
Blocks: 701932
This patch adds a command to develop that allows us to dump all sign-offs to a json file.

The main purpose of this command is to be able to verify that a sign-off-related data migration doesn't regress.

I'd like to land this command independently of that actual patch, as this is reall about being able to verify the data migration piece, aka, run this command, then run the migrations, run the command again. If the jsons differ, you're screwed, and you should've done backups before doing step 1 :-)
Attachment #604801 - Flags: review?(peterbe)
Comment on attachment 604801 [details] [diff] [review]
first step, command to dump signoffs for later regression checks

Review of attachment 604801 [details] [diff] [review]:
-----------------------------------------------------------------

code looks good
Attachment #604801 - Flags: review?(peterbe) → review+
Refactoring notes:

I'm introducing a fallback appversion in the data model, which tells which appversion a av should fall back to for sign-off data. Say, Fx12 would reference fx11. I prefer that explicit reference to making stuff implicit.

Now, the old shipping.api doesn't make that fallback, so let's take notes where we call into it, and where we want to have which. (Excluding tests, those don't mind)

shipping/management/commands/dump_signoffs.py(handle):53	shipping.api.accepted_signoffs
shipping/management/commands/merge_signoffs.py(handle):63	shipping.api.accepted_signoffs
shipping/management/commands/merge_signoffs.py(handle):64	shipping.api.accepted_signoffs
shipping/management/commands/transplant_signoffs.py(handle):60	shipping.api.accepted_signoffs
shipping/management/commands/update_l10n.py(handle):74	shipping.api.accepted_signoffs

The commands all want fallback.

shipping/views/__init__.py(teamsnippet):176	shipping.api.signoff_actions
shipping/views/__init__.py(teamsnippet):185	shipping.api.signoff_summary

The team snippet want to show fallback (yes, this should change how the team snippet looks).

shipping/views/__init__.py(confirm_ship_mstone):359	shipping.api.flag_lists

Should show fallback, changing the shown information.

shipping/views/outreach.py(data):131	shipping.api.accepted_signoffs

Changes to key off of fallback instead of dates. Pretty much a rewrite.

shipping/views/signoff.py(signoff):117	shipping.api.signoff_actions
shipping/views/signoff.py(signoff):123	shipping.api.signoff_summary
shipping/views/signoff.py(signoff):125	shipping.api.annotated_pushes

Doesn't want to fallback, but needs to show fallback. Probably want that below the data expander and a separator row.

shipping/views/status.py(data_for_avq):88	shipping.api.accepted_signoffs

This is l10n-changesets, and wants fallback

shipping/views/status.py(get_signoffs):288	shipping.api.flag_lists

This is that dashboard json status, and that wants to flag fallbacks.

Wow, this analysis really helped :-). accepted_signoffs wants fallback, the other calls don't, or rather want to have a new API that mentions fallback explicitly.
Blocks: 743928
Note to all: I'm going for a hack for a signoff on a fallback appversion that's not supposed to fallback:

A signoff on the new av with just a single OBSOLETED action.

Off to hack that into migration.
Commit pushed to develop at https://github.com/mozilla/elmo

https://github.com/mozilla/elmo/commit/f8812e9d58135efeafe04b6b3ac45c971c11d401
bug 650816, implement rapid-release data model, rs=peterbe.
This is a massive change rewriting the logic between AppVersion
and Tree to get from a 1-1 to a mapping over time.
Now that that's fixed, we're also adding views to do the channel
migration and milestone creation.
Blocks: 761657
Blocks: 761659
Blocks: 761662
Blocks: 761664
Blocks: 761669
Blocks: 761693
Blocks: 762551
Blocks: 763196
Blocks: 565640
Blocks: 763209
Blocks: 763214
Blocks: 763946
I think we got most of the regressions in check now.

Marking FIXED, yay.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: 1.3 → 2.2
Attachment #551523 - Flags: review?(stas)
Blocks: 669628
Product: Webtools → Webtools Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: