update balrog blob schema to support multiple partials

RESOLVED FIXED

Status

Release Engineering
Balrog: Backend
P3
normal
RESOLVED FIXED
5 years ago
3 years ago

People

(Reporter: bhearsum, Assigned: bhearsum, Mentored)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(4 attachments, 3 obsolete attachments)

(Assignee)

Description

5 years ago
Currently the blob schema only supports an update type of "partial" or "complete" - and we can only have one of each. Somehow we need to jam multiple partials in there. Possibilities include:
* Use "partial-$version" instead of "partial" as the name, something like:
'af': {
  'partial-15.0b6': {
    'from': 'Firefox-15.0b6-build1',
    'filesize': 123
  }
  'complete': {
    'from': '*',
    'filesize': 9999
  }
}

* Insert another level between "partial" and the data, like we do with partialUpdates in the release configs:
'af': {
  'partial': {
    'Firefox-15.0b6-build1': {
      'filesize': 123
    }
  },
  'complete': {
    '*': {
      'filesize': 9999
    }
  }
}

* Put updates in a more generic block:
'af': {
  'updates': {
    'Firefox-15.0b6-build1': {
      'filesize': 123
    },
    '*': {
      'filesize': 9999
    }
  }
}


Might be more ways we could do it, too.
(Assignee)

Updated

5 years ago
Whiteboard: [balrog]
(Assignee)

Comment 1

5 years ago
Not going to be worked on just yet. Definitely needed before we're fully in production though.
Assignee: bhearsum → nobody
Priority: -- → P3
Product: mozilla.org → Release Engineering
(Assignee)

Comment 2

4 years ago
mass component change
Component: General Automation → Balrog: Backend
(Assignee)

Updated

4 years ago
Blocks: 933414
(Assignee)

Comment 3

4 years ago
Nick and I talked about to ease into this smoothly. If we do the second plan ("Insert another level...") we could support the old way and the new way for a period of time, migrate the data at our convenience, and then remove support for the old way. That would be much nicer and less risky than taking a downtime or doing a instant cut over.
(Assignee)

Updated

4 years ago
Blocks: 941949
(Assignee)

Updated

4 years ago
Whiteboard: [balrog] → [mentor=bhearsum]
(Assignee)

Updated

4 years ago
No longer blocks: 739959
(Assignee)

Comment 4

3 years ago
If we do plan #2, and we want to do bug 1018983, we should probably do it before or with this bug because it will affect the new key we add to the partial block. Eg, it would become:
'af': {
  'partial': {
    '20140601030204': {
      'filesize': 123
    }
  },
  'complete': {
    '*': {
      'filesize': 9999
    }
  }
}
Depends on: 1018983
(Assignee)

Comment 5

3 years ago
Something that we haven't addressed here is what to do with the -latest blobs for nightlies. In any of the current plans we'll end up with a bunch of old partials that won't apply. Eg, you start with:
'en-US': {
  'partial': {
    '20140601030303': {...}
  }
  'complete': {...}
}

And a few days later you'll have:
'en-US': {
  'partial': {
    '20140601030303': {...}
    '20140602030303': {...}
    '20140603030303': {...}
  }
  'complete': {...}
}

One idea to work around this is to remove all partials whenever the complete changes (where change is probably defined as touching the hash or filesize, maybe the URL).
I think we're OK on that front, submitting a single locale replaces the existing dict rather than using update(), see
  https://github.com/mozilla/balrog/blob/master/auslib/db.py#L902
(Assignee)

Comment 7

3 years ago
(In reply to Nick Thomas [:nthomas] from comment #6)
> I think we're OK on that front, submitting a single locale replaces the
> existing dict rather than using update(), see
>   https://github.com/mozilla/balrog/blob/master/auslib/db.py#L902

Ah, good point. That seems likethe right thing to do, too.
(Assignee)

Comment 8

3 years ago
(In reply to Ben Hearsum [:bhearsum] from comment #0)
> * Insert another level between "partial" and the data, like we do with
> partialUpdates in the release configs:
> 'af': {
>   'partial': {
>     'Firefox-15.0b6-build1': {
>       'filesize': 123
>     }
>   },
>   'complete': {
>     '*': {
>       'filesize': 9999
>     }
>   }
> }

Nick and I talked about this today and we settled on a tweaked version of the above. Specifically, "partial" will be "partials", and "complete" will be "completes", and they'll be lists:

'af': {
  'partials': [
    {
      'from': 'Firefox-15.0b6-build1',
      'filesize': 123
    },
    { ... }
  ],
  'completes': [
    {
      'from': '*',
      'filesize': 9999
    }
  ]
}

The reasoning for this is:
1) We can put the more recent partials earlier in the list, which will result in fewer overall db queries.
2) Making the complete into a list lets us us the same logic in processing them, and also allows us to support some crazy future where a broken version of a product must get a different complete than everyone else.
(Assignee)

Updated

3 years ago
Assignee: nobody → bhearsum
(Assignee)

Comment 9

3 years ago
(In reply to Ben Hearsum [:bhearsum] from comment #8)
> (In reply to Ben Hearsum [:bhearsum] from comment #0)
> > * Insert another level between "partial" and the data, like we do with
> > partialUpdates in the release configs:
> > 'af': {
> >   'partial': {
> >     'Firefox-15.0b6-build1': {
> >       'filesize': 123
> >     }
> >   },
> >   'complete': {
> >     '*': {
> >       'filesize': 9999
> >     }
> >   }
> > }
> 
> Nick and I talked about this today and we settled on a tweaked version of
> the above. Specifically, "partial" will be "partials", and "complete" will
> be "completes", and they'll be lists:
> 
> 'af': {
>   'partials': [
>     {
>       'from': 'Firefox-15.0b6-build1',
>       'filesize': 123
>     },
>     { ... }
>   ],
>   'completes': [
>     {
>       'from': '*',
>       'filesize': 9999
>     }
>   ]
> }
> 
> The reasoning for this is:
> 1) We can put the more recent partials earlier in the list, which will
> result in fewer overall db queries.
> 2) Making the complete into a list lets us us the same logic in processing
> them, and also allows us to support some crazy future where a broken version
> of a product must get a different complete than everyone else.

More concretely, it would make it easier to support a future situation where we need to repack the complete MAR like we used to when pushing releases to beta.
(Assignee)

Comment 10

3 years ago
Created attachment 8436083 [details] [diff] [review]
wip for multiple partial/complete support

This passes tests but it probably needs additiontal unit tests, and I haven't tested it on a running Balrog instance yet. I'm throwing it up mainly because I want to see if you're okay with the minimal approach I'm taking in AUS.py. I haven't spent any time thinking about how to make that better because I _think_ that code is moving/being rewritten when the XML generation moves to the blobs. If that's wrong I'll definitely need to spend more time on that here.

The patch is rather big, so here's an overview:
* Support lists in blobs. This turned out to be relatively easy to do.
* Create v3 schema blob. I factored out the common helper methods to a base class, like we talked about. I very briefly considered factoring out common parts of the schema, then remembered that it's crazy to do =).
** the v2 and v3 blobs have getApplicationVersion on them because of Python's MRO. The root Blob class also has that, so the v2 and v3 ones end up calling that one instead of the Mixin's. I'm considered dropping that method from the the root Blob (perhaps all of the business logic ones, in fact) because they probably won't be compatible with the H264 blob we'll be creating soon. What do you think?
* Minimal rework of partial/complete logic in expandRelease. All of the inner code is the same, it's just coping with both blob types by putting things in possiblePatches ahead of time and iterating over that.
* Drop support for CURRENT_SCHEMA_VERSION, because it won't make sense soon. I probably could've saved this for one of the h264 patches, but what's done is done...this mostly involved fixing tests to pass schema_version.
* I did the new tests in test_client.py to try and keep them useful after we move XML creation logic. However, despite my best efforts (removing the error handlers), debugging inner logic like XML generation is still tough from that level. I ended up needing to print my way to success when debugging. We may want to consider putting more tests in test_blob.py instead. Or maybe we want both? I've also been thinking about whether or not it makes sense to have aus-data-snapshots tests in addition to test_client.py ones.
** As part of the removal of error handlers I had to move the tests that make sure they work to another class.
** I also found a broken test because of this. Turns out that getURLNotInWhitelist doesn't work without error handlers. It threw an exception about the CEF log after I disabled them. Thankfully, that test passed after I mocked out the cef log.
Attachment #8436083 - Flags: feedback?(nthomas)
Comment on attachment 8436083 [details] [diff] [review]
wip for multiple partial/complete support

f+, very promising.

>diff --git a/auslib/AUS.py b/auslib/AUS.py

Looks fine when whitespace changes are excluded.

>diff --git a/auslib/admin/views/releases.py b/auslib/admin/views/releases.py
>-    try:
>+    if getattr(form.schema_version, "data", None):
>         # should be set here doing PUT at SingleLocaleView
>         # but some SingleLocaleView tests use the except

Nit, No 'except:' any more so stale comment.

>+    else:
>+        return Response(status=400, response="schema_version is required")

Much saner!

>diff --git a/auslib/blob.py b/auslib/blob.py
>+    # If the blob isn't a dictionary-like or list-like object, it's not valid!
>+    if (not hasattr(blob, 'keys') or not callable(blob.keys)) and \
>+       (not hasattr(blob, '__iter__') or not callable(blob.__iter__)):

Did you consider 'not isinstance(blob, dict)' and so on here ? 
From a quick read it looks format = [] is allowed ?

If we get sick of building our own validator there is https://pypi.python.org/pypi/jsonschema  (underlying project still at draft stage with IETF).

>+                        # TODO: add comment about why lists are desired

And responsibility of submitter to order sensibly. What could go wrong ....

>+    def __init__(self, **kwargs):
>+        # ensure schema_version is set if we init ReleaseBlobV2 directly

Nit, s/V2/V3/ in this ReleaseBlobV3 function comment.

>diff --git a/auslib/test/admin/views/test_releases.py b/auslib/test/admin/views/test_releases.py
>@@ -349,17 +349,17 @@ class TestReleasesAPI_JSON(ViewTest, JSONTestMixin):
>         data = json.dumps(dict(complete=dict(filesize=435)))
>         ret = self._put('/releases/a/builds/p/l', data=dict(data=data, product='b', version='a'))
>         self.assertStatusCode(ret, 400)

This one (testLocalePutCantChangeProduct) will be passing because we return 400 for missing schema_version, rather than permissions. So probably should add more schema_version=1 around the place to avoid this.
Attachment #8436083 - Flags: feedback?(nthomas) → feedback+
(In reply to Ben Hearsum [:bhearsum] from comment #10)
> This passes tests but it probably needs additional unit tests, and I
> haven't tested it on a running Balrog instance yet. I'm throwing it up
> mainly because I want to see if you're okay with the minimal approach I'm
> taking in AUS.py. I haven't spent any time thinking about how to make that
> better because I _think_ that code is moving/being rewritten when the XML
> generation moves to the blobs. 

That sounds fine to me. I didn't notice any test gaps as I read, but they're often hard to spot. Checking code coverage may help.

> ** the v2 and v3 blobs have getApplicationVersion on them because of
> Python's MRO. The root Blob class also has that, so the v2 and v3 ones end
> up calling that one instead of the Mixin's. I'm considered dropping that
> method from the the root Blob (perhaps all of the business logic ones, in
> fact) because they probably won't be compatible with the H264 blob we'll be
> creating soon. What do you think?

getApplicationVersion can probably go away when XML goes onto the blobs, maybe the rest too if they aren't called frequently. Removing it from the root class is fine by me.

> * Minimal rework of partial/complete logic in expandRelease. All of the
> inner code is the same, it's just coping with both blob types by putting
> things in possiblePatches ahead of time and iterating over that.

Does the if/if/if/if structure you used have some advantage over if/elif/elif/elif ?

> * I did the new tests in test_client.py to try and keep them useful after we
> move XML creation logic. However, despite my best efforts (removing the
> error handlers), debugging inner logic like XML generation is still tough
> from that level. I ended up needing to print my way to success when
> debugging. 

This is a bit nasty, and it seems to be worse when running tests than in the actual apps (where we get most of the debug output). Are we missing setting the logging level to debug or summink ?

> We may want to consider putting more tests in test_blob.py
> instead. Or maybe we want both? 

I'm not sure yet, tbh. Most likely need to try the refactoring a bit first.

> I've also been thinking about whether or not
> it makes sense to have aus-data-snapshots tests in addition to
> test_client.py ones.

What would we gain having aus-data-snapshots ?
(Assignee)

Comment 13

3 years ago
Thanks for your great feedback Nick.

(In reply to Nick Thomas [:nthomas] from comment #11)
> >diff --git a/auslib/blob.py b/auslib/blob.py
> >+    # If the blob isn't a dictionary-like or list-like object, it's not valid!
> >+    if (not hasattr(blob, 'keys') or not callable(blob.keys)) and \
> >+       (not hasattr(blob, '__iter__') or not callable(blob.__iter__)):
> 
> Did you consider 'not isinstance(blob, dict)' and so on here ? 

I didn't, but I am now. I've been doing inspection because it's in my head that it's better for duck-typed languages, but this is getting messy, and other people have suggested the same, too.

> From a quick read it looks format = [] is allowed ?

Hmm, probably. That seems bad, I'll fix that.

> If we get sick of building our own validator there is
> https://pypi.python.org/pypi/jsonschema  (underlying project still at draft
> stage with IETF).

I only looked at the pypi page, but it looks promising! I think I'll shy away from it for the moment to avoid delays, but I'll file a bug to look at it in the future.

> > * Minimal rework of partial/complete logic in expandRelease. All of the
> > inner code is the same, it's just coping with both blob types by putting
> > things in possiblePatches ahead of time and iterating over that.
> 
> Does the if/if/if/if structure you used have some advantage over
> if/elif/elif/elif ?

None at all, that's probably an artifact of copy/paste...
 
> > * I did the new tests in test_client.py to try and keep them useful after we
> > move XML creation logic. However, despite my best efforts (removing the
> > error handlers), debugging inner logic like XML generation is still tough
> > from that level. I ended up needing to print my way to success when
> > debugging. 
> 
> This is a bit nasty, and it seems to be worse when running tests than in the
> actual apps (where we get most of the debug output). Are we missing setting
> the logging level to debug or summink ?

It's possible we are somewhere, but I'm pretty sure we are in the test_client.py tests: https://github.com/mozilla/balrog/blob/master/auslib/test/web/test_client.py.

I didn't notice that the verbosity was different when running the app vs. tests, I'll look into this some more.

> > I've also been thinking about whether or not
> > it makes sense to have aus-data-snapshots tests in addition to
> > test_client.py ones.
> 
> What would we gain having aus-data-snapshots ?

I'm not sure now. Last week I *think* I was under the impression that they're the best place for full functionality tests, but that's probably just because that's where many of them happen to be. At this point I think aus-data-snapshots and test_client.py are basically the same thing in terms of what they can test. We just have more stuff in aus-data-snapshots because they predate test_client.py, I think?
(Assignee)

Comment 14

3 years ago
(In reply to Ben Hearsum [:bhearsum] from comment #13)
> > From a quick read it looks format = [] is allowed ?
> 
> Hmm, probably. That seems bad, I'll fix that.

I'm not sure this is fixable, actually...dicts are allowed to take iterables in order to map them to dictionary items (https://docs.python.org/2/library/stdtypes.html#mapping-types-dict).
(Assignee)

Comment 15

3 years ago
Created attachment 8438522 [details] [diff] [review]
support for multiple updates per type, comments addressed + tested

I've addressed all of your comments other than:
* Empty list blobs (see comment #14 - let me know if you think it should still be addressed)
* Verbosity of test_client.py tests - my own tests certainly got DEBUG logging, though I'm not sure if that's what you were saying now that I reread. The more I look at this though, I'm not sure what else could be done. The failures I was talking about end up with diffs of XML. The tricky part is tracing the code. Perhaps we just need additional DEBUG logging? I'm happy to do some work here before landing, too.

Other than addressing your comments, the only change I made was to alter ftpFilenames and bouncerProducts to support multiple updates - I'd forgotten that their structure is aligned with the locale level update one. This made me rework the structure of the expandRelease loop because we needed "partials" and "completes" as patch keys (rather than translating them to "partial" and "complete" like before). It's slightly nasty now, but bug 1021018 should clean it up without any extra work.

I did a ton of testing across products/build types, all of which reported to http://dev-master1.srv.releng.scl3.mozilla.com:9000/ (port 8000 for the product-facing app). I've got rules set-up to point at the various releases. Here's some sample URLs that show things working - I think they cover all the bases. Please ignore the fact that Thunderbird comm-central points at a dated blob...I misclicked and deleted the -latest blob by accident.
http://dev-master1.srv.releng.scl3.mozilla.com:8000/update/3/B2G/31.0a1/20140423040299/flame/en-US/nightly/2.0.0.0-prerelease/default/default/update.xml?force=1
http://dev-master1.srv.releng.scl3.mozilla.com:8000/update/3/Fennec/31.0a1/20140423040299/Android_arm-eabi-gcc3/en-US/nightly/default/default/default/update.xml?force=1
http://dev-master1.srv.releng.scl3.mozilla.com:8000/update/3/Firefox/31.0a1/20140423040299/Linux_x86_64-gcc3/en-US/nightly/default/default/default/update.xml?force=1
http://dev-master1.srv.releng.scl3.mozilla.com:8000/update/3/Firefox/31.0a1/20140423040299/Linux_x86_64-gcc3/de/nightly/default/default/default/update.xml?force=1
http://dev-master1.srv.releng.scl3.mozilla.com:8000/update/3/Thunderbird/31.0a1/20140423040299/Darwin_x86_64-gcc3-u-i386-x86_64/en-US/nightly/default/default/default/update.xml?force=1
http://dev-master1.srv.releng.scl3.mozilla.com:8000/update/3/Thunderbird/31.0a1/20140423040299/Darwin_x86_64-gcc3-u-i386-x86_64/en-GB/aurora/default/default/default/update.xml?force=1
http://dev-master1.srv.releng.scl3.mozilla.com:8000/update/3/Firefox/29.0/20140423040299/Linux_x86_64-gcc3/en-GB/beta/default/default/default/update.xml?force=1
http://dev-master1.srv.releng.scl3.mozilla.com:8000/update/3/Firefox/29.0/20140423040299/Darwin_x86_64-gcc3-u-i386-x86_64/en-US/beta/default/default/default/update.xml?force=1
http://dev-master1.srv.releng.scl3.mozilla.com:8000/update/3/Thunderbird/24.2.0/20140423040299/Darwin_x86_64-gcc3-u-i386-x86_64/en-GB/release/default/default/default/update.xml?force=1
http://dev-master1.srv.releng.scl3.mozilla.com:8000/update/3/Thunderbird/24.2.0/20140423040299/Linux_x86_64-gcc3/en-US/release/default/default/default/update.xml?force=1

Unfortunately, we can't test ESR blobs on this side because of bug 1023962.
Attachment #8436083 - Attachment is obsolete: true
Attachment #8438522 - Flags: review?(nthomas)
(Assignee)

Comment 16

3 years ago
Created attachment 8438625 [details] [diff] [review]
refactor balrog submitter to make schema 3 easier

This is part 1 of the client side patch -- it's a kindof no-op refactor to let us support multiple schema versions more cleanly. Basically, I've created base classes for all of the helper classes that are tied to schema versions and moved the schema 2 specific stuff to a subclass. I ripped out schema v1 support at the same time because I didn't think it was needed anymore (dead code--) - I can add it back pretty easily if someone disagrees with that.

I'm hoping to land this in advance of any schema 3 work to minimize the surface area if something breaks here, or when schema 3 support is landed.

To test it, I resubmitted all the builds/repacks that I did while testing the server side patch. I sent those to a separate Balrog instead instead to avoid confusing the results. Its admin interface is at http://dev-master1.srv.releng.scl3.mozilla.com:9100 and here's the test URLs:
http://dev-master1.srv.releng.scl3.mozilla.com:8100/update/3/B2G/31.0a1/20140423040299/flame/en-US/nightly/2.0.0.0-prerelease/default/default/update.xml?force=1
http://dev-master1.srv.releng.scl3.mozilla.com:8100/update/3/Fennec/31.0a1/20140423040299/Android_arm-eabi-gcc3/en-US/nightly/default/default/default/update.xml?force=1
http://dev-master1.srv.releng.scl3.mozilla.com:8100/update/3/Firefox/31.0a1/20140423040299/Linux_x86_64-gcc3/en-US/nightly/default/default/default/update.xml?force=1
http://dev-master1.srv.releng.scl3.mozilla.com:8100/update/3/Firefox/31.0a1/20140423040299/Linux_x86_64-gcc3/de/nightly/default/default/default/update.xml?force=1
http://dev-master1.srv.releng.scl3.mozilla.com:8100/update/3/Thunderbird/31.0a1/20140423040299/Darwin_x86_64-gcc3-u-i386-x86_64/en-US/nightly/default/default/default/update.xml?force=1
http://dev-master1.srv.releng.scl3.mozilla.com:8100/update/3/Thunderbird/31.0a1/20140423040299/Darwin_x86_64-gcc3-u-i386-x86_64/en-GB/aurora/default/default/default/update.xml?force=1
http://dev-master1.srv.releng.scl3.mozilla.com:8100/update/3/Thunderbird/24.2.0/20140423040299/Linux_x86_64-gcc3/en-US/release/default/default/default/update.xml?force=1
http://dev-master1.srv.releng.scl3.mozilla.com:8100/update/3/Firefox/29.0/20140423040299/Darwin_x86_64-gcc3-u-i386-x86_64/en-US/beta/default/default/default/update.xml?force=1
http://dev-master1.srv.releng.scl3.mozilla.com:8100/update/3/Firefox/29.0/20140423040299/Linux_x86_64-gcc3/en-GB/beta/default/default/default/update.xml?force=1

I'm not sure who I should be tagging for this, but you both have stake in it - feel free to fight over it ;).
Attachment #8438625 - Flags: review?(rail)
Attachment #8438625 - Flags: review?(nthomas)
Comment on attachment 8438625 [details] [diff] [review]
refactor balrog submitter to make schema 3 easier

Review of attachment 8438625 [details] [diff] [review]:
-----------------------------------------------------------------

::: scripts/updates/balrog-submitter.py
@@ +53,2 @@
>          }
> +        if options.schema_version == 2:

Isn't this check redundant? (there is parser.error() above)

Assuming this is where you plan plug-in v3.
Attachment #8438625 - Flags: review?(rail) → review+
(Assignee)

Comment 18

3 years ago
Comment on attachment 8438625 [details] [diff] [review]
refactor balrog submitter to make schema 3 easier

Review of attachment 8438625 [details] [diff] [review]:
-----------------------------------------------------------------

::: scripts/updates/balrog-submitter.py
@@ +53,2 @@
>          }
> +        if options.schema_version == 2:

At the moment, yeah. They'll be an elif or else in the v3 patch though. I guess I could've saved this for that...
(Assignee)

Comment 19

3 years ago
Created attachment 8440090 [details] [diff] [review]
schema3-implement-tools.diff

I'm tagging you both again because I'm not sure who is best.

This patch is implemented on top of the other tools patch, and adds schema 3 support to everything, changes the defaults on the two python scripts, and changes the release repack script to use schema 3 formatting. I've been testihng it extensively in yet another Balrog instance (http://dev-master1.srv.releng.scl3.mozilla.com:9200). Before I leave for the day I intend to empty out that database and fire off all the necessary jobs again, though -- I had to make an invasive change after beginning my testing, so I'd really like to retest it all again. I have yet to do any testing of V3ReleaseCreator, but I think that's the only major gap.

Most of this should be pretty straightforward. I decided to require that callers deal with the complexities of the "partialInfo" data structure. The alternative was to read it out of flat properties like we do for v2, but that was going to be a huge pain for a variable number of partials. And because we don't have a use case for it yet, I left the metadata gathering for the complete updates as is (that is, reading the existing properties rather than requiring completeInfo to be set).

The release l10n script needed to gather its own metadata, so I updated our fileInfo helper to support partial updates. It's not used much these days, and the unit tests pass, so I'm pretty sure that's safe.
Attachment #8440090 - Flags: review?(rail)
Attachment #8440090 - Flags: review?(nthomas)
(Assignee)

Comment 20

3 years ago
Created attachment 8440092 [details] [diff] [review]
buildbotcustom patch to set partialInfo

Not too much to say about this one...in each of NightlyBuildFactory, ReleaseBuildFactory, and NightlyRepackFactory it's combining a bunch of propperties into one. Note that a dict is being returned instead of a string, which means that the submitter scripts get a dict (instead of a json string)  when they read it back in.

ReleaseBuildFactory also had to deal with setting the info properties in the first place, because we don't do that yet (there was no reason to before).
Attachment #8440092 - Flags: review?(rail)
Attachment #8440092 - Flags: review?(nthomas)
Comment on attachment 8438522 [details] [diff] [review]
support for multiple updates per type, comments addressed + tested

r+, just one nit.

>diff --git a/auslib/test/web/test_client.py b/auslib/test/web/test_client.py
>+    def testSchema3BouncerSubstitutions(self):

This is testing filename subs ....

>+    def testSchema3FtpSubstitutions(self):

... and this one bouncer subs, so please swap names over on checkin.
Attachment #8438522 - Flags: review?(nthomas) → review+
Comment on attachment 8438625 [details] [diff] [review]
refactor balrog submitter to make schema 3 easier

Looks good. I'd slightly prefer the V2NightlySubmitter class be NightlySubmitterV2, to match NightlySubmitterBase. Not if invalidates all your testing though.
Attachment #8438625 - Flags: review?(nthomas) → review+
Comment on attachment 8440090 [details] [diff] [review]
schema3-implement-tools.diff

>diff --git a/lib/python/mozilla_buildtools/test/test_signing.py b/lib/python/mozilla_buildtools/test/test_signing.py
>     def testShortPathLocaleExe(self):
>         self._doFileInfoTest(
>             'foo/bar/firefox-3.0.12.es.win32.installer.exe',
>             'firefox',
>             dict(product='firefox', version='3.0.12', locale='es',
>                  platform='win32', contents='installer', format='exe',
>-                 pathstyle='short', leading_path=''))
>+                 pathstyle='short', leading_path='', previousVersion=None))

Shouldn't need to set 'previousVersion=None' for this installer case, others are legit.

Would be good to add some tests for the newly supported partial update case.

>-def retry(action, attempts=5, sleeptime=60, max_sleeptime=5 * 60,
>+def retry(action, attempts=1, sleeptime=60, max_sleeptime=5 * 60,
>           retry_exceptions=(Exception,), cleanup=None, args=(), kwargs={}):
>     """Call `action' a maximum of `attempts' times until it succeeds,
>         defaulting to 5. `sleeptime' is the number of seconds to wait

Is this an intended change ? Seems unrelated and is used all over. Comment needs updating if it's for real.

r+ to fix at your discretion.
Attachment #8440090 - Flags: review?(nthomas) → review+
Comment on attachment 8440092 [details] [diff] [review]
buildbotcustom patch to set partialInfo

I'll need to modify some of this for release builds on beta channel, namely supporting build1 -> buildN, but this looks good. Would love to get some more details on how your testing. The blobs seem fine from a quick eyeball, but I didn't set up rules in the 9200 instance to verify harder.
Attachment #8440092 - Flags: review?(nthomas) → review+
(Assignee)

Comment 25

3 years ago
My tests over the weekend found a couple of problems with the ReleaseCreator code, but other than that it worked great. I'll be updating my patches today after Rail's feedback on the current ones, and hope to have a landing plan figured out, too.
Comment on attachment 8440092 [details] [diff] [review]
buildbotcustom patch to set partialInfo

Review of attachment 8440092 [details] [diff] [review]:
-----------------------------------------------------------------

Woot, much more readable than the first version!
Attachment #8440092 - Flags: review?(rail) → review+
(Assignee)

Comment 27

3 years ago
(In reply to Nick Thomas [:nthomas] from comment #22)
> Comment on attachment 8438625 [details] [diff] [review]
> refactor balrog submitter to make schema 3 easier
> 
> Looks good. I'd slightly prefer the V2NightlySubmitter class be
> NightlySubmitterV2, to match NightlySubmitterBase. Not if invalidates all
> your testing though.

I've made this change to everything - very simple and straightforward rename.

> >-def retry(action, attempts=5, sleeptime=60, max_sleeptime=5 * 60,
> >+def retry(action, attempts=1, sleeptime=60, max_sleeptime=5 * 60,
> >           retry_exceptions=(Exception,), cleanup=None, args=(), kwargs={}):
> >     """Call `action' a maximum of `attempts' times until it succeeds,
> >         defaulting to 5. `sleeptime' is the number of seconds to wait
> 
> Is this an intended change ? Seems unrelated and is used all over. Comment
> needs updating if it's for real.

Whoops, this was just to speed-up testing. The backoff of retry can make some types of submissions take a very long time to spit out a useful message.
Comment on attachment 8440090 [details] [diff] [review]
schema3-implement-tools.diff

Review of attachment 8440090 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm, will be glad to review the interdiff

::: lib/python/release/info.py
@@ +150,5 @@
>          firefox 3.0 style filenames), or if the locale names are part of the
>          directory structure, but not the file name itself ('long' paths,
>          firefox 3.5+ style filenames)
>      """
> +    # TODO: delete me?

why not? :)

::: lib/python/util/retry.py
@@ +4,5 @@
>  import logging
>  log = logging.getLogger(__name__)
>  
>  
> +def retry(action, attempts=1, sleeptime=60, max_sleeptime=5 * 60,

I think this is your staging optimization...
Attachment #8440090 - Flags: review?(rail) → review+
(Assignee)

Comment 29

3 years ago
Comment on attachment 8438522 [details] [diff] [review]
support for multiple updates per type, comments addressed + tested

(In reply to Nick Thomas [:nthomas] from comment #21)
> Comment on attachment 8438522 [details] [diff] [review]
> support for multiple updates per type, comments addressed + tested
> 
> r+, just one nit.
> 
> >diff --git a/auslib/test/web/test_client.py b/auslib/test/web/test_client.py
> >+    def testSchema3BouncerSubstitutions(self):
> 
> This is testing filename subs ....
> 
> >+    def testSchema3FtpSubstitutions(self):
> 
> ... and this one bouncer subs, so please swap names over on checkin.

Whoops. I fixed these.

I also found a case where fileUrls could cause an exception to be thrown and added better logging for it: https://github.com/bhearsum/balrog/commit/2420d332c5fd19dca4f5bf67ffcfdadfe6520b5c
Attachment #8438522 - Flags: checked-in+

Comment 30

3 years ago
Commit pushed to master at https://github.com/mozilla/balrog

https://github.com/mozilla/balrog/commit/787c7b03f3a25cf05fea38165665ffee68f7af72
bug 797033: update balrog blob schema to support multiple partials. r=nthomas
(Assignee)

Comment 31

3 years ago
Created attachment 8440775 [details] [diff] [review]
refactor patch w/ fixes

Meant to update this one on Friday. I didn't find any issues with it, this is just the more complete version that also refactors ReleaseCreator and ReleaseSubmitter.
Attachment #8438625 - Attachment is obsolete: true
Attachment #8440775 - Flags: review?(rail)
(Assignee)

Comment 32

3 years ago
Created attachment 8440777 [details] [diff] [review]
small fixes for schema3 tools implementation

Fixes here:
* Arg pasing in release creator
* fileInfo stuff that Nick mentioned
* Remove unwanted change to retry.py
Attachment #8440090 - Attachment is obsolete: true
Attachment #8440777 - Flags: review?(rail)
(Assignee)

Updated

3 years ago
Depends on: 1026015
(Assignee)

Comment 33

3 years ago
Comment on attachment 8440092 [details] [diff] [review]
buildbotcustom patch to set partialInfo

This patch had no issues in testing, and doesn't depend on anything. I've landed it.
Attachment #8440092 - Flags: checked-in+
Attachment #8440777 - Flags: review?(rail) → review+
Attachment #8440775 - Flags: review?(rail) → review+
(Assignee)

Updated

3 years ago
Attachment #8440775 - Flags: checked-in+
(Assignee)

Updated

3 years ago
Attachment #8440777 - Flags: checked-in+
(Assignee)

Comment 34

3 years ago
I'm about to land the changes to use schema version 3. This is the last Beta that I'll be around for, so I want to make sure they get in before it starts. Just like when schema 2 was landed, I'll be locking all rules pointed at -latest blobs into dated ones, deleting the -latest ones, and then moving them back tomorrow morning when the nightlies finish.

Here's the full schedule for landing:
* Lock rules to dated blobs (done)
* Delete -latest blobs (done)
* Reconfig for buildbotcustom patch (done)
* Land tools patches (done)
* Wait for nightlies
* Move test channels to -latest blobs for a quick sanity check
* Move live channels back (B2G ones tonight, everything else tomorrow morning)

There's still some aurora l10n nightlies pending, so I'll probably use the auroratest channel to do some very early testing today.
(Assignee)

Comment 35

3 years ago
(In reply to Ben Hearsum [:bhearsum] from comment #34)
> 
> There's still some aurora l10n nightlies pending, so I'll probably use the
> auroratest channel to do some very early testing today.

I was able to test a Linux aurora update on auroratest. It's working fine :). Still the following three things to do:
* Wait for nightlies
* Move test channels to -latest blobs for a quick sanity check
* Move live channels back (B2G ones tonight, everything else tomorrow morning)
buildbotcustom patch live in production :)
4x rules for B2g set back to latest blobs - 2x m-c, m-a, and b2g30_v1_4. m-c update worked fine on flame.
(Assignee)

Comment 38

3 years ago
Still waiting for Fennce/Firefox/Thunderbird nightlies to finish to switch those rules back to -latest.

In the meantime, I looked at some test URLs for Firefox 31.0b2:
https://aus4.mozilla.org/update/3/Firefox/31.0/20140610163407/Linux_x86_64-gcc3/en-US/betatest/Linux%203.11.0-17-generic%20%28GTK%202.24.20%29/default/default/update.xml?force=1
https://aus4.mozilla.org/update/3/Firefox/30.0/20140529161749/Linux_x86_64-gcc3/en-US/betatest/Linux%203.11.0-17-generic%20%28GTK%202.24.20%29/default/default/update.xml?force=1
https://aus4.mozilla.org/update/3/Firefox/30.0/20140525161749/Linux_x86_64-gcc3/en-US/betatest/Linux%203.11.0-17-generic%20%28GTK%202.24.20%29/default/default/update.xml?force=1

The first two get the correct partials for their version, and the last one only got a complete \o/. There's a lot more verification to be done there (probably an entire update verify run), but this is a good sign.
(Assignee)

Comment 39

3 years ago
All nightly updates have been re-enabled EXCEPT comm-central and mozilla-aurora, which are still waiting for en-US builds to pop out.
Mentor: bhearsum@mozilla.com
Whiteboard: [mentor=bhearsum]
(Assignee)

Comment 40

3 years ago
The Windows Aurora build continues to fail. I moved those rules back to -latest anyways, because I don't think we should block everyone else on Windows. Whenever that build finally succeeds, users will get an update to it.

The only thing left here now is to wait for all of the update verify tests I've kicked off to finish. They're running on my dev master, and looking great so far:
http://dev-master1.srv.releng.scl3.mozilla.com:8018/one_line_per_build?numbuilds=100

They've been testing both betatest and releasetest, using the configs generated for 31.0b2 (except with the aus_server switched to aus4.mozilla.org, of course).
I have re-enabled all update tests for Mozmill across branches.
(Assignee)

Comment 42

3 years ago
All of my update tests passed, and some ad-hoc testing on Nightly showed partials to be working there. I'm calling this fixed. If we find any issues let's file new bugs for them.
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
All real update tests with Mozmill also passed. No single failure. Good work Ben!
You need to log in before you can comment on or make changes to this bug.