Closed Bug 1021018 Opened 10 years ago Closed 10 years ago

XML generation should be a function of blobs, not handled externally

Categories

(Release Engineering Graveyard :: Applications: Balrog (backend), defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bhearsum, Assigned: bhearsum)

References

()

Details

Attachments

(1 file, 4 obsolete files)

XML generation involves a lot of blob-specific stuff. Right now we handle that with if statements in AUS.py, but it would be much more elegant if we could simply call a method on any blob and have it return the right thing.
Blocks: 886543
I think this is going to have great influence on how bug 1021032 is handled...
Blocks: 1021032
I'm making decent progress on this. I'm about 3/4 of the way to "making it work", but I still need some clean-up after that to make the patch not awful. I should have something ready for review before EOW, hopefully tomorrow.
Assignee: nobody → bhearsum
First off, apologies for the massive patch. I should've split this into 2 or 3 different ones. This patch also isn't quite done yet - there's still a couple TODOs, and some reworking of blob.py (to reduce duplication) that I need to do. But before this gets too out of hand I wanted to put it up.

Here's the overview:
* Get rid of app-specific database objects (auslib.admin.base.db and the db object on the AUS class). This is necessary because the Blobs now need to talk to the database, and blobs aren't app-specific. Nearly all of the changes to files other than AUS.py and blob.py are to support this. I'd like to think that this is an improvement regardless, but I'm a bit biased right now.
* Move domain whitelist and special force hosts into the webapp config. I'm not 100% sure this is the right thing to do, but I couldn't find a better place to put them for non-admin app. You'll also notice that db.py and blob.py require those things to be passed in rather than looking at the webapp config themselves. I did this to try and keep the controller and view layers more isolated. I'm open to change there if you think it's more bad than good.
* Migrate most of the test_AUS.py tests to test_client.py, and add a few more where I found coverage lacking (mostly around schema2 blob testing).
* Move helper functions (isSpecialURL, containsForbiddenDomain, getFallbackChannel) out of AUS class. This was necessary because they're mostly used by the blob now. I kept them in AUS.py because that felt like the right place for non-blob specific business logic. I'm open to other ideas too.
* Rip out expandRelease, createSnippet*, and createXML from AUS; move into Blobs. This naturally resulted in a bit better organization (because we have one blob class per schema), but also a lot of duplication. I think I can reduce most of that by using the same or similar pattern as the client code (https://github.com/mozilla/build-tools/blob/master/lib/python/balrog/submitter/cli.py). For now, there's a lot of things attached to the base Blob class. When we implement an H264 blob class I expect most of that to move to a class that will serve as the base for the schema 1/2/3 blobs, since most of it won't apply to the new H264 blob.
Attachment #8453895 - Flags: feedback?(nthomas)
Comment on attachment 8453895 [details] [diff] [review]
move xml generation to blobs, and more

Review of attachment 8453895 [details] [diff] [review]:
-----------------------------------------------------------------

I think you're headed in the right direction. If we can squash some of the duplication in the createXML() while not diving off into lots of little helper functions then it's looking good.

::: auslib/blob.py
@@ +134,5 @@
>              return self['platforms'][platform]['locales'][locale]['buildID']
>          except KeyError:
>              return self['platforms'][platform]['buildID']
>  
> +    def getUrl(self, updateQuery, patch, specialForceHosts, ftpFilename=None, bouncerProduct=None):

Are you thinking ahead to openh264 here by making ftpFilename and bouncerProduct optional ?
Attachment #8453895 - Flags: feedback?(nthomas) → feedback+
(In reply to Nick Thomas [:nthomas] from comment #5)
> Comment on attachment 8453895 [details] [diff] [review]
> move xml generation to blobs, and more
> 
> Review of attachment 8453895 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I think you're headed in the right direction. If we can squash some of the
> duplication in the createXML() while not diving off into lots of little
> helper functions then it's looking good.
> 
> ::: auslib/blob.py
> @@ +134,5 @@
> >              return self['platforms'][platform]['locales'][locale]['buildID']
> >          except KeyError:
> >              return self['platforms'][platform]['buildID']
> >  
> > +    def getUrl(self, updateQuery, patch, specialForceHosts, ftpFilename=None, bouncerProduct=None):
> 
> Are you thinking ahead to openh264 here by making ftpFilename and
> bouncerProduct optional ?

I don't think I was considering that...I may have thought that there may be a use case within the existing blobs for them to be optional. I think this is a method that may not end up being relevant to openh264, because it relies on the ReleaseBlob* structure ("platforms", specifically).
Here's a much more factored version of this patch. I'm not entirely happy with the method names, so if you have better ideas I'm happy to hear them. You might find different, but I found interdiff much harder to read than the new blob.py by itself.

I also have an earlier version of this where getUpdateLineXML doesn't exist (which means there's a createXML on BlobV1 and the NewStyleVersionsMixin, and duplication of the final xml construction). I'm not sure where best to draw the line on duplication vs. indirectness. Any thoughts on that?
Attachment #8453895 - Attachment is obsolete: true
Attachment #8455381 - Flags: review?(nthomas)
When I started thinking about actually implementing an h264/gmp blob, I realized that part of our rule matching depends on blob structure too. This patches moves that out to the blob. Moving the "don't serve an update if updateQuery is the same or newer" is also good because we don't want that behaviour for the h264/gmp blobs.
Attachment #8455381 - Attachment is obsolete: true
Attachment #8455381 - Flags: review?(nthomas)
Attachment #8456925 - Flags: review?(nthomas)
I will review this first thing next week, because the hg server is going to behave then, and the release automation will all be ponies and rainbows.
Comment on attachment 8456925 [details] [diff] [review]
move blob-specific rule matching tests to blobs, too

Review of attachment 8456925 [details] [diff] [review]:
-----------------------------------------------------------------

This is a first pass that covered almost everything, I'd just like to look at the tests again with a fresh brain. It's reassuring that the coverage is unchanged with all the moving around, and pretty high in general, and I'm pretty happy with the patch.

In blob.py, I see you've worked hard to squash code duplication (excepting createSnippet) but found it hard to trace the code execution. eg, for some release we're going to serve the calls go:
  release.createXML()                           - from Blob
    self.getUpdateLinXML()                      - from ReleaseBlobV1, or NewStyleVersionsMixin
    self.getPatchesXML()                        - from SingleUpdateXMLMixin, or MultipleUpdatesXMLMixin
      self.getSpecificPatchXML()                - from Blob
        self.getFtpFilename/getBouncerFilename  - from SingleUpdateXMLMixin, or MultipleUpdatesXMLMixin
Some documentation on how the code is structured may help orient newcomers, and remind me in the likely event I've forgotten since last time I read it.

::: admin.wsgi
@@ +30,5 @@
>  auslib.log.cef_config = auslib.log.get_cef_config(cfg.getCefLogfile())
> +dbo.setDb(cfg.getDburi())
> +dbo.setupChangeMonitors(cfg.getSystemAccounts())
> +dbo.setDomainWhitelist(cfg.getDomainWhitelist())
> +application.config['WHITELISTED_DOMAINS'] = cfg.getDomainWhitelist()

Setting this twice is a little odd. Perhaps we can look at moving containsForbiddenDomain() into the code that handles API calls. Fodder for a followup if it's worthwhile.

::: auslib/__init__.py
@@ +1,3 @@
>  version = "0.5"
> +
> +# TODO: comment about why this is needed

Please resolve this TODO.

::: auslib/blob.py
@@ +74,5 @@
>      """See isValidBlob for details on how format is used to validate blobs."""
>      format_ = {}
>  
> +    def __init__(self, *args, **kwargs):
> +        self.log = logging.getLogger(self.__class__.__name__)

Hopefully this will fix the logging issue I mention on and off. Woo!

@@ +212,5 @@
> +        buildTarget = updateQuery['buildTarget']
> +        locale = updateQuery['locale']
> +        releaseVersion = self.getApplicationVersion(buildTarget, locale)
> +        if not releaseVersion:
> +            self.log.debug("Matching rule has no extv, will not serve update.")

While we're moving this, extv only applies to schema v1 so please update comment to something more generic.

@@ +313,5 @@
>          """ We used extv as the application version for v1 schema, while appv
>          may have been a pretty version for users to see"""
>          return self.getExtv(platform, locale)
>  
> +    def createSnippets(self, updateQuery, update_type, whitelistedDomains, specialForceHosts):

We can probably remove these functions some time after making a full transition to Balrog. Might be some functionality the snippets are still checking that unit tests aren't.

@@ +670,5 @@
>          if 'schema_version' not in self.keys():
>              self['schema_version'] = 3
> +
> +    def createSnippets(self, updateQuery, update_type, whitelistedDomains, specialForceHosts):
> +        # We have no tests that require this, probably not worthwhile to implement.

Fair enough.

::: auslib/test/web/test_client.py
@@ +528,5 @@
>          self.assertEqual(ret.status_code, 200)
>          self.assertEqual(ret.mimetype, 'text/plain')
>          self.assertTrue('User-agent' in ret.data)
>  
> +    def testSchema2CompleteOnly(self):

What do you think about putting these v2 schema tests in test_blob.py instead ? The large size of the setup() for this set of tests is rather daunting already. Separate test classes would be another option.
(In reply to Nick Thomas [:nthomas] from comment #10)
> Comment on attachment 8456925 [details] [diff] [review]
> move blob-specific rule matching tests to blobs, too
> 
> Review of attachment 8456925 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This is a first pass that covered almost everything, I'd just like to look
> at the tests again with a fresh brain. It's reassuring that the coverage is
> unchanged with all the moving around, and pretty high in general, and I'm
> pretty happy with the patch.
> 
> In blob.py, I see you've worked hard to squash code duplication (excepting
> createSnippet) but found it hard to trace the code execution. eg, for some
> release we're going to serve the calls go:
>   release.createXML()                           - from Blob
>     self.getUpdateLinXML()                      - from ReleaseBlobV1, or
> NewStyleVersionsMixin
>     self.getPatchesXML()                        - from SingleUpdateXMLMixin,
> or MultipleUpdatesXMLMixin
>       self.getSpecificPatchXML()                - from Blob
>         self.getFtpFilename/getBouncerFilename  - from SingleUpdateXMLMixin,
> or MultipleUpdatesXMLMixin
> Some documentation on how the code is structured may help orient newcomers,
> and remind me in the likely event I've forgotten since last time I read it.

I can certainly do that - I don't find it 100% clear either. I'm not 100% happy with it, but because of the multiple different combinations of things (single update is v1/v2, multiple update is v3, but appv/extv is v1 and new versions are v2/v3) I couldn't find a way to keep things factored much at all without this amount of indirection. Eg, getFtpFilename can't be unfactored without getSpecificPatchXML also being unfactored.

> ::: admin.wsgi
> @@ +30,5 @@
> >  auslib.log.cef_config = auslib.log.get_cef_config(cfg.getCefLogfile())
> > +dbo.setDb(cfg.getDburi())
> > +dbo.setupChangeMonitors(cfg.getSystemAccounts())
> > +dbo.setDomainWhitelist(cfg.getDomainWhitelist())
> > +application.config['WHITELISTED_DOMAINS'] = cfg.getDomainWhitelist()
> 
> Setting this twice is a little odd. Perhaps we can look at moving
> containsForbiddenDomain() into the code that handles API calls. Fodder for a
> followup if it's worthwhile.

Yeah, it's definitely odd. My reasoning here (which I should write down in the repo somewhere) is that I'm keeping the models/business logic (db.py, blobs/, AUS.py) from depending directly on the web layer.

ent to something more generic.
> 
> @@ +313,5 @@
> >          """ We used extv as the application version for v1 schema, while appv
> >          may have been a pretty version for users to see"""
> >          return self.getExtv(platform, locale)
> >  
> > +    def createSnippets(self, updateQuery, update_type, whitelistedDomains, specialForceHosts):
> 
> We can probably remove these functions some time after making a full
> transition to Balrog. Might be some functionality the snippets are still
> checking that unit tests aren't.

I had similar thoughts. Major updates and some fallback channel logic are only tested in snippet tests AFAIK.


> ::: auslib/test/web/test_client.py
> @@ +528,5 @@
> >          self.assertEqual(ret.status_code, 200)
> >          self.assertEqual(ret.mimetype, 'text/plain')
> >          self.assertTrue('User-agent' in ret.data)
> >  
> > +    def testSchema2CompleteOnly(self):
> 
> What do you think about putting these v2 schema tests in test_blob.py
> instead ? The large size of the setup() for this set of tests is rather
> daunting already. Separate test classes would be another option.

That seems like a good idea. These tests were originally here because they depended on more than the blobs. I think we need a couple of tests here, just as a sanity check on ClientRequestView.get, but the large majority of these probably do belong on the blob classes. I'll make that change.
Attached patch fix up comments, tests (obsolete) — Splinter Review
I think I've addressed everything you've identified. A couple of additional notes:
* I prepended the inner XML generation methods with "_" to try and help make it clearer that they're not part of the interface.
* I moved the & fixing into createXML, mainly because it made it easier to move the tests to the blob classes.
Attachment #8456925 - Attachment is obsolete: true
Attachment #8456925 - Flags: review?(nthomas)
Attachment #8459739 - Flags: review?(nthomas)
Comment on attachment 8459739 [details] [diff] [review]
fix up comments, tests

Review of attachment 8459739 [details] [diff] [review]:
-----------------------------------------------------------------

Looks great. The changes to auslib/db.py aren't in this patch, not sure if that was intentional, so a little cleanup lost. Otherwise just some comments.

::: auslib/blob.py
@@ +193,5 @@
> +            (patchType, url, self["hashFunction"], patch["hashValue"], patch["filesize"])
> +
> +    def createXML(self, updateQuery, update_type, whitelistedDomains, specialForceHosts):
> +        """This method is the entry point for update XML creation for all Gecko
> +           app blobs. However, the XML and underlying data has changed over

Nice

::: auslib/test/test_blob.py
@@ +273,5 @@
> +            "product": "h", "version": "0.5", "buildID": "0",
> +            "buildTarget": "p", "locale": "m", "channel": "a",
> +            "osVersion": "a", "distribution": "a", "distVersion": "a",
> +            "force": 0
> +        }

Looking at this afresh, it's not a very strong test - just that the whole shooting match doesn't fall over if special hosts aren't defined. For completeness we should check the forcing case too. You can file a bug 'improve test coverage to 100%' and foist it on me if you want. :-P

::: auslib/web/views/client.py
@@ +39,5 @@
> +        else:
> +            xml = ['<?xml version="1.0"?>']
> +            xml.append('<updates>')
> +            xml.append('</updates>')
> +            xml = "\n".join(xml)

We also do this in auslib/web/base.py a few times, we could stick stash it in a constant at some point. Relatedly, we could create a unittest class assertEmptyUpdate based on assertEqual(<test_output>, <empty_update_constant>).
Attachment #8459739 - Flags: review?(nthomas) → review+
(In reply to Nick Thomas [:nthomas] from comment #13)
> Comment on attachment 8459739 [details] [diff] [review]
> fix up comments, tests
> 
> Review of attachment 8459739 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looks great. The changes to auslib/db.py aren't in this patch, not sure if
> that was intentional, so a little cleanup lost. Otherwise just some comments.

Whoops - my bad. I had reverted that because my brain thought it was a mistake. I'll revert that commit (https://github.com/bhearsum/balrog/commit/e5e3007166dbd4b5dd14374283ad04a1852d5ba3) before I merge this in.

> ::: auslib/test/test_blob.py
> @@ +273,5 @@
> > +            "product": "h", "version": "0.5", "buildID": "0",
> > +            "buildTarget": "p", "locale": "m", "channel": "a",
> > +            "osVersion": "a", "distribution": "a", "distVersion": "a",
> > +            "force": 0
> > +        }
> 
> Looking at this afresh, it's not a very strong test - just that the whole
> shooting match doesn't fall over if special hosts aren't defined. For
> completeness we should check the forcing case too. You can file a bug
> 'improve test coverage to 100%' and foist it on me if you want. :-P

I won't refuse this offer - bug 1042207.

> ::: auslib/web/views/client.py
> @@ +39,5 @@
> > +        else:
> > +            xml = ['<?xml version="1.0"?>']
> > +            xml.append('<updates>')
> > +            xml.append('</updates>')
> > +            xml = "\n".join(xml)
> 
> We also do this in auslib/web/base.py a few times, we could stick stash it
> in a constant at some point. Relatedly, we could create a unittest class
> assertEmptyUpdate based on assertEqual(<test_output>,
> <empty_update_constant>).

And bug 1042212 for this.
Comment on attachment 8459739 [details] [diff] [review]
fix up comments, tests

I've merged this to master now. Once we're clear of 32.0b1 shipping, I'm going to have this pushed to production, hopefully tomorrow or Thursday morning.
Attachment #8459739 - Flags: checked-in+
Commit pushed to master at https://github.com/mozilla/balrog

https://github.com/mozilla/balrog/commit/29cab31ea513edb41bb365fcdf1940380a4eaf8d
bug 1021018: XML generation should be a function of blobs, not handled externally. r=nthomas
Commit pushed to master at https://github.com/mozilla/balrog

https://github.com/mozilla/balrog/commit/d50621ff77d8f0852a85a159ae21230e5b456a5f
Backout bug 1021018 because it's breaking in dev.

This reverts commit 29cab31ea513edb41bb365fcdf1940380a4eaf8d.
Comment on attachment 8459739 [details] [diff] [review]
fix up comments, tests

I had to back this out because it's breaking in aus4-dev:
https://errormill.mozilla.org/Releng/balrog-dev/group/170608/
https://errormill.mozilla.org/Releng/balrog-dev/group/170607/
Attachment #8459739 - Flags: checked-in+ → checked-in-
These errors aren't as bad as I first thought. The one described at https://errormill.mozilla.org/Releng/balrog-dev/group/170607/ is actually an issue with the blob data -- it's a release blob that only has file information. Ie, top level stuff hasn't been populated. That's a problem, but not with this patch. We should probably have some sort of check that make sure a blob is actually fully formed before you're allowed to point a rule at it - I added a note to bug 703040 about that.

The issue with https://errormill.mozilla.org/Releng/balrog-dev/group/170607/ is a real one. Right now, MultipleUpdatesXMLMixin._getPatchesXML assumes that you always have both a "partials" and a "completes" block in each locale. This is generally true, but not something that's guaranteed.

I'll put up a new patch that fixes this and add some tests.
This is the full patch, which probably isn't very helpful to you in reviewing. Here's the diff against what was landed (which was the same as the previous posted patch except it had the domainWhitelist cleanup): https://github.com/bhearsum/balrog/compare/d765bebc9d284426c0f11b2f567ec9f093dc7b7b...blob-xml-2

So, I fixed the bug. It only affects blobs that use multiple updates (ie, only v3 right now). The equivalent code for single update blobs is slightly different, and doesn't suffer from the bug. I checked though, and there are tests in the v2 blob that point to blobs with only a complete block in them.

None of the TestSchema2BlobNightlyStyle tests are relevant to this bug, but I figured I'd add them since we discovered that gap yesterday. The existing Schema 3 tests have release and nightly style, so I didn't add anything there besides the NoPartialBlock test.
Attachment #8459739 - Attachment is obsolete: true
Attachment #8462561 - Flags: review?(nthomas)
Comment on attachment 8462561 [details] [diff] [review]
fix bug + add tests

Thanks for the interdiff, I just looked at that. r+
Attachment #8462561 - Flags: review?(nthomas) → review+
Comment on attachment 8462561 [details] [diff] [review]
fix bug + add tests

Landed. Going to poke around in dev some more before asking for a push.
Attachment #8462561 - Flags: checked-in+
Commit pushed to master at https://github.com/mozilla/balrog

https://github.com/mozilla/balrog/commit/9c05e16edc0c981c1fbbc9b58f74bb3b3cf0e10c
bug 1021018: XML generation should be a function of blobs, not handled externally. r=nthomas
Depends on: 1045140
This has landed in production, and looks good.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
I'm pretty sure this caused bug 1074362, which isn't super serious, but needs to be addressed at some point.
Product: Release Engineering → Release Engineering Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: