Closed Bug 1364564 Opened 7 years ago Closed 7 years ago

Prototype a service that pushes wpt github PR to try server

Categories

(Testing :: web-platform-tests, enhancement)

Version 3
enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: impossibus, Assigned: impossibus)

Attachments

(1 file)

* Detect new PR on https://github.com/w3c/web-platform-tests via github webhook
* Prefix changed files with testing/web-platform/tests/ so the patch can be pushed to try
  - Generate a commit message that will be useful in Mozilla context
* Schedule a try run of affected tests only
* Post summary of results as a comment on Github PR

My goal here is to deploy a simple service to get early feedback and approximate what we'll need for downstreaming web-platform-tests.
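The path-prefixing and commit-message steps listed above could be sketched roughly like this. This is only an illustration: the function names and the commit message layout are hypothetical and are not taken from the prototype; only the testing/web-platform/tests/ prefix and the upstream repository URL come from this bug.

```python
import posixpath

# Location of the wpt tree inside the gecko repo, as stated in this bug.
WPT_PREFIX = "testing/web-platform/tests/"


def prefix_paths(changed_files, prefix=WPT_PREFIX):
    """Map paths relative to the wpt repo root onto the gecko tree."""
    return [posixpath.join(prefix, path.lstrip("/")) for path in changed_files]


def make_commit_message(pr_number, pr_title, bug=None):
    """Build a commit message (hypothetical layout, for illustration only)."""
    summary = "[wpt PR %d] %s" % (pr_number, pr_title.strip())
    if bug is not None:
        summary = "Bug %d - %s" % (bug, summary)
    link = ("Upstream PR: https://github.com/w3c/web-platform-tests/pull/%d"
            % pr_number)
    return summary + "\n\n" + link
```

Keeping both helpers free of git or network access means they can be unit-tested without a clone of either repository.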
Based on http://mozilla-version-control-tools.readthedocs.io/en/latest/githubwebhooks.html, it might make sense to take advantage of existing infrastructure for broadcasting Github events. gps, could we add w3c/web-platform-tests to the service that already publishes Mozilla github repo events to Pulse?
Flags: needinfo?(gps)
Assignee: nobody → mjzffr
If you configure a GitHub webhook to send events to https://3abyt2fapj.execute-api.us-west-2.amazonaws.com/prod/webhook, things should end up in Pulse, although there's a slight chance we may filter events somewhere. If we do, it is easy enough to add an allow rule. FWIW I think we have a bug floating around to use a proper hostname for this ingestion URL. If the W3C drags its feet about that opaque URL, we could probably roll out a hostname with "moz" in it.
Flags: needinfo?(gps)
I can add that for now but I would prefer it to be on a more reasonable domain.
Looks like that works, thanks! I've also made my fork, mjzffr/web-platform-tests, send events to that same URL so I can do some testing.
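With both repositories sending events, a consumer has to pick out newly opened PRs from everything else. A minimal sketch, assuming the consumer ends up holding the GitHub `pull_request` webhook payload as a dict (how the Pulse message wraps that payload is not covered here). The function name is hypothetical; the `action`, `pull_request`, `head.sha` and `repository.full_name` fields are part of GitHub's webhook payload.

```python
# Repositories named in this bug; anything else is ignored.
WATCHED_REPOS = ("w3c/web-platform-tests", "mjzffr/web-platform-tests")


def new_pr_info(payload, repos=WATCHED_REPOS):
    """Return (repo, pr_number, head_sha) for a newly opened PR, else None."""
    if payload.get("action") != "opened" or "pull_request" not in payload:
        return None
    repo = payload.get("repository", {}).get("full_name")
    if repo not in repos:
        return None
    pr = payload["pull_request"]
    return repo, pr["number"], pr["head"]["sha"]
```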
I just published a quite crude WIP: pulse consumer that pushes a github PR to try using only git.

While I work on refining the prototype, I'd like to also start deploying iterations of it to an EC2 instance to get a better idea of what's involved there; perhaps eventually investigate Amazon SQS for durable queues. I propose that wpt-sync operations should be managed in largely the same way as vcs-sync/servo-sync, taking advantage of the same deployment tool chain, running under the "Developer Services" umbrella. (We had previously discussed deploying to Heroku for now, but given that the sync service needs to clone large repos (m-c and wpt), my impression is that Heroku dynos aren't a good option.)

gps, does that plan sound reasonable? If so, what would the next steps be as far as accounts and setup? (I have an idea of what to do with terraform configs, ansible playbooks, but where? with what credentials?)
Flags: needinfo?(gps)
Comment on attachment 8873181 [details]
Bug 1364564 - WIP downstream sync wpt git-only

https://reviewboard.mozilla.org/r/144646/#review149186

::: vcssync/mozvcssync/wpt.py:113
(Diff revision 1)
> +
> +def get_pr(git_source_url, git_repo_path, pr_id, ref='master'):
> +    """ Pull shallow repo and checkout given pr """
> +    git_repo_path = os.path.abspath(git_repo_path)
> +    if not os.path.exists(git_repo_path):
> +        subprocess.check_call([b'git', b'init', git_repo_path])

So, in the long term it seems like we might have multiple parts of this system that all want to interact with git. Should we assume that each will maintain a separate clone, or should we consider putting access to git behind a lock?

::: vcssync/mozvcssync/wpt.py:119
(Diff revision 1)
> +
> +    subprocess.check_call([b'git', b'checkout', b'master'],
> +                          cwd=git_repo_path)
> +
> +    subprocess.check_call([b'git', b'clean', b'-xdf'], cwd=git_repo_path)
> +    subprocess.check_call([b'git', b'pull', b'--no-tags', b'--depth', b'50',

Any specific reason for a shallow clone here? If we can deploy this in a way that allows us to maintain the git repository, then we can probably afford a full clone to avoid any limitations of a shallow one.

::: vcssync/mozvcssync/wpt.py:154
(Diff revision 1)
> +
> +    if os.path.exists(dest):
> +        assert os.path.isdir(dest)
> +        shutil.rmtree(dest)
> +
> +    shutil.copytree(source, dest, ignore=shutil.ignore_patterns('.git'))

Presumably the plan is to replace this with something more advanced, because we might have unupstreamed local modifications. Probably we want to copy this on a branch and then test the merge of that branch into the local master?
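If a full clone is kept around, as the review suggests, a PR can be fetched through the refs/pull/<id>/head ref that GitHub publishes for each pull request. A minimal sketch, assuming an existing non-shallow clone with a remote named origin. The function names and the pr-<id> branch naming are hypothetical and are not taken from the patch under review.

```python
import subprocess


def pr_fetch_commands(pr_id, remote="origin"):
    """Return git argv lists that fetch a PR and check it out on a local branch."""
    branch = "pr-%d" % pr_id  # hypothetical local branch naming scheme
    return [
        # GitHub exposes the head of every PR under refs/pull/<id>/head.
        ["git", "fetch", "--no-tags", remote, "refs/pull/%d/head" % pr_id],
        # -B creates the branch, or resets it if it already exists.
        ["git", "checkout", "-B", branch, "FETCH_HEAD"],
    ]


def get_pr_full(git_repo_path, pr_id, remote="origin"):
    """Run the commands above inside an existing clone, without --depth."""
    for argv in pr_fetch_commands(pr_id, remote):
        subprocess.check_call(argv, cwd=git_repo_path)
```

Separating the command construction from the `subprocess` calls keeps the git-facing part small, which also makes it easier to put behind a lock later if several components end up sharing one clone.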
(In reply to Maja Frydrychowicz (:maja_zf) from comment #6)
> While I work on refining the prototype, I'd like to also start deploying
> iterations of it to an EC2 instance to get a better idea of what's involved
> there; perhaps eventually investigate Amazon SQS for durable queues. I
> propose that wpt-sync operations should be managed in largely the same way
> as vcs-sync/servo-sync, taking advantage of the same deployment tool chain,
> running under the "Developer Services" umbrella.

Yes, I think deploying this under the "Developer Services" umbrella makes sense. We already have infrastructure and IMO it doesn't make sense to duplicate efforts in a way that will lead to N+1 pieces of infrastructure. I'd very much like to strive for cohesion here.

> gps, does that plan sound reasonable? If so, what would the next steps be as
> far as accounts and setup? (I have an idea of what to do with terraform
> configs, ansible playbooks, but where? with what credentials?)

I've updated https://mozilla-version-control-tools.readthedocs.io/en/latest/vcssync/development.html with links to things. It appears to be serving a cached copy right now, though, so see https://hg.mozilla.org/hgcustom/version-control-tools/file/tip/docs/vcssync/development.rst if the first section and repos don't load.

As for access to production systems, what do you actually need to do? For security reasons, we tend to limit access to the devservices AWS accounts because everything version control tends to get classified as high risk and we try to keep the set of people with access pretty short. We can do things like create an SQS queue for you. We may be able to give you access to a temporary EC2 instance to prototype things. But when it comes time for production, we have to ratchet down access pretty tight. For experimentation, you can get pretty far by running services on your local machine with a dummy GitHub repo and account.

Also, we try to be comprehensive with tests. For things that are hard to test (like Amazon SQS queues), we tend to design our interfaces such that the hard-to-test bit is abstracted away from the code that operates on it. For example, for Servo VCS Sync we don't have explicit test coverage that Pulse messages are handled properly. Instead, we have a standalone process we invoke when we receive a relevant message and this process is thoroughly tested.

Anyway, it might be best to ping glob or me in #vcs with questions, as I imagine there could be a lot of back and forth if we do this over Bugzilla :/
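The testing approach described here, keeping the message transport separate from a standalone entry point that does the work, might look roughly like this for wpt-sync. This is a sketch under assumptions, not the Servo VCS Sync code: `handle_event`, `main` and the `push_to_try` callable are hypothetical names, and the message body is assumed to be a JSON GitHub payload.

```python
import io
import json
import sys


def handle_event(payload, push_to_try):
    """Decision logic only: act on newly opened PRs, ignore everything else."""
    if payload.get("action") != "opened":
        return "ignored"
    push_to_try(payload["pull_request"]["number"])
    return "pushed"


def main(stdin=sys.stdin, push_to_try=print):
    """Entry point a Pulse or SQS listener could invoke with the message on stdin."""
    return handle_event(json.load(stdin), push_to_try)


# Driving the entry point with a fake message and a fake push function,
# so no Pulse, SQS, or try server is involved.
calls = []
fake_message = io.StringIO(
    json.dumps({"action": "opened", "pull_request": {"number": 1}}))
result = main(fake_message, calls.append)
```

Only the thin listener that reads from the queue stays untested; everything it calls can be exercised with in-memory inputs.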
Flags: needinfo?(gps)
Comment on attachment 8873181 [details]
Bug 1364564 - WIP downstream sync wpt git-only

https://reviewboard.mozilla.org/r/144646/#review149186

> So, in the long term it seems like we might have multiple parts of this system that all want to interact with git. Should we assume that each will maintain a separate clone, or should we consider putting access to git behind a lock?

We discussed some alternatives in our meeting today. Git itself uses lock files to prevent concurrent access to repository objects, especially the index.

I followed up by experimenting with `git worktree`, and I was able to operate concurrently on each linked working tree without a problem. It's resource-intensive though: for the gecko repo, the main working tree is about 5 GB whereas each linked working tree is still a hefty 1.7 GB.

I also learned that you run into file locks a lot less (or not at all?) when operating on a bare git repository. So we could try to implement the sync at a lower level and actually overlay+transform the PR commits onto a branch in the gecko repository, rather than copying files from one to the other. Something to consider in the longer run.
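The `git worktree` experiment mentioned above can be reproduced along these lines. This is a sketch only: the helper names are hypothetical, and it assumes the main repository has a `master` branch to start from. `git worktree add -b <branch> <path> <commit-ish>` is the standard git invocation for creating a linked working tree on a new branch.

```python
import subprocess


def worktree_add_command(path, branch, start_point="master"):
    """argv for creating a linked working tree at `path` on a new branch."""
    return ["git", "worktree", "add", "-b", branch, path, start_point]


def add_worktree(main_repo_path, path, branch, start_point="master"):
    """Create the linked working tree from inside the main repository."""
    subprocess.check_call(
        worktree_add_command(path, branch, start_point), cwd=main_repo_path)
```

Each linked working tree has its own index and HEAD while sharing the object store with the main repository, which is what lets separate processes work in separate trees at the same time.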
This was done a while ago. Repo has been moved to: https://github.com/mozilla/wpt-sync The service was turned on in production today.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED