Closed
Bug 1175047
Opened 9 years ago
Closed 9 years ago
create a client that queries relengapi archiver endpoint
Categories
(Release Engineering :: Applications: MozharnessCore, defect)
Release Engineering
Applications: MozharnessCore
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: jlund, Assigned: jlund)
References
Details
Attachments
(3 files, 3 obsolete files)
10.31 KB,
patch
|
jlund
:
review+
|
Details | Diff | Splinter Review |
6.22 KB,
patch
|
dustin
:
review+
|
Details | Diff | Splinter Review |
2.43 KB,
patch
|
dustin
:
review+
jlund
:
checked-in+
|
Details | Diff | Splinter Review |
this should effectively replace all the places where we call repository_manifest.py[1] and any services we clone/pull mozharness now.
The client:
1) obtains a valid s3 url for archive via relengapi's archiver endpoint
2) downloads and extracts the archive locally
3) merges contents of extracted archive into destination, overwriting duplicate paths (this is so we do not overwrite the work dir normally inside ('mozharness/build')
4) cleans up archive and leftover extracted paths
though somewhat cumbersome, my hope is that it can serve as a 1 line script call for all places we currently deploy mozharness acting as a single source for interacting with relengapi
[1] http://hg.mozilla.org/build/tools/file/1bc622c35872/buildfarm/utils/repository_manifest.py
Assignee | ||
Comment 1•9 years ago
|
||
dustin, I've asked for r? of you simply because you are reviewing the relengapi endpoint associated with this client. Though, I can easily pass it on to someone else if you prefer. :)
Attachment #8622920 -
Flags: review?(dustin)
Comment 2•9 years ago
|
||
Comment on attachment 8622920 [details] [diff] [review]
150616_archiver_client-tools.patch
Review of attachment 8622920 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good, but I'd like to see
* errors handled with exceptions
* extract the tarball in-place
* don't read the entire tarball into memory
::: buildfarm/utils/archiver_client.py
@@ +29,5 @@
> +# When an infra error happens we want to turn purple and
> +# let sheriffs determine if re-triggering is needed
> +INFRA_CODE = 3
> +
> +EXIT_CODE = FAILURE_CODE
Unused
@@ +35,5 @@
> +ENDPOINT_CONFIG = {
> + 'mozharness': {
> + 'url_format': "archiver/{endpoint}/{rev}?repo={repo}®ion={region}&suffix={suffix}",
> + 'archive_root': "{repo}-{rev}",
> + 'archive_subdir': "testing/mozharness",
Can you explain what these mean in a comment? It's not immediately clear from the code
@@ +107,5 @@
> + "archiver logs for errors. Task status: {}".format(task_result['status']))
> + else:
> + log.error("Archiver's task could not be resolved. Check archiver logs for errors. Task "
> + "status: {}".format(task_result['status']))
> + return response
Hm, and this returns None to signal an error..
@@ +158,5 @@
> +def download_and_extract_archive(response, archive_name):
> + log.info("downloading archive: {}".format(archive_name))
> + try:
> + with open(archive_name, 'wb') as archivef:
> + archivef.write(response.read())
This loads the entire response into memory, then writes it back out. That's probably not a big deal for mozharness, which is about 60M, but could lead to an unhappy surprise if this gets re-used for a different, larger repo archive. The fix is pretty easy -- just use `shutil.copyfileobj`
In fact, since you only traverse the tarfile once (and thus don't need to seek in the file), you can actually hook it right up to Tarfile:
tar = tarfile.open(fileobj=archivef, mode='r|gz')
@@ +166,5 @@
> +
> + log.info("extracting downloaded archive: {}".format(archive_name))
> + try:
> + with tarfile.open(archive_name) as tar:
> + tar.extractall()
Why extract the entire tarball, then move it? Especially on Windows systems, creating lots of files is pretty slow, so this will waste build time.
The `tarfile` module is a little weird, and doesn't let you specify a path to extract a file to a path that does not contain the filename from the tarball. I think you could work around this with something like (assuming archive_subdir is /-terminated)
for member in tar:
if not member.name.startswith(archive_subdir):
continue
member.name = member.name.replace(archive_subdir, '')
tar.extract(member, extracted_destination)
Another option is to use `tar.makefile`, which does exactly what you want, and is documented in the source file, but not in the documentation:
https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l2220
Either one would get you the merging functionality you want (at least, as I understand the requirement)
@@ +221,5 @@
> + except IOError, e:
> + log.exception("IOError: {}".format(e))
> + return INFRA_CODE
> + except shutil.Error as e:
> + log.exception("shutil.Error: {}".format(e))
`log.exception` will include the exception after the log message, so there's no need to format it into the message as well
@@ +249,5 @@
> + response = get_url_response(url, options)
> +
> + return_code = download_and_extract_archive(response, archive_path)
> +
> + if return_code == SUCCESS_CODE:
You could avoid nesting here by inverting the check:
if return_code != SUCCESS_CODE:
return return_code
although it would be far more Pythonic to use exceptions for error handling, rather than integer error codes.
@@ +305,5 @@
> +
> + if not ENDPOINT_CONFIG.get(endpoint):
> + log.error("endpoint argument is unknown. "
> + "Given: '{}', Valid: {}".format(endpoint, str(ENDPOINT_CONFIG.keys())))
> + exit(FAILURE_CODE)
This stanza should probably move up with the "if not len(args) == 1" bit in `options_args`
Attachment #8622920 -
Flags: review?(dustin) → feedback+
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → jlund
Assignee | ||
Comment 3•9 years ago
|
||
interdiff from previous patch: http://people.mozilla.org/~jlund/150626_archiver_client-tools-interdiff.patch
addresses issues in https://bugzilla.mozilla.org/show_bug.cgi?id=1175047#c2
adheres to new endpoint in https://github.com/mozilla/build-relengapi/pull/279
Attachment #8626620 -
Flags: review?(dustin)
Comment 4•9 years ago
|
||
Comment on attachment 8626620 [details] [diff] [review]
150626_archiver_client-tools.patch
Review of attachment 8626620 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good - just a few minor bits
::: buildfarm/utils/archiver_client.py
@@ +92,5 @@
> + if task_result.get('state') == "SUCCESS":
> + if task_result.get('s3_urls'):
> + # grab a s3 url using the preferred region if available
> + s3_url = task_result['s3_urls'].get(options.region, task_result['s3_urls'].values()[0])
> + print s3_url
debug print
@@ +132,5 @@
> + log.debug("got a bad response. response code: {}".format(response.code))
> +
> + except (urllib2.HTTPError, urllib2.URLError) as e:
> + if num == options.max_retries - 1:
> + log.exception(e)
No need to even pass `e` to `log.exception` - it will use the current exception (from `sys.exc_info()`). As written, this might fail since `e` is not a string.
@@ +158,5 @@
> + # make sure our extracted root path has a trailing slash
> + extract_root = os.path.join(extract_root, '')
> +
> + try:
> + with tarfile.open(fileobj=response, mode='r|gz') as tar:
woo!
Attachment #8626620 -
Flags: review?(dustin) → review+
Assignee | ||
Comment 5•9 years ago
|
||
carrying dustin's r+
addressing comments. interdiff:
http://people.mozilla.org/~jlund/150629_archiver_client-tools-interdiff.patch
Attachment #8627339 -
Flags: review+
Assignee | ||
Updated•9 years ago
|
Attachment #8626620 -
Attachment is obsolete: true
Assignee | ||
Updated•9 years ago
|
Attachment #8622920 -
Attachment is obsolete: true
Assignee | ||
Comment 6•9 years ago
|
||
Comment on attachment 8627339 [details] [diff] [review]
150629_archiver_client-tools-final.patch
pushed to prod:
https://hg.mozilla.org/build/tools/rev/99d3763dd131
with one more line change:
diff --git a/buildfarm/utils/archiver_client.py b/buildfarm/utils/archiver_client.py
index 67fac98..b92b7da 100755
--- a/buildfarm/utils/archiver_client.py
+++ b/buildfarm/utils/archiver_client.py
@@ -132,8 +132,7 @@ def get_url_response(url, options):
except (urllib2.HTTPError, urllib2.URLError) as e:
if num == options.max_retries - 1:
- log.exception("Could not get a valid response from archiver endpoint. "
- "Status response code: {}".format(response.code))
+ log.exception("Could not get a valid response from archiver endpoint.")
exit(INFRA_CODE)
num += 1
Assignee | ||
Comment 7•9 years ago
|
||
landed on tools for a while. used along with relengapi endpoint here: https://bugzilla.mozilla.org/show_bug.cgi?id=1154795#c18
looks like this is all done for now. Thank you for all the reviews
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 8•9 years ago
|
||
turns out that this client didn't work so well in buildbot staging across all our platforms (*shocking*).
1) first issue was the python interpreter I was using is 2.6. I believe we have 2.7 across all slaves and I could have explicitly set the path but I figured there is no harm in a small script like this being 2.6 compliant and I'm sure I'll run into other factories that don't always use 2.7
a) so it wasn't a fan of str.format().
b) or my with statement for opening the archive
2) for bonus fun, os.path.join() on windows will try to use back slashes. tarfile's member paths on the other hand use posix style forward slashes. I'd imagine that's how the paths look within the tar archive so when I go to extract members that match, startswith() gets confused.
Attachment #8628993 -
Flags: review?(dustin)
Assignee | ||
Updated•9 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Updated•9 years ago
|
Attachment #8628993 -
Flags: review?(dustin) → review+
Assignee | ||
Comment 9•9 years ago
|
||
I will come back and remove some of the really loud unneeded debug lines once this is used across all trees
Status: REOPENED → RESOLVED
Closed: 9 years ago → 9 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 10•9 years ago
|
||
Attachment #8633813 -
Flags: review?(dustin)
Comment 11•9 years ago
|
||
Comment on attachment 8633813 [details] [diff] [review]
adds more debug lines to api responses and removes tar unpacking debug lines
Review of attachment 8633813 [details] [diff] [review]:
-----------------------------------------------------------------
::: buildfarm/utils/archiver_client.py
@@ +101,4 @@
> exit(FAILURE_CODE)
> else:
> log.error("Archiver's task could not be resolved. Check archiver logs for errors. Task "
> + "status: %s Task state:", task_result['status'], task_result['state'])
You removed the % subsitution here but left a %s in
Attachment #8633813 -
Flags: review?(dustin) → review-
Updated•9 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 12•9 years ago
|
||
doh, silly mistake. fixed
also truncates rev from client side to a 12 char limit as per relengapi PR comment
Attachment #8633813 -
Attachment is obsolete: true
Attachment #8634413 -
Flags: review?(dustin)
Updated•9 years ago
|
Attachment #8634413 -
Flags: review?(dustin) → review+
Assignee | ||
Updated•9 years ago
|
Attachment #8634413 -
Flags: checked-in+
Assignee | ||
Updated•9 years ago
|
Status: REOPENED → RESOLVED
Closed: 9 years ago → 9 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•