<a class="header-button" href="https://bugzilla.mozilla.org/home" title="Go to home page"> Bugzilla

Reporter

Comment 3

•

8 years ago

(In reply to Jonas Finnemann Jensen (:jonasfj) from comment #2) > But compression is a storage vs. cpu cycle cost. If we were talking about archival cost only, yes. But there's the network transfer for the multiple consumers of the file to consider.

Comment 4

•

8 years ago

(In reply to Greg Arndt [:garndt] from comment #1) > The worker makes no assumption about the compression to use when uploading > files that a task creates. We could probably just assume that all > plain/text artifact should be gzipped. What about all files, regardless of mime type?

Reporter

Updated

•

8 years ago

Blocks: 1295997

Comment 5

•

8 years ago

I had a chat with Ed on IRC here: http://logs.glob.uno/?c=ateam#c1107025 09:37 pmoore emorley: jonasfj: do you see any reason why we shouldn't use gzip content encoding for *all* artifacts (not just text/plain ones)? 09:39 pmoore .... we could have a sanity check that if the compressed version is larger than the uncompressed version, we don't - but maybe that never even happens 09:41 pmoore .... i also can't imagine that compression / decompression costs a lot in wall-time - and if it does, probably it is because of enormous artifacts which would then be served more quickly as a result ... 09:41 emorley pmoore: for a point of reference, whitenoise takes the "blacklist rather than whitelist" and "check compression ratio <95%" approach github.com/evansd/whitenoise/blob/…88a/whitenoise/compress.py#L21-L29 09:42 emorley and logs cases where files were skipped due to compression ratio not being good enough 09:42 emorley (where those logs then allow for figuring out which cases to add to the blacklist) 09:42 pmoore ah, nice 09:42 pmoore that seems like a sane approach to me 10:03 pmoore emorley: in the absence of any objections, i'll implement like this in generic-worker - the same blacklist, and the same <95% compression ratio requirement 10:04 emorley sgtm

Comment 6

•

8 years ago

It's worth noting that some clients (curl, notably) do not automatically decode HTTP responses, so this may break a lot of things.

Comment 7

•

8 years ago

urllib2 being another example

Reporter

Comment 8

•

8 years ago

Good point, we'll need to add `--compressed` to the relevant curl invocations. (In reply to Dustin J. Mitchell [:dustin] from comment #7) > urllib2 being another example Another reason for switching to requests :-)

Comment 9

•

8 years ago

There are probably hundreds of those across lots of projects using TC, so I'm not sure we could do so effectively.

Reporter

Comment 10

•

8 years ago

Given that, the simpler alternative might be to do compression on demand: (In reply to Ed Morley [:emorley] from comment #0) > Whilst I know in some places we compress before upload to S3, Cloudfront can > automatically use gzip too: > https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/ > ServingCompressedFiles.html > https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/ > ServingCompressedFiles.html#compressed-content-cloudfront-file-types

[github robot]

Comment 11

•

8 years ago

Commits pushed to master at https://github.com/taskcluster/generic-worker https://github.com/taskcluster/generic-worker/commit/87ab14f1770706cd95eef630a351ecaaa96b828b Bug 1347956 - blacklist certain file extensions from useing gzip content encoding https://github.com/taskcluster/generic-worker/commit/f4381839feb46103e1e3eef64172425ea5e29b2e Merge pull request #49 from taskcluster/gzip-more-artifacts Bug 1347956 - gzip more artifacts

Comment 12

•

8 years ago

I've added the blacklist in generic-worker, which should get rolled out to all worker types (OS X, Windows) in the next days. We'll still need to do this in taskcluster-worker and docker-worker (and maybe some releng workers too - buildbot bridge / signing worker / script worker etc). At the moment I don't check for at least 95% compression - I've just implemented the part with the blacklist, and gzip compress everything else.

Comment 13

•

8 years ago

Just seen the comments between Ed and Dustin. To be honest, since we are already gzipping some artifacts, and not others, I expect that anything reading artifacts probably is already handling this. Mostly I guess people will fetch logs, which already had gzip encoding. I also think, the first time they run something and it pulls back binary content, they will quickly work out it is compressed. So I'm hoping this change won't break too many things.

Comment 14

•

8 years ago

Another angle on this, is I think content-encoding is not an obscure thing, it is a standard header, that clients should respect - so if it was something less obscure, I would tend to agree, but I think it is reasonable to expect clients to respect the well-defined semantics of the Content-Encoding header. I say, let's roll with it, and back stuff out if it causes pain, but not let the worry that something somewhere might be affected prevent us from making progress. I think a disruptive approach has the potential to have a more positive impact.

Updated

•

8 years ago

Depends on: 1358142

Comment 15

•

8 years ago

(In reply to Dustin J. Mitchell [:dustin] from comment #7) > urllib2 being another example Funny you should mention that! I discovered mozharness is using urllib2 and therefore doesn't support Content-Encoding when downloading artifacts - so I'm adding a fix for that in bug 1358142.

Reporter

Comment 16

•

8 years ago

Hi Pete! Do you know if the changes here have been deployed? I'd somewhat forgotten about this bug since I thought it had been deployed, but recent slow transaction traces for Treeherder's store-failure-lines task showed 89% of the profile fetching in requests fetching from eg: https://queue.taskcluster.net/v1/task/VQaPi7msQIGTK9HF1PnPcw/runs/0/artifacts/public/test_info//reftest-no-accel_errorsummary.log ...which isn't served gzipped: HTTP/1.1 200 OK Content-Type: text/plain Content-Length: 3479668 Connection: keep-alive Date: Wed, 07 Jun 2017 10:10:44 GMT Last-Modified: Wed, 07 Jun 2017 09:35:50 GMT ETag: "0a41399ab3767f1b27799b47f50277cb" x-amz-version-id: TiK42srJM9OeCa6AFuZ6VASC6HgkRQ5l Accept-Ranges: bytes Server: AmazonS3 X-Cache: Miss from cloudfront Via: 1.1 c771900addaa417be1d0b79ff157a3f9.cloudfront.net (CloudFront) X-Amz-Cf-Id: tOhx8tonD9be2JSFd9pjgYqtk_QnN-F8IWNSqyiVxjeAP-z8cwRmuw==

Flags: needinfo?(pmoore)

Greg Arndt [:garndt]

Comment 17

•

8 years ago

Changes for this have been deployed for generic-worker, but not docker-worker (which ran the task linked to). I thought there might have been a discussion if this is behavior we wanted everywhere but I cannot find a link to it. Perhaps Pete recalls that discussion.

Comment 18

•

8 years ago

I think that we haven't discussed docker-worker yet, but my opinion is that it would make sense to implement in docker-worker, if it isn't a huge piece of work. Otherwise we should certainly make sure we have it in taskcluster-worker. @Jonas, how far off replacing docker-worker with taskcluster-worker are we, at a guess (6 months/1 year/...)? Do we still intend to replace it? My thinking here is, if we are close to replacing, maybe better just to roll out in taskcluster-worker. @Ed, Do you have an idea how much cost this might save overall, or if it would have a big performance impact? My guess is, including picking up the pieces if they break, we're talking a couple of days for making the fix(es), testing and reviewing them, and then rolling out to all our worker types, taking care of bustage etc. Maybe 3 days in total, if things don't go smoothly?

Flags: needinfo?(pmoore)

Flags: needinfo?(jopsen)

Flags: needinfo?(emorley)

Reporter

Comment 19

•

8 years ago

Thank you for the further details. Just to check I understand, we have: * generic-worker -> gzip support added in this bug (comment 11) * docker-worker -> doesn't yet use gzip, but may be replaced soon by taskcluster-worker but timeline unclear * taskcluster-worker -> appears to already gzip (added by bug 1305707), is already used for certain tasks (and might be used for more if it replaces docker-worker) (In reply to Pete Moore [:pmoore][:pete] from comment #18) > Do you have an idea how much cost this might save overall, or if it would > have a big performance impact? From Treeherder's point of view, the concern is not the cost (we don't pay for the S3 bucket or data transfer) but about reducing the proportion of the time spent fetching the json error summary files, particularly when S3 is being slow. Looking at the last 7 days, the store-failure-lines task profile shows 81% of the profile spent in the request to queue.taskcluster.net (though some of that is due to things like bug 1348071 / bug 1348072). I'd imagine there are cost savings (storage + transfer) to be had for the Taskcluster team - for example this file as served is 236 MB but only 9.5 MB using default CLI gzip options: https://queue.taskcluster.net/v1/task/DBYbDF8RR8CrQL9tREtDaA/runs/0/artifacts/public/test_info/reftest-no-accel_errorsummary.log > My guess is, including picking up the pieces > if they break, we're talking a couple of days for making the fix(es), > testing and reviewing them, and then rolling out to all our worker types, > taking care of bustage etc. Maybe 3 days in total, if things don't go > smoothly? Yeah I agree it's not going to be as straight forwards as it might seem. I'm happy to wait until docker-worker is replaced if that's not years away. My main concern is that lack of gzipping is something that seems to continually bite us. It would be great to have it considered as an essential feature the next time a new worker project is created, so we don't need to try and add it retrospectively :-)

Flags: needinfo?(emorley)

Jonas Finnemann Jensen (:jonasfj)

Comment 20

•

8 years ago

(In reply to Ed Morley [:emorley] from comment #19) > Thank you for the further details. > > Just to check I understand, we have: > * generic-worker -> gzip support added in this bug (comment 11) Yes, and that landed in generic-worker 8.3.0. That means the fix is currently running on our Windows builders (win2012), and our Windows 10 machines, but not yet on Windows 7 testers. > * docker-worker -> doesn't yet use gzip, but may be replaced soon by > taskcluster-worker but timeline unclear I believe it uses it for log files, but not for other artifacts. garndt can confirm. > * taskcluster-worker -> appears to already gzip (added by bug 1305707), is > already used for certain tasks (and might be used for more if it replaces > docker-worker) I believe that is only for the log file. Currently taskcluster-worker isn't used for anything yet in production, as far as I know. > > (In reply to Pete Moore [:pmoore][:pete] from comment #18) > > Do you have an idea how much cost this might save overall, or if it would > > have a big performance impact? > > From Treeherder's point of view, the concern is not the cost (we don't pay > for the S3 bucket or data transfer) but about reducing the proportion of the > time spent fetching the json error summary files, particularly when S3 is > being slow. Looking at the last 7 days, the store-failure-lines task profile > shows 81% of the profile spent in the request to queue.taskcluster.net > (though some of that is due to things like bug 1348071 / bug 1348072). > > I'd imagine there are cost savings (storage + transfer) to be had for the > Taskcluster team - for example this file as served is 236 MB but only 9.5 MB > using default CLI gzip options: > https://queue.taskcluster.net/v1/task/DBYbDF8RR8CrQL9tREtDaA/runs/0/ > artifacts/public/test_info/reftest-no-accel_errorsummary.log > > > My guess is, including picking up the pieces > > if they break, we're talking a couple of days for making the fix(es), > > testing and reviewing them, and then rolling out to all our worker types, > > taking care of bustage etc. Maybe 3 days in total, if things don't go > > smoothly? > > Yeah I agree it's not going to be as straight forwards as it might seem. I'm > happy to wait until docker-worker is replaced if that's not years away. > > My main concern is that lack of gzipping is something that seems to > continually bite us. It would be great to have it considered as an essential > feature the next time a new worker project is created, so we don't need to > try and add it retrospectively :-) Hindsight is a blessing, hey! :) But yes, indeed.... Thanks for the clarifications. Note, I hope to roll out a newer generic worker to the windows 7 machines over the next week if I don't find any major problems (I can't promise), in which case then all generic worker workers will be using gzip compressed artifacts (when artifact is not blacklisted as already being compressed).

Comment 21

•

8 years ago

tc-worker won't be replacing docker-worker anytime soon, we're likely focusing on other uses first. Note, we should probably declare somewhere that some artifacts are gzipped others aren't, and that consumers of the API is required to support Content-Encoding: gzip.

Flags: needinfo?(jopsen)

Chris Cooper [:coop] (he/him)

Comment 22

•

8 years ago

(In reply to Jonas Finnemann Jensen (:jonasfj) from comment #21) > Note, we should probably declare somewhere that some artifacts are gzipped > others aren't, and that consumers of the API is required to support > Content-Encoding: gzip. That's in the new manual.

Comment 23

•

7 years ago

Found in triage. (In reply to Dustin J. Mitchell [:dustin] pronoun: he from comment #22) > (In reply to Jonas Finnemann Jensen (:jonasfj) from comment #21) > > Note, we should probably declare somewhere that some artifacts are gzipped > > others aren't, and that consumers of the API is required to support > > Content-Encoding: gzip. > > That's in the new manual. Is documentation enough to close this out in this case?

Flags: needinfo?(dustin)

Comment 24

•

7 years ago

Sure?

Status: NEW → RESOLVED

Closed: 7 years ago

Flags: needinfo?(dustin)

Resolution: --- → FIXED