Last Comment Bug 1160929 - add support for storing attachments in s3
: add support for storing attachments in s3
Status: RESOLVED FIXED
: bmo-goal
Product: bugzilla.mozilla.org
Classification: Other
Component: General (show other bugs)
: Production
: Unspecified Unspecified
-- normal (vote)
: ---
Assigned To: Byron Jones ‹:glob›
:
:
Mentors:
Depends on: 1180570 1180571 1180572
Blocks: 1309706 1324489 1071199
  Show dependency treegraph
 
Reported: 2015-05-03 23:22 PDT by Byron Jones ‹:glob›
Modified: 2016-12-19 09:41 PST (History)
5 users (show)
See Also:
Due Date:
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
1160929_1.patch (56.36 KB, patch)
2015-08-03 01:45 PDT, Byron Jones ‹:glob›
dkl: review+
Details | Diff | Splinter Review

Description User image Byron Jones ‹:glob› 2015-05-03 23:22:30 PDT
- remove maxlocalattachment parameter (it's confusing and pointless)
- add "attachment_storage" parameter, "database, filesystem, or s3"
  - database = attach_data 
  - filesystem = data/attachments directory (all files)
  - s3 = amazon s3 storage
- increase the filesystem bucket count from 100 to 1000
  - we have over one million attachments, that's 10000 files with 100 buckets
- add parameters for s3 configuration
- remove attachment data searching support
  - very low usage (0.004% of all queries)
- implement s3 support in bugzilla::attachment
  - need to support creation and deletion
  - implement deletion of filesystem stored attachments (set the file length to 0 bytes instead of unlinking)
- migrate existing attachments out of the database
  - bugzilla's settings only applies to new attachments, we need migration code
  - implement as a checksetup switch instead of a separate script
Comment 1 User image Reed Loden [:reed] (use needinfo?) 2015-05-03 23:43:46 PDT
(In reply to Byron Jones ‹:glob› from comment #0)
> - remove attachment data searching support
>   - very low usage (0.004% of all queries)

Pretty sure the security team relies on this for bug bounty stuff. Might want to check with them.
Comment 2 User image Byron Jones ‹:glob› 2015-05-04 06:47:03 PDT
(In reply to Reed Loden [:reed] (use needinfo?) from comment #1)
> Pretty sure the security team relies on this for bug bounty stuff. Might
> want to check with them.

dveditz, are you aware of any security processes that require searching attachment _content_ on bugzilla?
if so we'll have to find alternative ways to get the required data to you.

(i couldn't see any searches from your team in the logs when i looked at all of the data for march).
Comment 3 User image Daniel Veditz [:dveditz] 2015-05-04 14:34:38 PDT
I'm not aware of searching for content, but we do occasionally search for attachment _description_. Once we get the Hall of Fame pages up we'll be using that even more (but again, not attachment content). We used to rely on these searches more but less so since adding the sec-bounty flag.
Comment 4 User image Byron Jones ‹:glob› 2015-05-04 21:04:12 PDT
thanks dveditz :)


another change we'll need:
- determine attachment size when uploaded, and store it in a new column in the attachments table
  - currently it's generated upon request, either by length(thedata) or -s $filename
  - checksetup migration required to populate the column
Comment 5 User image Reed Loden [:reed] (use needinfo?) 2015-07-16 22:19:25 PDT
Make sure you encrypt files at rest in S3. That's not done by default. You may also want to use AWS KMS for encryption as well.
Comment 6 User image Byron Jones ‹:glob› 2015-07-19 21:55:57 PDT
(In reply to Reed Loden [:reed] (use needinfo?) from comment #5)
> Make sure you encrypt files at rest in S3. That's not done by default. You
> may also want to use AWS KMS for encryption as well.

both good points; i'll add them to the aws deployment bug.
Comment 7 User image Byron Jones ‹:glob› 2015-08-03 01:45:29 PDT
Created attachment 8642300 [details] [diff] [review]
1160929_1.patch

- adds Bugzilla::Attachment::S3
- adds params
- adds way for a config class to check inter-dependant params
- tweaks to scripts/migrate-attachments.pl
- inlines Amazon::S3
  - significantly lighter than Net::Amazon::S3
  - Amazon::S3 appears to be abandoned, our copy incorporates a number of fixes


ping me on irc when you're ready to test .. i'll set up an s3 bucket for you under the ateam account.
Comment 8 User image David Lawrence [:dkl] 2015-08-10 16:07:12 PDT
Comment on attachment 8642300 [details] [diff] [review]
1160929_1.patch

Review of attachment 8642300 [details] [diff] [review]:
-----------------------------------------------------------------

Comments:

1. Nit; If we are only going to allow adding/removing buckets from the AWS console, and not from within Bugzilla, then I would just remove all of the methods dealing with
listing, adding, and removing of buckets in Bugzilla::S3. Less code to drag along and maintain. 

2. Created my own AWS account with S3 bucket. 

3. I was able to upload, download, and remove attachments of different types using enter_bug.cgi (standard and guided) and attachment.cgi (add, view and delete).
All operations completed successfully and meta data looked correct in AWS console. 

r=dkl

::: Bugzilla/Attachment/S3.pm
@@ +49,5 @@
> +}
> +
> +sub exists {
> +    my ($self, $attach_id) = @_;
> +    return !!$self->{bucket}->head_key($attach_id);

til; Never used that notation before but now realize it is basically the same as undef ? 1 : 0; so it forces a boolean context :)

::: Bugzilla/S3.pm
@@ +505,5 @@
> +
> +    return $buf;
> +}
> +
> +sub _trim {

Nit: could just use Bugzilla::Util::trim
Comment 9 User image Byron Jones ‹:glob› 2015-08-11 08:18:01 PDT
- removed unused bucket management and related docs
- fixed trim and proxy configuration

To ssh://gitolite3@git.mozilla.org/webtools/bmo/bugzilla.git
   dd7cd13..99bec2e  master -> master

Note You need to log in before you can comment on or make changes to this bug.