Set up an upload server for tooltool

RESOLVED FIXED

Status

Infrastructure & Operations
RelOps
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: dustin, Assigned: dustin)

Tracking

other
Bug Flags:
sec-review +

Details

Attachments

(1 attachment)

(Assignee)

Description

4 years ago
Uploads for tooltool (https://mana.mozilla.org/wiki/display/IT/Tooltool) will be performed via rsync over SSH.  The uploaded files will end up on the releng_web netapp volume, from where they will be installed into tooltool using a "sync" task run from cron.
(Assignee)

Comment 1

4 years ago
There will be roughly 20 people with upload access, some coming from outside of the Mozilla network.
(Assignee)

Comment 2

4 years ago
For testing, this is set up on relengwebadm.private.scl3.mozilla.com, to which relengers have access.  The path is /tooltool/uploads.  Simone, can you give it a try?
Assignee: relops → dustin
Flags: needinfo?(sbruno)
Hi Dustin,

I performed the following operations:

- I created the folder /tooltool/uploads/sbruno/test on relengwebadm.private.scl3.mozilla.com and gave the sbruno user write access to it (the sbruno user actually owns the /tooltool/uploads/sbruno folder)
- I put three random PDF files in the folder /Users/sbruno/my_tooltool_files on my local machine and tried to upload them with the following command:
./tooltool.py distribute -v --folder /Users/sbruno/my_tooltool_files --message "This is where some useful comments are suppose to be put, but this is just a test" --user "sbruno" --host "relengwebadm.private.scl3.mozilla.com" --path "/tooltool/uploads/sbruno/test"

It worked as expected on the first attempt!

The following is the output generated by the tooltool command:

Simones-MacBook-Pro:build-tooltool sbruno$ ./tooltool.py distribute -v --folder /Users/sbruno/my_tooltool_files --message "This is where some useful comments are suppose to be put, but this is just a test" --user "sbruno" --host "relengwebadm.private.scl3.mozilla.com" --path "/tooltool/uploads/sbruno/test"
DEBUG - processing 'distribute' command with args ''
DEBUG - using options: {'verbose': True, 'algorithm': 'sha512', 'package': None, 'base_url': None, 'manifest': 'manifest.tt', 'host': 'relengwebadm.private.scl3.mozilla.com', 'quiet': False, 'user': 'sbruno', 'path': '/tooltool/uploads/sbruno/test', 'folder': '/Users/sbruno/my_tooltool_files', 'size': 0.0, 'message': 'This is where some useful comments are suppose to be put, but this is just a test', 'overwrite': False, 'cache_folder': None}
INFO - Creating package /Users/sbruno/my_tooltool_files.TOOLTOOL-PACKAGE from folder /Users/sbruno/my_tooltool_files...
DEBUG - creating a new manifest file
DEBUG - adding /Users/sbruno/my_tooltool_files/file1.pdf
DEBUG - hashed /Users/sbruno/my_tooltool_files/file1.pdf with sha512 to be 630d01a329c70aedb66ae7118d12ff7dc6fe06223d1c27b793e1bacc0ca84dd469ec1a6050184f8d9c35a0636546b0e2e5be08d9b51285e53eb1c9f959fef59d
DEBUG - creating FileRecord 0x10fb39ad0
DEBUG - Added file /Users/sbruno/my_tooltool_files/file1.pdf to tooltool package /Users/sbruno/my_tooltool_files.TOOLTOOL-PACKAGE with hash 630d01a329c70aedb66ae7118d12ff7dc6fe06223d1c27b793e1bacc0ca84dd469ec1a6050184f8d9c35a0636546b0e2e5be08d9b51285e53eb1c9f959fef59d
DEBUG - appending a new file record to manifest file
DEBUG - added '/Users/sbruno/my_tooltool_files/file1.pdf' to manifest
DEBUG - adding /Users/sbruno/my_tooltool_files/file2.pdf
DEBUG - hashed /Users/sbruno/my_tooltool_files/file2.pdf with sha512 to be b2a463249bb3a9e7f2a3604697b000d2393db4f37b623fc099beb8456fbfdb332567013a3131ad138d8633cb19c50a8b77df3990d67500af896cada8b6f698b4
DEBUG - creating FileRecord 0x10fb39b50
DEBUG - Added file /Users/sbruno/my_tooltool_files/file2.pdf to tooltool package /Users/sbruno/my_tooltool_files.TOOLTOOL-PACKAGE with hash b2a463249bb3a9e7f2a3604697b000d2393db4f37b623fc099beb8456fbfdb332567013a3131ad138d8633cb19c50a8b77df3990d67500af896cada8b6f698b4
DEBUG - appending a new file record to manifest file
DEBUG - added '/Users/sbruno/my_tooltool_files/file2.pdf' to manifest
DEBUG - adding /Users/sbruno/my_tooltool_files/file3.pdf
DEBUG - hashed /Users/sbruno/my_tooltool_files/file3.pdf with sha512 to be 931eb84f798dc9add1a10c7bbd4cc85fe08efda26cac473411638d1f856865524a517209d4c7184d838ee542c8ebc9909dc64ef60f8653a681270ce23524e8e4
DEBUG - creating FileRecord 0x10fb39b90
DEBUG - Added file /Users/sbruno/my_tooltool_files/file3.pdf to tooltool package /Users/sbruno/my_tooltool_files.TOOLTOOL-PACKAGE with hash 931eb84f798dc9add1a10c7bbd4cc85fe08efda26cac473411638d1f856865524a517209d4c7184d838ee542c8ebc9909dc64ef60f8653a681270ce23524e8e4
DEBUG - appending a new file record to manifest file
DEBUG - added '/Users/sbruno/my_tooltool_files/file3.pdf' to manifest
INFO - Package /Users/sbruno/my_tooltool_files.TOOLTOOL-PACKAGE has been created from folder /Users/sbruno/my_tooltool_files
INFO - The following three rsync commands will be executed to transfer the tooltool package:
INFO - 1) rsync  -a /Users/sbruno/my_tooltool_files.TOOLTOOL-PACKAGE sbruno@relengwebadm.private.scl3.mozilla.com:/tooltool/uploads/sbruno/test --progress -f '- *.tt' -f '- *.txt'
INFO - 2) rsync  /Users/sbruno/my_tooltool_files.TOOLTOOL-PACKAGE/*.txt sbruno@relengwebadm.private.scl3.mozilla.com:/tooltool/uploads/sbruno/test --progress
INFO - 3) rsync  /Users/sbruno/my_tooltool_files.TOOLTOOL-PACKAGE/*.tt sbruno@relengwebadm.private.scl3.mozilla.com:/tooltool/uploads/sbruno/test --progress
INFO - Please note that the order of execution IS relevant!
INFO - Uploading hashed files with command: rsync  -a /Users/sbruno/my_tooltool_files.TOOLTOOL-PACKAGE sbruno@relengwebadm.private.scl3.mozilla.com:/tooltool/uploads/sbruno/test --progress -f '- *.tt' -f '- *.txt'
INFO - building file list ...
4 files to consider
INFO - my_tooltool_files.TOOLTOOL-PACKAGE/
INFO - my_tooltool_files.TOOLTOOL-PACKAGE/630d01a329c70aedb66ae7118d12ff7dc6fe06223d1c27b793e1bacc0ca84dd469ec1a6050184f8d9c35a0636546b0e2e5be08d9b51285e53eb1c9f959fef59d
     3017536 100%  379.58kB/s    0:00:07 (xfer#1, to-check=2/4)
INFO - my_tooltool_files.TOOLTOOL-PACKAGE/931eb84f798dc9add1a10c7bbd4cc85fe08efda26cac473411638d1f856865524a517209d4c7184d838ee542c8ebc9909dc64ef60f8653a681270ce23524e8e4
     3420686 100%  133.17kB/s    0:00:25 (xfer#2, to-check=1/4)
INFO - my_tooltool_files.TOOLTOOL-PACKAGE/b2a463249bb3a9e7f2a3604697b000d2393db4f37b623fc099beb8456fbfdb332567013a3131ad138d8633cb19c50a8b77df3990d67500af896cada8b6f698b4
      139308 100%   75.96kB/s    0:00:01 (xfer#3, to-check=0/4)
INFO -
INFO - sent 6578990 bytes  received 92 bytes  125315.85 bytes/sec
INFO - total size is 6577530  speedup is 1.00
INFO - Uploading metadata files (notes)    with command: rsync  /Users/sbruno/my_tooltool_files.TOOLTOOL-PACKAGE/*.txt sbruno@relengwebadm.private.scl3.mozilla.com:/tooltool/uploads/sbruno/test --progress
INFO - my_tooltool_files.txt
          81 100%    0.00kB/s    0:00:00 (xfer#1, to-check=0/1)
INFO -
INFO - sent 183 bytes  received 42 bytes  64.29 bytes/sec
INFO - total size is 81  speedup is 0.36
INFO - Uploading metadata files (manifest) with command: rsync  /Users/sbruno/my_tooltool_files.TOOLTOOL-PACKAGE/*.tt sbruno@relengwebadm.private.scl3.mozilla.com:/tooltool/uploads/sbruno/test --progress
INFO - my_tooltool_files.tt
         646 100%    0.00kB/s    0:00:00 (xfer#1, to-check=0/1)
INFO -
INFO - sent 747 bytes  received 42 bytes  225.43 bytes/sec
INFO - total size is 646  speedup is 0.82
INFO - Package /Users/sbruno/my_tooltool_files.TOOLTOOL-PACKAGE has been correctly uploaded to relengwebadm.private.scl3.mozilla.com:/tooltool/uploads/sbruno/test

I verified that all the files have been correctly transferred to the upload server, in the proper location and with the correct filenames; the manifest file has also been generated correctly, as has the comment file. You could actually already use my upload to run the first tests on the sync script!
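
For reference, the upload order reported in the log above can be sketched in Python. This is only an illustration, not tooltool's actual code: `build_upload_commands` is a hypothetical helper, and the idea that the manifest goes last so that the sync script never sees a manifest before its files have arrived is an assumption (the log only states that the order is relevant).

```python
import shlex


def build_upload_commands(package_dir, user, host, remote_path):
    """Return the three rsync command lines in the order the log lists them.

    Hypothetical helper for illustration; the rsync flags are copied from
    the log output above.
    """
    pkg = shlex.quote(package_dir)
    dest = shlex.quote(f"{user}@{host}:{remote_path}")
    return [
        # 1) hashed payload files, excluding manifest (*.tt) and notes (*.txt)
        f"rsync -a {pkg} {dest} --progress -f '- *.tt' -f '- *.txt'",
        # 2) notes
        f"rsync {pkg}/*.txt {dest} --progress",
        # 3) manifest, sent last
        f"rsync {pkg}/*.tt {dest} --progress",
    ]


commands = build_upload_commands(
    "/Users/sbruno/my_tooltool_files.TOOLTOOL-PACKAGE",
    "sbruno",
    "relengwebadm.private.scl3.mozilla.com",
    "/tooltool/uploads/sbruno/test",
)
for command in commands:
    print(command)
```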
Flags: needinfo?(sbruno)
(Assignee)

Comment 4

4 years ago
Awesome!  I'd rather you didn't have to create the directory in advance, though.  Can the rsync command do so, or do we need to have puppet do it for each allowed user?
Hi Dustin,

I'd prefer the puppet solution, in order to keep full control over the layout of the upload folders and prevent users from uploading to folders that do not correspond to any supported distribution type (which will probably be just "pub" and "pvt").
Dustin, were you able to run the sync script using the test upload I made?
Flags: needinfo?(dustin)
(Assignee)

Comment 7

4 years ago
It should be running on a cronjob, and thus have run many times by now.

I see logs of that occurring:

Oct 29 07:40:01 relengwebadm CROND[10316]: (root) CMD (cd /data/releng/src/tooltool && /usr/bin/python2.7 tooltool/sync.py)

tooltool_sync.log is empty.  The script runs and exits with a status of 0.
Flags: needinfo?(dustin)
The reason why nothing happened so far is that the sync script was (correctly!) configured to support "pvt" and "pub" distributions, while I uploaded my tooltool package to a "test" folder.
By the way, this is exactly the scenario I was referring to in Comment 5 (a user uploading to an unsupported folder) when I expressed my preference for a centralized setup of the upload directory layout. It happened because, as an uploader, I had the right to create a test folder. Had I not had that privilege, the upload attempt would have failed.
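
To illustrate why the "test" upload was ignored, here is a minimal sketch of the kind of check involved. It assumes uploads live under `<upload root>/<user>/<distribution type>` and that only "pvt" and "pub" are supported; `classify_upload` is a hypothetical name, not a function from sync.py.

```python
from pathlib import PurePosixPath

# Distribution types the sync script is configured for, per this bug.
SUPPORTED_DISTRIBUTIONS = {"pvt", "pub"}


def classify_upload(upload_root, upload_dir):
    """Split an upload directory into (user, distribution, supported).

    Raises ValueError if upload_dir is not exactly two levels below
    upload_root.
    """
    relative = PurePosixPath(upload_dir).relative_to(upload_root)
    if len(relative.parts) != 2:
        raise ValueError(f"unexpected upload layout: {upload_dir}")
    user, distribution = relative.parts
    return user, distribution, distribution in SUPPORTED_DISTRIBUTIONS


print(classify_upload("/tooltool/uploads", "/tooltool/uploads/sbruno/test"))
print(classify_upload("/tooltool/uploads", "/tooltool/uploads/sbruno/pvt"))
```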

I ran an upload to my pvt folder, and this time the sync script processed it correctly and notified me with an appropriate message (I received it even though my user is not in the email_mapping in the config file, since in that case a default user@mozilla.com address is tried).
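
The fallback described here can be sketched as a simple lookup. The function name `notification_address` is hypothetical; the mapping and default domain are passed in as plain arguments, and how sync.py actually reads them is not shown in this bug.

```python
def notification_address(user, email_mapping, default_domain):
    """Return the mapped address for user, or user@default_domain if unmapped."""
    return email_mapping.get(user, f"{user}@{default_domain}")


mapping = {"dmitchell": "dustin@mozilla.com"}
print(notification_address("dmitchell", mapping, "mozilla.com"))
print(notification_address("sbruno", mapping, "mozilla.com"))
```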

I had a look at the log file and found a minor mistake in how a %s in a string is resolved - I'll fix that now.

Thanks Dustin for the machine setup!
I just ran an upload of an intentionally corrupted tooltool package, and sync.py worked as expected in this scenario as well:
- I received an appropriate email notifying me of the upload failure and the reason for it
- The upload folder has been cleaned up (manifest, notes and artifacts have been removed)
- Everything has been logged
(Assignee)

Comment 10

4 years ago
Sweet!  I'll rework the puppet a bit to create those directories, unless you want to take a whack at it.
Thanks a lot Dustin! I think you'll be quicker in applying the puppet changes than I could possibly be :-)

A brief status summary:

The upload server has been set up correctly, and the sync script seems to be working as expected.

Next steps:

- set up the actual prod upload server (if I understood correctly, this one is just for testing)
- identify the list of uploaders and create an appropriate directory structure, with a "pvt" and a "pub" sub-folder for each uploader

Finally, all artifacts will need to be rsync'd to the actual Apache servers (there could be more than one per distribution type, for resiliency) via a dedicated crontab entry.

rail: are you able to provide a reasonable list of tooltool uploaders to start with?

Updated

4 years ago
Flags: needinfo?(rail)
let's start with releng people?
Flags: needinfo?(rail)
Ok, let's start just with releng people and extend the uploaders list later.
Thanks!
(Assignee)

Comment 14

4 years ago
OK, puppet changes are in:

> 279300    4 drwxr-xr-x   4 root     root         4096 Nov  1 11:51 /mnt/netapp/relengweb/tooltool/uploads/nthomas
> 419769    4 drwxr-xr-x   2 nthomas  root         4096 Nov  1 11:51 /mnt/netapp/relengweb/tooltool/uploads/nthomas/pub
> 419770    4 drwxr-xr-x   2 nthomas  root         4096 Nov  1 11:51 /mnt/netapp/relengweb/tooltool/uploads/nthomas/pvt
> 279301    4 drwxr-xr-x   4 root     root         4096 Nov  1 11:51 /mnt/netapp/relengweb/tooltool/uploads/coop
> 279318    4 drwxr-xr-x   2 coop     root         4096 Nov  1 11:51 /mnt/netapp/relengweb/tooltool/uploads/coop/pub
> 233583    4 drwxr-xr-x   2 coop     root         4096 Nov  1 11:51 /mnt/netapp/relengweb/tooltool/uploads/coop/pvt

etc.
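
The layout in the listing above can be described as a small plan of (path, owner, group, mode) entries. This is only a sketch of the resulting structure, not the puppet code itself; `upload_dir_plan` is a hypothetical helper and it does not touch the filesystem.

```python
def upload_dir_plan(upload_root, users, distributions=("pub", "pvt")):
    """Return (path, owner, group, mode) tuples matching the listing above.

    The per-user parent is owned by root, so uploaders cannot create
    arbitrary sub-folders; only the pub/pvt leaves are owned by the user.
    """
    plan = []
    for user in users:
        base = f"{upload_root}/{user}"
        plan.append((base, "root", "root", 0o755))
        for distribution in distributions:
            plan.append((f"{base}/{distribution}", user, "root", 0o755))
    return plan


plan = upload_dir_plan("/mnt/netapp/relengweb/tooltool/uploads", ["nthomas", "coop"])
for path, owner, group, mode in plan:
    print(f"{mode:o} {owner:8} {group:5} {path}")
```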

I'm waiting to hear back from joes on whether we should set this up permanently on the upload servers, or set up purpose-specific hosts for it.
(Assignee)

Comment 15

4 years ago
From talking to :tinfoil, it sounds like we should do a second set of boxes to handle this.  Those can be limited to access only by designated people, and hopefully limited to just using rsync.  I'll loop back with :tinfoil when I've got something set up to see who wants to do a secreview of the implementation.
(Assignee)

Comment 16

4 years ago
This isn't a 24/7 critical service, and will be used infrequently, so we can get away with only one host, built from puppet of course.  If it goes down in flames, we'll just build another.
(Assignee)

Updated

4 years ago
Depends on: 945420
(Assignee)

Comment 17

4 years ago
So my plan is this:

Make an LDAP group granting access to this system.

For members of that group, via sshd_config:
 - chroot /chroot, with proper symlinks and mounts from there
 - no tcp or x11 forwarding
 - force command rsync

For others who are not members of wheel:
 - refuse login
 (this covers any intermediate states when an account still exists on the host but has been removed from the group)

I need some help getting the LDAP group linked into puppet before I can implement.  But does this sound reasonable at a high level?
(Assignee)

Updated

4 years ago
Depends on: 948457
(Assignee)

Comment 18

4 years ago
The secreview for this implementation should include consideration of the implications of adding an interface to /vol/releng_web to the dmz.  The other option would be a volume dedicated to tooltool.
(Assignee)

Comment 19

4 years ago
OK, this is ready to go.  The relevant puppet code is in modules/tooltool_uploads.  It turned out roughly as I'd planned, but it was a bit more difficult than I thought!
Flags: sec-review?
(Assignee)

Comment 20

4 years ago
"go" = security review

Updated

4 years ago
Flags: sec-review? → sec-review?(jvehent)
opsec meeting scheduled for 12/16 to discuss this
(Assignee)

Updated

4 years ago
Depends on: 950697
The puppet module looks good from a security standpoint. r+
Flags: sec-review?(jvehent) → sec-review+

Updated

4 years ago
See Also: → bug 884347
(Assignee)

Comment 23

4 years ago
We're using a different mountpoint for uploads now, to isolate access from the uploads host to anything else on the relengweb volume.  I just finished that bit up, and the crontask now moves files from the tooltool_uploads volume to the relengweb volume.
(Assignee)

Comment 24

4 years ago
simone: this is up now, at tooltool-uploads.pub.build.mozilla.org.

Can you give it a shot (outside any VPNs) and verify that uploads appear?
Flags: needinfo?(sbruno)
Created attachment 8349284 [details]
Output of tooltool upload on tooltool-uploads.pub.build.mozilla.org

Hi Dustin,

I ran an upload to tooltool-uploads.pub.build.mozilla.org as follows:

./tooltool.py distribute -v --folder /Users/sbruno/my_tooltool_files --message "blabla" --user "sbruno" --host tooltool-uploads.pub.build.mozilla.org --path "/tooltool/uploads/sbruno/pub"

As you can see in the attached output file, the command executed successfully. 

Furthermore, I received a success notification email:

TOOLTOOL UPLOAD COMPLETED! Tooltool package my_tooltool_files9.tt has been correctly processed by the tooltool sync script!

So apparently the sync script also executed successfully.

What is the target folder you configured for the sync script for the "pub" distribution type? Are the files already propagated from that target folder to some tooltool server? I did not see anything appearing on http://tooltool.pub.build.mozilla.org/try/
Flags: needinfo?(sbruno)
(Assignee)

Comment 26

4 years ago
Ah, it looks like those are uploading to the root, which is not what we want:

[root@relengwebadm.private.scl3 dmitchell]# cd /data/releng/src/tooltool/
[root@relengwebadm.private.scl3 tooltool]# cat config.json
{
    "upload_root": "/mnt/netapp/tooltool_uploads",
    "target_folders": {"pvt": "/mnt/netapp/relengweb/tooltool/pvt",
                       "pub": "/mnt/netapp/relengweb/tooltool/pub"},
    "smtp_server": "localhost",
    "smtp_port": 25,
    "user_email_mapping": {"dmitchell": "dustin@mozilla.com"},
    "default_domain": "mozilla.com",
    "smtp_from": "no-reply@mozilla.org"
}
[root@relengwebadm.private.scl3 tooltool]# ls /mnt/netapp/relengweb/tooltool/{pvt,pub}/
/mnt/netapp/relengweb/tooltool/pub/:
630d01a329c70aedb66ae7118d12ff7dc6fe06223d1c27b793e1bacc0ca84dd469ec1a6050184f8d9c35a0636546b0e2e5be08d9b51285e53eb1c9f959fef59d  temp-sm-stuff
931eb84f798dc9add1a10c7bbd4cc85fe08efda26cac473411638d1f856865524a517209d4c7184d838ee542c8ebc9909dc64ef60f8653a681270ce23524e8e4  try
b2a463249bb3a9e7f2a3604697b000d2393db4f37b623fc099beb8456fbfdb332567013a3131ad138d8633cb19c50a8b77df3990d67500af896cada8b6f698b4

/mnt/netapp/relengweb/tooltool/pvt/:
630d01a329c70aedb66ae7118d12ff7dc6fe06223d1c27b793e1bacc0ca84dd469ec1a6050184f8d9c35a0636546b0e2e5be08d9b51285e53eb1c9f959fef59d
931eb84f798dc9add1a10c7bbd4cc85fe08efda26cac473411638d1f856865524a517209d4c7184d838ee542c8ebc9909dc64ef60f8653a681270ce23524e8e4
b2a463249bb3a9e7f2a3604697b000d2393db4f37b623fc099beb8456fbfdb332567013a3131ad138d8633cb19c50a8b77df3990d67500af896cada8b6f698b4
build


Maybe we should have target folders like this instead:

    "target_folders": {"pvt-build": "/mnt/netapp/relengweb/tooltool/pvt/build",
                       "pub-try":   "/mnt/netapp/relengweb/tooltool/pub/try"},

Or maybe we should just serve files at the root of the vhosts, rather than adding these extra directory components (build and try)?  Will we ever need those?

The script is also copying manifests to the directory where the script runs.  How can I configure where that goes?


[root@relengwebadm.private.scl3 tooltool]# ls
630d01a329c70aedb66ae7118d12ff7dc6fe06223d1c27b793e1bacc0ca84dd469ec1a6050184f8d9c35a0636546b0e2e5be08d9b51285e53eb1c9f959fef59d.MANIFESTS  sbruno.pub.2013_12_18-01.25.01.my_tooltool_files9.txt
931eb84f798dc9add1a10c7bbd4cc85fe08efda26cac473411638d1f856865524a517209d4c7184d838ee542c8ebc9909dc64ef60f8653a681270ce23524e8e4.MANIFESTS  sbruno.pub.2013_12_18-01.25.02.my_tooltool_files10.tt
b2a463249bb3a9e7f2a3604697b000d2393db4f37b623fc099beb8456fbfdb332567013a3131ad138d8633cb19c50a8b77df3990d67500af896cada8b6f698b4.MANIFESTS  sbruno.pub.2013_12_18-01.25.02.my_tooltool_files10.txt
config.json                                                                                                                                 tooltool
README-webapp.txt                                                                                                                           tooltool_sync.log
sbruno.pub.2013_12_18-01.25.01.my_tooltool_files9.tt
Hi Dustin!

- The idea is for the sync script to transfer files to the intermediate target folders, which are meant to be the authoritative source of information about which files should be available on the tooltool servers for each distribution type. This is why only a single target folder per distribution type is currently supported. Target folder names are arbitrary, so the names you propose are OK.

- As for the "abuse" of the directory the sync script runs from, I added a comment on Bug 930568. The location where those files are stored is not configurable at the moment.
(Assignee)

Comment 28

4 years ago
I think this is done from my perspective -- at least, I don't have any action to take here.
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Dustin, the sync script is still configured to copy files from the users' upload folders to the following locations:

"target_folders": {"pvt": "/mnt/netapp/relengweb/tooltool/pvt",
                   "pub": "/mnt/netapp/relengweb/tooltool/pub"}

The folders currently served by apache, though, are:
/mnt/netapp/relengweb/tooltool/pvt/build/sha512 (for private stuff)
/mnt/netapp/relengweb/tooltool/pub/temp-sm-stuff/sha512/ (for public stuff)

Consequently, even though the sync script is working, successful uploads do not make the artifacts available on the download servers.

What I would do is the following:
- enable a crontabbed rsync command to keep the target folders of the sync script aligned with all the folders served by apache.

You may reasonably object that it would be simpler to just set the folders used by apache as target folders in the sync script configuration.
This would work as long as we have a one-to-one mapping like this:
/mnt/netapp/relengweb/tooltool/pvt -> /mnt/netapp/relengweb/tooltool/pvt/build/sha512
/mnt/netapp/relengweb/tooltool/pub -> /mnt/netapp/relengweb/tooltool/pub/temp-sm-stuff/sha512/ 

The intermediate rsync would let us support generic setups with multiple servers for resiliency, as follows:

/mnt/netapp/relengweb/tooltool/pvt
                                   rsync to -> folder1
                                   rsync to -> folder2 for resiliency
                                   rsync to -> folder3 for resiliency
/mnt/netapp/relengweb/tooltool/pub
                                   rsync to -> folder1
                                   rsync to -> folder2 for resiliency

Another advantage of such a setup is that the target folder of sync script would be the authoritative source of truth regarding what the content needs to be (which is needed whenever there are synchronization tasks).
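
A rough sketch of the fan-out I have in mind, generating one rsync invocation per served folder. `mirror_commands` is a hypothetical helper; the example uses the one-to-one mapping listed above, and plain `rsync -a` is an assumption (whether deletions should also be propagated is left open).

```python
def mirror_commands(mirrors):
    """Build one rsync argument list per (authoritative folder, served folder) pair.

    Trailing slashes make rsync copy the folder contents rather than the
    folder itself.
    """
    commands = []
    for source, destinations in mirrors.items():
        for destination in destinations:
            commands.append(
                ["rsync", "-a", source.rstrip("/") + "/", destination.rstrip("/") + "/"]
            )
    return commands


mirrors = {
    "/mnt/netapp/relengweb/tooltool/pvt": [
        "/mnt/netapp/relengweb/tooltool/pvt/build/sha512",
    ],
    "/mnt/netapp/relengweb/tooltool/pub": [
        "/mnt/netapp/relengweb/tooltool/pub/temp-sm-stuff/sha512/",
    ],
}
for command in mirror_commands(mirrors):
    print(" ".join(command))
```

Adding a second served folder for resiliency would then only mean appending another destination to the relevant list.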

What do you think?
Status: RESOLVED → REOPENED
Flags: needinfo?(dustin)
Resolution: FIXED → ---
(Assignee)

Comment 30

4 years ago
I changed config.json to include the build/sha512 and temp-sm-stuff/sha512 directories.  We can figure out mirroring later, and I don't see this change making mirroring harder.
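
A quick consistency check along these lines could confirm that the configured target folders match what Apache serves. This is a sketch only: `unserved_targets` is a hypothetical helper, the served paths are the ones listed earlier in this bug, and the exact values now in config.json are assumed rather than quoted.

```python
# Folders served by Apache, as listed earlier in this bug.
SERVED_BY_APACHE = {
    "/mnt/netapp/relengweb/tooltool/pvt/build/sha512",
    "/mnt/netapp/relengweb/tooltool/pub/temp-sm-stuff/sha512",
}


def unserved_targets(target_folders, served=SERVED_BY_APACHE):
    """Return the target_folders entries that Apache does not serve."""
    return {
        distribution: path
        for distribution, path in target_folders.items()
        if path.rstrip("/") not in served
    }


old_targets = {
    "pvt": "/mnt/netapp/relengweb/tooltool/pvt",
    "pub": "/mnt/netapp/relengweb/tooltool/pub",
}
# Assumed new values, following the directories named in this comment.
new_targets = {
    "pvt": "/mnt/netapp/relengweb/tooltool/pvt/build/sha512",
    "pub": "/mnt/netapp/relengweb/tooltool/pub/temp-sm-stuff/sha512",
}
print(unserved_targets(old_targets))
print(unserved_targets(new_targets))
```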
Flags: needinfo?(dustin)

Updated

4 years ago
Status: REOPENED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED