1126369 - move crash-analysis.m.c/rkaiser/ to EC2

Assignee

Description

•

11 years ago

We currently have a crash-analysis server that hosts KaiRo's reports. They are written in PHP and live in this repo: http://hg.mozilla.org/users/kairo_kairo.at/crash-report-tools/ Let's move this to GitHub and auto-deploy it to Heroku.

Robert Helmer [:rhelmer]

Assignee

Updated

•

11 years ago

Depends on: 1126626

Robert Helmer [:rhelmer]

Assignee

Updated

•

11 years ago

Status: NEW → ASSIGNED

Daniel Maher [:phrawzty]

Comment 2

•

11 years ago

Out of curiosity, why Heroku in specific ?

Robert Helmer [:rhelmer]

Assignee

Comment 3

•

11 years ago

(In reply to Daniel Maher [:phrawzty] from comment #2)
> Out of curiosity, why Heroku in specific ?

Just one less thing for us to manage. It's trivial to auto-deploy to Heroku from Travis, and I've already tested that these PHP scripts (which are intended to run from cron, not web-facing) work fine under the Heroku PHP buildpack. Since Heroku runs on AWS, it has fast and free access to S3 and our hosted services (it needs Postgres too).

If we end up having to do this ourselves, it'd require either a crontabber job or a cron job somewhere, PHP installed in our base image, and we'd have to figure out how to deploy it. All solvable, just a bit of extra work I'd like to avoid if we can.

Robert Helmer [:rhelmer]

Assignee

Updated

•

11 years ago

Depends on: 1126885

Robert Helmer [:rhelmer]

Assignee

Comment 4

•

11 years ago

OK, so I got this working on Heroku and converted most of the scripts over to use S3 instead of the local filesystem: https://github.com/rhelmer/crash-report-tools/compare/automate?expand=1

However, I think this is too much to test in time. In the short term it'd be better to run this on EC2: sync files from S3 before the scripts run, let them run on local files, and then sync the output over to S3.

Summary: move crash-analysis.m.c/rkaiser/ to Heroku → move crash-analysis.m.c/rkaiser/ to EC2
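The short-term plan from comment 4 (pull from S3, run the scripts on local files, push the output back) could be sketched roughly as below. This is only an illustration, not what was deployed: the bucket URL, the report command, and the helper names `build_plan` and `run_plan` are hypothetical, and it assumes the AWS CLI (`aws s3 sync`) is installed and has credentials.

```python
import subprocess
from typing import Callable, List, Sequence

# Hypothetical placeholders; the real bucket and script names are not given in this bug.
LOCAL_DIR = "/mnt/crashanalysis/rkaiser"
REMOTE_URL = "s3://example-bucket/rkaiser"
REPORT_COMMAND = ["php", "example-report.php"]


def build_plan(local_dir: str, remote_url: str,
               report_command: Sequence[str]) -> List[List[str]]:
    """Return the three commands in order: sync down, run reports, sync up."""
    return [
        ["aws", "s3", "sync", remote_url, local_dir],  # S3 -> local files
        list(report_command),                          # scripts work on local files
        ["aws", "s3", "sync", local_dir, remote_url],  # local output -> S3
    ]


def run_plan(plan: Sequence[Sequence[str]],
             runner: Callable = subprocess.check_call) -> None:
    """Run each command in order; check_call raises on the first failure,
    so a failed download or report run never reaches the upload step."""
    for command in plan:
        runner(command)


# A cron job could call:
#   run_plan(build_plan(LOCAL_DIR, REMOTE_URL, REPORT_COMMAND))
```

Stopping at the first failure is a deliberate choice in this sketch: it avoids pushing partial output back to S3 when a report script fails.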

Robert Helmer [:rhelmer]

Assignee

Comment 5

•

10 years ago

We're backing off the approach in comment 4 a bit: we're going to have a long-running instance which can run things the old-fashioned way and expose the same mount points and web service as the old server.

I've installed some packages which we'll need to get into the socorro-infra repo, for rebuilding this box in the future: mercurial, php-xml, pgp-pgsql, nano

Also, I am copying /mnt/crashanalysis/rkaiser/ from the old crash-analysis server. /mnt/crashanalysis should be served statically by nginx on the new box.

Robert Kaiser

Comment 6

•

10 years ago

(In reply to Robert Helmer [:rhelmer] from comment #5)
> I've installed some packages which we'll need to get into the socorro-infra
> repo, for rebuilding this box in the future: mercurial, php-xml, pgp-pgsql,
> nano

FTR, the second-to-last package there is php-pgsql (just a typo Rob made writing it down here).

Daniel Maher [:phrawzty]

Comment 7

•

10 years ago

(In reply to Robert Helmer [:rhelmer] from comment #5)
> I've installed some packages which we'll need to get into the socorro-infra
> repo, for rebuilding this box in the future: mercurial, php-xml, pgp-pgsql,
> nano

https://github.com/mozilla/socorro-infra/pull/169

Robert Helmer [:rhelmer]

Assignee

Comment 8

•

10 years ago

I've mounted a 500 GB EBS volume on /mnt/crashanalysis, and sync'd it with the crashstorage S3 bucket (which was populated from the original crash-analysis server).

KaiRo, you should find the expected data files in /mnt/crashanalysis/rkaiser (and it should be fast enough now), and you should be able to browse this at https://elb-prod-socorroanalysis-1689424753.us-west-2.elb.amazonaws.com/rkaiser/. Can you confirm that this looks OK and everything is working for you?

Robert Helmer [:rhelmer]

Assignee

Comment 9

•

10 years ago

(In reply to Robert Helmer [:rhelmer] from comment #8)
> I've mounted a 500 GB EBS volume on /mnt/crashanalysis, and sync'd it with
> the crashstorage S3 bucket (which was populated from the original
> crash-analysis server)

We'll need to get this new volume into terraform ^

I think we also want to do at least a nightly backup of this volume to the public crash-analysis S3 bucket.
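One way the nightly backup suggested in comment 9 might look is a small script run from cron that mirrors the volume to S3. This is a sketch under assumptions only: the bucket URL and the helper name `backup_command` are hypothetical, and it relies on the AWS CLI's `aws s3 sync` command and its `--dryrun` flag.

```python
import subprocess
from typing import List

VOLUME_DIR = "/mnt/crashanalysis"
BACKUP_URL = "s3://example-public-bucket/crashanalysis"  # hypothetical bucket name


def backup_command(volume_dir: str, bucket_url: str,
                   dry_run: bool = False) -> List[str]:
    """Build the `aws s3 sync` call that copies new or changed files to S3."""
    command = ["aws", "s3", "sync", volume_dir, bucket_url]
    if dry_run:
        # --dryrun lists what would be transferred without copying anything.
        command.append("--dryrun")
    return command


def run_backup(dry_run: bool = False) -> None:
    """Run the backup; raises CalledProcessError if the sync fails."""
    subprocess.check_call(backup_command(VOLUME_DIR, BACKUP_URL, dry_run))
```

Running it first with `dry_run=True` shows what a real run would upload before the job is added to a nightly crontab entry.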

Robert Kaiser

Comment 10

•

10 years ago

The current setup for this box seems to be working now. I'll wait until tomorrow to close this bug, as I want to confirm that the cron job works as well.

Robert Kaiser

Comment 11

•

10 years ago

The cron ran fine today, so I'll mark this bug fixed.

Rob, do we have open bugs on
1) investigating why AWS consistently has ~5% higher crash counts than PHX (I see that across the board with any data on any channel),
2) Switching over to the production URL of https://crash-analysis.mozilla.com/ when we feel ready?

Status: ASSIGNED → RESOLVED

Closed: 10 years ago

Flags: needinfo?(rhelmer)

Resolution: --- → FIXED

Robert Helmer [:rhelmer]

Assignee

Comment 12

•

10 years ago

(In reply to Robert Kaiser (:kairo@mozilla.com) - on vacation or slow to reply until the end of June from comment #11)
> The cron ran fine today, so I'll mark this bug fixed.
>
> Rob, do we have open bugs on
> 1) investigating why AWS consistently has ~5% higher crash counts than PHX
> (I see that across the board with any data on any channel),

Lars and I are investigating this now. In all the cases we've checked so far, PHX wasn't able to find the crash in S3 in time and AWS was. I'd like to understand a bit more why this is happening more in PHX, but it's not a bad thing - we've likely been undercounting by ~5% in PHX due to this error rate. This problem doesn't seem to happen at all in AWS in fact.

> 2) Switching over to the production URL of
> https://crash-analysis.mozilla.com/ when we feel ready?

We're waiting on the SSL certs (bug 1173928), I'll file DNS bugs as services are ready.

Flags: needinfo?(rhelmer)

Robert Kaiser

Comment 13

•

10 years ago

Thanks, sounds good. Both of those.

Categories

(Socorro :: General, task)

Tracking

(Not tracked)

People

(Reporter: rhelmer, Assigned: rhelmer)

Security

(public)
