Closed Bug 538261 Opened 15 years ago Closed 15 years ago

Prod socorro DB, Services, Web deployment

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

All
Other
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ozten, Unassigned)

Details

Socorro 1.3 deployment instructions:
These instructions are an edited version of http://code.google.com/p/socorro/wiki/SocorroUpgrade#Socorro_1.3

This upgrade requires downtime. Please follow the typical process (maintenance UI, clear caches when finished, etc.).

Push is scheduled for Thursday evening Jan 7th, 2010.

Socorro Server
= Database =

1) ADU
We have a new database table that metrics will populate.
We should create it and start populating it ASAP.
This step doesn't require downtime.

Copied from https://bugzilla.mozilla.org/show_bug.cgi?id=537338

1.1) A new table has been added for the ADU information. To add it manually, execute this SQL:

CREATE TABLE raw_adu (
    adu_count integer,
    date timestamp without time zone,
    product_name text,
    product_os_platform text,
    product_os_version text,
    product_version text
);
1.2)
CREATE INDEX raw_adu_1_idx ON raw_adu (date, 
                                       product_name, 
                                       product_version,
                                       product_os_platform,
                                       product_os_version);
                                       
1.3)
Grant access from cm-metricsetl01 and cm-metricsetl02.

1.4) Nagios Postgres DB alarm
Create a Nagios check on raw_adu.date. Take the max value. It should never be more than 48 hours behind the current time.
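
The freshness rule in 1.4 can be sketched in Python as below. This is only an illustration of the 48-hour comparison, not an existing Nagios plugin: the function name check_raw_adu_freshness and the choice of CRITICAL (rather than WARNING) are assumptions, and the caller is assumed to have already fetched the value with something like `SELECT max(date) FROM raw_adu`.

```python
from datetime import datetime, timedelta

# Standard Nagios plugin exit codes.
NAGIOS_OK = 0
NAGIOS_CRITICAL = 2

# Threshold taken from step 1.4.
MAX_STALENESS = timedelta(hours=48)


def check_raw_adu_freshness(max_date, now):
    """Return (exit_code, message) for the newest raw_adu.date value.

    max_date is the result of `SELECT max(date) FROM raw_adu`, or None
    if the table is empty. Both arguments are naive datetimes, matching
    the `timestamp without time zone` column type.
    """
    if max_date is None:
        # Assumption: an empty table is treated as a failure.
        return NAGIOS_CRITICAL, "CRITICAL: raw_adu is empty"
    age = now - max_date
    if age > MAX_STALENESS:
        return NAGIOS_CRITICAL, "CRITICAL: newest raw_adu row is %s old" % age
    return NAGIOS_OK, "OK: newest raw_adu row is %s old" % age
```

A real check would print the message and exit with the returned code; the database connection is left out on purpose because the connection details are not part of this bug.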

2) New index for 'top_crashes_by_signature' table

CREATE INDEX top_crashes_by_signature_window_end_productdims_id_idx on top_crashes_by_signature (window_end desc, productdims_id);

*** New Index for 'build' column on the reports partitions
-- Be on the lookout for a separate IT bug ***

= Services =

3) Deploy AduByDay
3.1) Update to the latest from trunk
3.2) Update webapiconf.py config
The string 'adubd.AduByDay?' needs to be added to the 'servicesList' configuration parameter in webapiconf.py. In addition, an import for that service is needed.

import socorro.services.topCrashBySignatureTrends as tcbst
import socorro.services.signatureHistory as sighist
import socorro.services.aduByDay as adubd
servicesList = cm.Option()
servicesList.doc = 'a python list of classes to offer as services'
servicesList.default = [tcbst.TopCrashBySignatureTrends, sighist.SignatureHistory, adubd.AduByDay]

= Processor =
4) Update from trunk
A minor change to the processor will require that it be updated to the latest code from trunk. This change is from Bug 520230 that increased the size of a field in the 'extensions' table.

= Collector =

5) Same as step 4: the collector should have its code updated to the latest from trunk.

Bug 534656: The collector now accepts floating point percentages in the configuration parameter 'throttleConditions'. This will be useful for the anticipated flood when client side throttling changes in FF 3.6.
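
To illustrate why fractional percentages matter, here is a minimal sketch of a percentage-based accept/reject decision. It is not the collector's code: passes_throttle and its rng parameter are hypothetical names, and the actual structure of a 'throttleConditions' entry is not shown in this bug, so only the percentage comparison is modelled.

```python
import random


def passes_throttle(percentage, rng=random.random):
    """Return True if a crash report should be accepted.

    percentage is a number from 0 to 100 and may be fractional
    (for example 12.5). rng is any callable returning a float in
    [0, 1); it is injectable so the decision can be tested.
    """
    return rng() * 100.0 < percentage
```

With an integer-only setting the smallest non-zero step is 1%; a float such as 0.5 allows finer control when submission volume jumps.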

= Crons =

A new configuration parameter called 'truncateUrlLength' has been added to the configuration file 'topCrashesByUrlConfig.py'. Use topCrashesByUrlConfig.py.dist as a template to add this new parameter. The actual recommended value has yet to be determined.
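
The effect of a URL length limit can be sketched as follows. This is an assumption about what 'truncateUrlLength' does, based only on its name; truncate_url is a hypothetical helper, not a function from the Socorro source, and no particular value is implied since the recommended one has yet to be determined.

```python
def truncate_url(url, max_length):
    """Return url cut to at most max_length characters.

    A max_length of None is treated as "no limit"; that behaviour is
    an assumption of this sketch, not documented Socorro behaviour.
    """
    if max_length is None:
        return url
    return url[:max_length]
```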

= Socorro UI =

6) Copy and edit daily.php in order to enable ADU.

cp application/config/daily.php-dist application/config/daily.php
corrections:

3.2 - 'adubd.AduByDay?' should be 'adubd.AduByDay' - a wiki artifact?

4 - drop it; I was never able to get the database change made without disrupting everyone else. I'll get griswolf to revert the processor source code change
as per my note in Comment #1 regarding 4 - no source checkin reversion is required
The separate IT bug mentioned in #0 2) is Bug 538313
Depends on: 538313
= Crons =
7) Update and configure crons
7.1) Process Bug#537841 for updates to startTopCrashesByUrl.py
3.3) Grant Socorro db user/pass read access to raw_adu
All done.  Code should sync out to the prod webheads in about 10 minutes.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
(In reply to comment #6)
Thanks Aravind. Verified release in prod.
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard