571228 - Push Socorro 1.7 to production

Reporter

Description

•

14 years ago

The tag for this release is http://socorro.googlecode.com/svn/tags/releases/1.7_r2148_20100610/

Upgrade instructions are at http://code.google.com/p/socorro/wiki/SocorroUpgrade#Socorro_1.7

We have a pre-release meeting in #362 at 10 PT, and will push immediately after, probably around 11 PT.

Laura Thomson :laura

Reporter

Updated

•

14 years ago

Assignee: server-ops → aravind

Aravind Gottipati [:aravind]

Assignee

Comment 1

•

14 years ago

Working on collector configs, waiting for Daniel to turn off crash submissions into HBase.

Daniel Einspanjer [:dre] [:deinspanjer]

Comment 2

•

14 years ago

Downloaded and extracted new hbase version

Daniel Einspanjer [:dre] [:deinspanjer]

Comment 3

•

14 years ago

Symlinked to 0.20.5
copied CDH 2 hadoop jars to hadoop/lib

Daniel Einspanjer [:dre] [:deinspanjer]

Comment 4

•

14 years ago

LZO jar and .so copied to /usr/lib/hbase-0.20.5/lib and .so symlinked.

Daniel Einspanjer [:dre] [:deinspanjer]

Comment 5

•

14 years ago

Official production configs for hbase 0.20.5 copied from ~deinspanjer/hbase_conf to /usr/lib/hbase-0.20.5/conf

Daniel Einspanjer [:dre] [:deinspanjer]

Comment 6

•

14 years ago

hbase-0.20.5-20100602 pushed to all production cluster nodes
chowned to hadoop.hadoop
symlinked to hbase-0.20.5

Aravind Gottipati [:aravind]

Assignee

Comment 7

•

14 years ago

Turned off crash submissions to HBase.  Still collecting into nfs.

Still working on other config files.

Aravind Gottipati [:aravind]

Assignee

Comment 8

•

14 years ago

Copied the maintenance page over the index page; that should propagate out in about 10 minutes.

Daniel Einspanjer [:dre] [:deinspanjer]

Comment 9

•

14 years ago

Thrift stopped.  Starting hbase master shutdown

Daniel Einspanjer [:dre] [:deinspanjer]

Comment 10

•

14 years ago

Master stopped
updated /usr/lib/hbase to point at hbase-0.20.5
Starting master
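
The symlink switch in this step can be sketched as follows. This is a minimal Python illustration, not the commands actually run during the push. It assumes a POSIX filesystem; the helper name `repoint_symlink` and the temporary demo directories are hypothetical stand-ins for /usr/lib/hbase and the versioned install directory.

```python
import os
import tempfile


def repoint_symlink(link_path, new_target):
    """Point link_path at new_target via a temporary link renamed over the old one."""
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(new_target, tmp_link)
    # On POSIX the rename is atomic, so the link is never missing in between.
    os.replace(tmp_link, link_path)


def demo():
    """Create two fake version directories and switch a link from old to new."""
    root = tempfile.mkdtemp()
    old_dir = os.path.join(root, "hbase-old")  # hypothetical previous version
    new_dir = os.path.join(root, "hbase-0.20.5")
    os.mkdir(old_dir)
    os.mkdir(new_dir)
    link = os.path.join(root, "hbase")
    os.symlink(old_dir, link)
    repoint_symlink(link, new_dir)
    return os.path.basename(os.readlink(link))
```

In this thread the master was stopped before the link was changed, so atomicity mattered less here than it would for a live switch.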

Daniel Einspanjer [:dre] [:deinspanjer]

Comment 11

•

14 years ago

Master started.  Waiting for regions to be assigned.

Daniel Einspanjer [:dre] [:deinspanjer]

Comment 12

•

14 years ago

Ran set_meta_block_cache.
Upgrading schema according to https://svn.mozilla.org/metrics/hadoop/socorro/hbase/schema_migration.txt

Daniel Einspanjer [:dre] [:deinspanjer]

Comment 13

•

14 years ago

Almost done with expensive migrations.  Create table statements will be fast.

Daniel Einspanjer [:dre] [:deinspanjer]

Comment 14

•

14 years ago

I was wrong about being almost done.  The second alter is taking much longer than the first.  Unfortunately, I don't have *any* way to tell progress at the moment.

Need one more migration to up the max region file size for the existing crash_reports table. Not in the migration file currently.

alter 'crash_reports', {METHOD => 'table_att', MAX_FILESIZE => '1073741824'}
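
For reference, the MAX_FILESIZE value in the statement above is a byte count. A quick Python check of the arithmetic (the function name `bytes_to_gib` is made up for illustration):

```python
def bytes_to_gib(n_bytes):
    """Convert a byte count to binary gigabytes (GiB)."""
    return n_bytes / (1024 ** 3)


# Value copied from the alter statement, where it is passed as a string.
max_filesize = int("1073741824")
print(bytes_to_gib(max_filesize))  # 1.0
```

So the statement raises the maximum region file size for crash_reports to exactly 1 GiB.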

Daniel Einspanjer [:dre] [:deinspanjer]

Comment 15

•

14 years ago

Second one took 35 minutes.
Doing the table_att MAX_FILESIZE one now.
I think the third alter might be quicker like the first.  The second was probably very slow because the flags: column family already existed.

Daniel Einspanjer [:dre] [:deinspanjer]

Comment 16

•

14 years ago

MAX_FILESIZE alter done. 792 seconds.
Doing last alter now.

Daniel Einspanjer [:dre] [:deinspanjer]

Comment 17

•

14 years ago

Last alter took 162 seconds.
Enabling crash reports table now.
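
As a rough tally of the alter durations reported in comments 15 to 17, here is a short Python sketch. The duration of the first alter was not reported in this bug, so it is left out and the total is a lower bound; the dictionary labels are only descriptive names chosen for this example.

```python
# Durations in seconds, taken from the comments above.
alter_seconds = {
    "second alter": 35 * 60,       # "Second one took 35 minutes."
    "MAX_FILESIZE alter": 792,
    "last alter": 162,
}

total_seconds = sum(alter_seconds.values())
print(total_seconds)                 # 3054
print(round(total_seconds / 60, 1))  # 50.9
```

That is about 51 minutes for the three timed alters, not counting the first one.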

Daniel Einspanjer [:dre] [:deinspanjer]

Comment 18

•

14 years ago

Done enabling big table.
Creating new tables now.

Daniel Einspanjer [:dre] [:deinspanjer]

Comment 19

•

14 years ago

Done with all migrations.
Started thrift servers.
HBase upgrade complete

Aravind Gottipati [:aravind]

Assignee

Comment 20

•

14 years ago

Done with the collectors, monitor, processors and the middleware layer.

Working on the php front end next.

Aravind Gottipati [:aravind]

Assignee

Comment 21

•

14 years ago

I am now done with webapp and cron jobs as well.  All done.

Austin King [:ozten]

Comment 22

•

14 years ago

Please restart web service layer. webapp-php can't talk to it.

Austin King [:ozten]

Comment 23

•

14 years ago

Did a crash me now for d62d8c64-206c-4523-852d-ba5b12100610

it's in hadoop

http://crash-stats.mozilla.com/dumps/d62d8c64-206c-4523-852d-ba5b12100610.jsonz
is a 500 system error
"Internal Server Error

The server encountered an internal error or misconfiguration and was unable to complete your request.

Please contact the server administrator, root@localhost and inform them of the time the error occurred, and anything you might have done that may have caused the error.

More information about this error may be available in the server error log.
Apache/2.2.3 (Red Hat) Server at dm-bp-mware01.mozilla.org Port 80"

Ryan Snyder [:ryansnyder] [:rsnyder] [:rysny]

Comment 24

•

14 years ago

The advanced search page at http://crash-stats.mozilla.com/query is encountering connection reset errors with every single query and has not completed a successful query yet.

Ryan Snyder [:ryansnyder] [:rsnyder] [:rysny]

Comment 25

•

14 years ago

Upon the first page load at http://crash-stats.mozilla.com/query, I am receiving this error:

"The maximum query date range you can perform is days. Admins may log in to increase query date range limits. Query results have been narrowed to the default range of ."

This leads me to believe that a config file is missing certain values.  Please ensure this array is found at the bottom of application/config/application.php:

/**
 * The query range limit for users who have the role of user and admin.
 *
 * @see My_SearchReportHelper->normalizeDateUnitAndValue()
 */
$config['query_range_defaults'] = array(
    'admin' => array(
        'range_default_value' => 14,
        'range_default_unit' => 'days',
        'range_limit_value_in_days' => 120
    ),
    'user' => array(
        'range_default_value' => 14,
        'range_default_unit' => 'days',
        'range_limit_value_in_days' => 30
    )
);
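
A pre-push check along these lines could catch the missing values. This is a hypothetical Python sketch, not part of Socorro or the PHP webapp: the dictionary mirrors the PHP array above, and the function name `missing_range_keys` is invented for illustration.

```python
REQUIRED_ROLES = ("admin", "user")
REQUIRED_KEYS = (
    "range_default_value",
    "range_default_unit",
    "range_limit_value_in_days",
)


def missing_range_keys(config):
    """Return sorted 'role.key' names absent from config['query_range_defaults']."""
    defaults = config.get("query_range_defaults", {})
    missing = []
    for role in REQUIRED_ROLES:
        role_cfg = defaults.get(role, {})
        for key in REQUIRED_KEYS:
            if key not in role_cfg:
                missing.append(f"{role}.{key}")
    return sorted(missing)


# Python mirror of the PHP array quoted above.
expected_config = {
    "query_range_defaults": {
        "admin": {
            "range_default_value": 14,
            "range_default_unit": "days",
            "range_limit_value_in_days": 120,
        },
        "user": {
            "range_default_value": 14,
            "range_default_unit": "days",
            "range_limit_value_in_days": 30,
        },
    }
}
```

Calling `missing_range_keys(expected_config)` returns an empty list, while a config without the array reports all six role and key combinations.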

Austin King [:ozten]

Comment 26

•

14 years ago

(In reply to comment #22)
(In reply to comment #23)
These are fixed.

Aravind Gottipati [:aravind]

Assignee

Comment 27

•

14 years ago

Running the daily crash job now.  It's set to run in cron at 00:15.

Status: NEW → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

Austin King [:ozten]

Comment 28

•

14 years ago

Verification status:

Many features are working except:

1) ADU Report as noted in Comment #27
2) Most Search Queries are timing out

Performance has been improving and load on Postgres has dropped from 8 to 6 to 1 in the last hour. We're going to build out #1 and regroup at 6:40pm to see how #2 is looking.

Austin King [:ozten]

Comment 29

•

14 years ago

(In reply to comment #28)
WTF... #2 advanced search is fixed.

Austin King [:ozten]

Comment 30

•

14 years ago

http://crash-stats.mozilla.com/daily is working now.

Verification complete.

Stephen Donner [:stephend] Not actively reading bugmail

Comment 31

•

14 years ago

I can't verify all the bugs pushed in 1.7, but I've run through a series of post-push tests, and it's looking good to me (plus comment 29 and comment 30; thanks, Austin!)

Verified.

Status: RESOLVED → VERIFIED

Nobody; OK to take it and work on it

Updated

•

11 years ago

Component: Server Operations: Web Operations → WebOps: Other

Product: mozilla.org → Infrastructure & Operations

BMO Automation

Updated

•

5 years ago

Product: Infrastructure & Operations → Infrastructure & Operations Graveyard