Closed Bug 1017339 Opened 11 years ago Closed 10 years ago

Create vertica authentication records for all vertica users to use external load balancer (zeus)

Categories

(Data & BI Services Team :: DB: MySQL, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mpressman, Assigned: mpressman)

Details

The current vertica setup does not use external authentication. This is the default, where username/password access is granted to the cluster essentially locally. In order to use zeus, we need to define external authentication. These records are similar in function to pg_hba.conf on a postgres server. In this case, the file is called vertica.conf. The setup is an all-or-nothing approach that requires adding all users to vertica.conf, because once the file contains an entry, that entry is interpreted as governing all access.
Assignee: server-ops-database → mpressman
I have created a template text file that will become the contents of the newly populated vertica.conf. It contains all users, granting them local access, which is what they currently have, plus access through vertica-zlb.metrics.scl3.mozilla.com (10.22.27.64). The local access is kept so that any legacy connections will still work while allowing access through the zeus VIP with vertica in external authentication mode. :rfradinho - Can you ping me on irc when we can implement this? It should only take a few minutes. There shouldn't be any downtime, but I want to make sure that when I do it I don't knock anything off or prevent anything from connecting.
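The template described above can be sketched as a small generator. This is only an illustration, not the actual template used in this bug: the function name `records_for_users` is hypothetical, the record layout (`local <user> <method>` and `host <user> <cidr> <method>`) is inferred from the records quoted elsewhere in this bug, and the exact on-disk syntax of vertica.conf is not shown here, so only the record bodies are produced.

```python
# Hypothetical helper: emit one local and one host record per user.
# The record layout is an assumption based on the records quoted in this bug.

ZEUS_VIP_CIDR = "10.22.27.64/32"  # vertica-zlb.metrics.scl3.mozilla.com


def records_for_users(users, vip_cidr=ZEUS_VIP_CIDR, method="password"):
    """Return authentication record bodies for each user, in sorted order.

    Each user gets a local record (legacy access from cluster nodes) and a
    host record (access through the load balancer address).
    """
    records = []
    for user in sorted(set(users)):  # de-duplicate and keep output stable
        records.append(f"local {user} {method}")
        records.append(f"host {user} {vip_cidr} {method}")
    return records
```

Calling `records_for_users(["etl_prod", "tableau_read"])` would yield four record bodies, two per user, which could then be reviewed by hand before being applied.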
Flags: needinfo?(rfradinho)
:rfradinho contacted me online to get this handled.
Flags: needinfo?(rfradinho)
As of right now, the plan is to implement this on Wednesday, June 11. I discovered a better way to implement this so that a database restart won't be required: the set_config_parameter() function with the ClientAuthentication parameter. It also applies the configuration as a single batch change, which further reduces the risk of any user losing the ability to log in. The planned change is to run:
    dbadmin=> select set_config_parameter('ClientAuthentication', 'host 10.22.27.64/32 password, local all password');
This will set the default policy and still provide the ability to permit or reject access from remote locations for individual users. The first item requires all users connecting through zeus to provide their password. The second is essentially a continuation of the current default policy, where any user attempting to connect locally from a node in the cluster is required to provide a password.
Just confirmed a time with :rfradinho. Wednesday, June 11 at 9:00am Pacific.
This got pushed until later this afternoon due to a meeting conflict.
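As a rough sketch of the batch change, the statement can be assembled from a list of record strings so that all records are applied in one call. The helper name `build_client_auth_sql` is hypothetical, the comma-and-space separator simply mirrors the planned statement in this comment, and the empty-list guard is an added precaution rather than anything Vertica requires.

```python
# Hypothetical helper: build the set_config_parameter statement from records.
# The ", " separator mirrors the statement quoted in this bug.


def build_client_auth_sql(records):
    """Return a single SQL statement that sets ClientAuthentication."""
    if not records:
        # Added precaution: never generate a statement with an empty value.
        raise ValueError("refusing to emit an empty ClientAuthentication value")
    value = ", ".join(record.strip() for record in records)
    escaped = value.replace("'", "''")  # standard SQL single-quote escaping
    return f"select set_config_parameter('ClientAuthentication', '{escaped}');"
```

Passing the two records from the planned change reproduces the statement shown in this comment, which makes it easy to diff a generated statement against the reviewed one before running it as dbadmin.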
We were able to get external auth implemented, but are still unable to connect through zeus. The same behavior is seen when attempting to connect through zeus:
    server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
While implementing, we encountered some remote host connection problems, but have been able to at least get it into a state where connections work the same as they did prior to implementing external auth. I will now be able to modify connection configuration options using my own username so as to not disrupt any of the other services connecting to vertica.
This is looking more like a zeus-specific config issue. I used the remote host etl1.metrics.scl3 (10.22.27.20) as a test for remote host connections, since it appears to vertica external auth the same way vertica-zlb.metrics.scl3.mozilla.com (10.22.27.64) would. The following tests show success and the expected rejections based on the external auth mode configured.
- Config setup where external auth trusts the remote connection, thus not requiring a password from the specified host:
    host mpressman 10.22.27.20/32 trust
  The connection authenticates successfully without requiring a password:
    [mpressman@etl1.metrics.scl3 ~]$ vsql -U mpressman -h vertica.metrics.scl3.mozilla.com
    Welcome to vsql, the Vertica Analytic Database interactive terminal.
    Type: \h or \? for help with vsql commands
          \g or terminate with semicolon to execute query
          \q to quit
    mpressman=>
- Config setup where external auth rejects the remote connection, thus not allowing connections from the specified host:
    host mpressman 10.22.27.20/32 reject
  The connection is immediately rejected:
    [mpressman@etl1.metrics.scl3 ~]$ vsql -U mpressman -h vertica.metrics.scl3.mozilla.com
    vsql: FATAL 2248: Authentication failed for username "mpressman"
- Config setup where the connection requires a password for the remote connection from the specified host:
    host mpressman 10.22.27.20/32 password
    [mpressman@etl1.metrics.scl3 ~]$ vsql -U mpressman -h vertica4.metrics.scl3.mozilla.com
    Password:
    Welcome to vsql, the Vertica Analytic Database interactive terminal.
    Type: \h or \? for help with vsql commands
          \g or terminate with semicolon to execute query
          \q to quit
    mpressman=>
The same three connection scenarios (trust, reject, password), connecting through vertica-zlb.metrics.scl3.mozilla.com:
- Trust config setup:
    host mpressman 10.22.27.64/32 trust
    [mpressman@etl1.metrics.scl3 ~]$ vsql -U mpressman -h 10.22.27.64
    vsql: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
- Reject config setup:
    host mpressman 10.22.27.64/32 reject
    [mpressman@etl1.metrics.scl3 ~]$ vsql -U mpressman -h 10.22.27.64
    vsql: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
- Password config setup:
    host mpressman 10.22.27.64/32 password
    [mpressman@etl1.metrics.scl3 ~]$ vsql -U mpressman -h 10.22.27.64
    vsql: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
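The expected outcomes in these tests can be modelled with a toy matcher. This is a sketch for reasoning about the results, not Vertica's implementation: the function names are hypothetical, and first-match evaluation is an assumption borrowed from pg_hba.conf, which this bug uses as its analogy.

```python
import ipaddress

# Toy model of host record matching. First-match semantics are an assumption
# borrowed from pg_hba.conf; Vertica's actual evaluation may differ.


def parse_host_record(record):
    """Split 'host <user> <cidr> <method>' into (user, network, method)."""
    kind, user, cidr, method = record.split()
    if kind != "host":
        raise ValueError(f"not a host record: {record!r}")
    return user, ipaddress.ip_network(cidr), method


def decide(records, user, client_ip):
    """Return the method of the first matching record, or None if none match."""
    addr = ipaddress.ip_address(client_ip)
    for record in records:
        rec_user, network, method = parse_host_record(record)
        if rec_user in (user, "all") and addr in network:
            return method
    return None
```

Under this model a connection from 10.22.27.64 would resolve to trust, reject, or password depending on the record, so three different client-side results would be expected. Seeing the identical "server closed the connection unexpectedly" message in all three cases suggests the connection is dropped before any record is evaluated, which is consistent with a zeus-side configuration problem.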
Some ETLs with legacy settings failed on 6/12 with authentication problems (bug 1024412). I've reverted the vertica.conf and restarted the cluster to unblock those ETLs.
Ricardo, please let me know the best time for us to test the connection abilities of ALL services.
Flags: needinfo?(rfradinho)
Ricardo and I discussed this on irc, and he mentioned that the ETLs that run every 30m from etl[1-2] were working, but that the ones that failed run at 0400 GMT. Ricardo, can you post the names and any relevant info about them so I can identify them and expand the auth param config options to see what I can do to get them working?
Flags: needinfo?(rfradinho)
Just reposting from comment 10, as it looks like the needinfo for a time to meet is still open, even though that was resolved over irc. However, the request for the names and any relevant info about the ETLs that were failing, so that I can identify them and get external auth working for them, is still outstanding.
Flags: needinfo?(rfradinho)
The ETLs that failed are:
- EOD (Adi processing and exports)
- Optimizer - moves data into Vertica and into aggregates
- Themes stats exports to AMO
They run at 4:00 AM GMT and they failed with:
    Error occured while trying to connect to the database
    Error connecting to database: (using class com.vertica.Driver)
    [Vertica][VJDBC](2248) FATAL: Authentication failed for username "etl_prod"
The problem is in the kettle vertica datasource definition, which needs to be changed as part of: https://bugzilla.mozilla.org/show_bug.cgi?id=1024412
This problem has already happened in other ETLs, so the fix should be the same. Until we change the ETLs, we cannot enable the vertica external auth. At least not near 4 AM GMT.
Product: mozilla.org → Data & BI Services Team
Pentaho is decommissioned, so nothing is using that any more. Tableau is using the tableau_read user and the load balancer, and things seem to be working and updating. Just in case, can you check?
Flags: needinfo?(rfradinho)
Tableau can connect to vertica.metrics.scl3.mozilla.com with no problems, so I'm going to call this issue fixed. The etl boxes no longer exist, so if they had problems, it's a moot point now.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED