Closed Bug 708452 Opened 11 years ago Closed 10 years ago

Provision a VM in PHX for Socorro team to run Hive, Pig, etc

Categories

(Mozilla Metrics :: Metrics Operations, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Unreviewed

People

(Reporter: laura, Assigned: tmary)

Details

We'd like to try out some of these tools against the Socorro secondary HBase.  This may eventually evolve into a jumphost for Platform to run queries.
Can you cc me on bug 708786?
I've done some initial set up on gsgw1000.metrics.phx1.mozilla.com.  

tmary: 

In puppet, I pointed the Hadoop client configuration of this server at phx_prd_cluster01.  It looks like Hive and Pig are installed.  Is there something specific that I can do to test to make sure that they are configured correctly?

Is there anything else that you know of that should be installed?

laura: 

1) Are there any other particular tools that you want installed?
2) Who should be given access to this server?  Is there an existing server (like sp-admin01.phx1.mozilla.com) whose user list I should match?
Depends on: 717346
No longer depends on: 717346
(bump)

:laura

1) Hive and Pig are installed on a VM for people to run tools against the Socorro secondary HBase.  Are there any other tools you want installed?

2) Who needs to be given access to this VM?
Status: NEW → ASSIGNED
Hi :cyliang,

1.  That seems like a great start.  

2.  Initial access list:
- laura
- rhelmer
- lars
- lonnen
- peterbe
- adrian
- brandonsavage
- espressive
- kairo
- dmandelin

Thanks!
:laura

I think I've added the correct accounts for everyone except kairo.  I couldn't find a Mozilla Phonebook entry for that nick and failed to find a bugzilla account with that username.  Is there any more info you can give to me so I can pinpoint the proper account name to add?
(In reply to C. Liang [:cyliang] from comment #5)
> :laura
> 
> I think I've added the correct accounts for everyone except kairo.  I
> couldn't find a Mozilla Phonebook entry for that nick and failed to find a
> bugzilla account with that username.  Is there any more info you can give to
> me so I can pinpoint the proper account name to add?

The alias kairo@mozilla.com points to rkaiser@mozilla.com - I think most of my stuff is on the much older kairo@kairo.at account, though. Or is this linked or even the same somewhere in the LDAP backend anyhow?
:kairo -- your account has been added to the server.

:laura -- unless there's anything else you can think of at the moment, I'll close this ticket on Monday.



Any account linking is done manually, by me, with the help of the Phonebook. =)

* "KaiRo" brings up the IRC entry, so it looks like IRC nicknames are searched in a case-sensitive manner.

* "kai" brings up the rkaiser entry as well as one for azakai.

* "kairo" brings up nothing, which is odd as it *does* appear in rkaiser mail and bugmail entries.  ("jono" works, for example, although that might be due to the GTalk entry.)
Closing this ticket.  If any new or additional issues come up, please open new bugs.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
I just got access to this VM but I'm not sure how to use it: has anyone else on this bug actually used pig on this machine? In particular I don't know where Java is to set JAVA_HOME.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #9)
> I just got access to this VM but I'm not sure how to use it: has anyone else
> on this bug actually used pig on this machine? In particular I don't know
> where Java is to set JAVA_HOME.

"/usr/bin/java -version" reports 1.6.0(...) so I set:
export JAVA_HOME=/usr/lib/jvm/java-1.6.0/

That seems to wfm.
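If the JVM directory is not known in advance, one way to find a candidate JAVA_HOME is to resolve the `java` binary through its symlinks and strip the trailing `/bin/java`. This is a minimal sketch, not something taken from this server's setup: the helper name `java_home_from_binary` is hypothetical, and it assumes a `readlink` that supports `-f` (GNU coreutils). The result may be a `jre` subdirectory rather than the JDK root, depending on how Java was packaged.

```shell
# Hypothetical helper: print the directory that contains bin/java
# for the given java executable, after following any symlinks.
java_home_from_binary() {
    # $1: path to a java executable, possibly a chain of symlinks
    resolved=$(readlink -f "$1") || return 1
    # Drop the trailing /bin/java to get the candidate home directory
    printf '%s\n' "${resolved%/bin/java}"
}

# Only export JAVA_HOME when a java binary is actually on the PATH
if command -v java >/dev/null 2>&1; then
    JAVA_HOME=$(java_home_from_binary "$(command -v java)")
    export JAVA_HOME
    echo "JAVA_HOME=$JAVA_HOME"
fi
```

Running `"$JAVA_HOME/bin/java" -version` afterwards is a quick check that the derived path is usable.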
Trying to run a pig job, it's able to connect to HDFS and the job tracker ok, but it's trying to use localhost for ZooKeeper:

2012-06-28 14:56:54,258 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://hp-node70.phx1.mozilla.com:8020
2012-06-28 14:56:54,458 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: hp-node70.phx1.mozilla.com:8021
(...)
2012-06-28 14:56:58,346 [Thread-4-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181

Which of course does not work:

2012-06-28 14:56:58,356 [Thread-4-SendThread(localhost:2181)] WARN  org.apache.zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)


This machine only has /etc/hadoop/, not /etc/hbase/ or /etc/zookeeper/. Could that be related?
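An HBase client that finds no hbase-site.xml on its classpath falls back to its built-in default for `hbase.zookeeper.quorum`, which is localhost. That would match the log above. The following is a rough sketch for checking whether a quorum is configured on a given machine. The function name `zk_quorum` is hypothetical, and the path `/etc/hbase/conf/hbase-site.xml` is an assumption about where the client configuration would live, not something confirmed in this bug.

```shell
# Hypothetical helper: print the value of hbase.zookeeper.quorum
# from the given hbase-site.xml, or nothing if it is not set.
zk_quorum() {
    # Fail early when the file is missing or unreadable
    [ -r "$1" ] || return 1
    # Flatten the XML to one line so the <name>/<value> pair can be matched
    tr -d '\n' < "$1" |
        sed -n 's|.*<name>hbase\.zookeeper\.quorum</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p'
}

# Assumed location of the HBase client configuration
site=/etc/hbase/conf/hbase-site.xml
quorum=$(zk_quorum "$site" || true)
if [ -n "$quorum" ]; then
    echo "ZooKeeper quorum: $quorum"
else
    echo "No quorum found in $site; the client will use its default (localhost)"
fi
```

The sed match is deliberately simple and expects the `<value>` element to directly follow the `<name>` element, which is the usual layout of Hadoop-style configuration files.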
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
This blocks correlation reports and hence migration to Django.  Can somebody please take a look?
Based on a conversation in IRC, it sounds like tmary is in the process of reworking this server (including a DNS name change).  I'm re-assigning this bug to him since there may be other configuration changes that I'm not aware of.
Assignee: cliang → tmeyarivan
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → FIXED
I'm having some issues running pig jobs on here; can we please get shell access for xstevens so he can help diagnose?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED