Include a "shutdown" script you can run from the master node of a spark cluster

NEW
Unassigned

Status

Data Platform and Tools
Spark
P3
normal
2 years ago
4 months ago

People

(Reporter: mreid, Unassigned)

Tracking

(Blocks: 1 bug)

Details

(Reporter)

Description

2 years ago
It would be nice to be able to instruct the master node of a spark cluster to kill itself after doing some work.  Then you wouldn't have to watch a job until completion, just call the "shutdown" command at the end of the job.
Yes, IIRC the node uses halt 1440 to shut the cluster down. I guess such a script would kill any halts and issue a shutdown now command.
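The idea above (cancel whatever delayed halt is pending, then shut down immediately) could be sketched roughly as follows. This is only a sketch under assumptions: it supposes the pending halt was scheduled through the standard `shutdown` command, so that `shutdown -c` cancels it; if the cluster uses a different mechanism, that process would have to be killed instead. The names `DRY_RUN`, `run`, and `shutdown_now` are hypothetical and do not come from the existing scripts.

```shell
#!/usr/bin/env bash
# Hypothetical "shutdown" helper for the master node.
# DRY_RUN defaults to 1, so loading or running this file only prints commands.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

shutdown_now() {
    # Cancel a pending delayed shutdown (assumption: it was scheduled via
    # `shutdown`); ignore the failure if nothing is scheduled.
    run sudo shutdown -c || true
    # Halt the node immediately.
    run sudo shutdown -h now
}
```

A job would call `shutdown_now` as its last step, with `DRY_RUN=0` set once the printed commands look right.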
Points: --- → 2
Priority: -- → P3
(Reporter)

Updated

7 months ago
Blocks: 1357749

Updated

5 months ago
Component: Metrics: Pipeline → Spark
Product: Cloud Services → Data Platform and Tools

Comment 2

4 months ago
This is my first post.
I would like to add this feature to the script. Please let me know where the scripts are and how to access them.
(In reply to Maggie from comment #2)
> This is my first post.
> I would like to add this feature to the script. Please let me know where the
> scripts are and how to access them.

Hi Maggie, glad to have you!

You can find our spark EMR files here: https://github.com/mozilla/emr-bootstrap-spark/

This file would probably live in a "shutdown" directory in [1]. The script will need to shut down both the master node and the executors. My first thought would be a shutdown command (like the one in our bootstrap script [2]), run via ssh from the master [3].

Do you have access to AWS to test this?

[1] https://github.com/mozilla/emr-bootstrap-spark/tree/master/ansible/files
[2] https://github.com/mozilla/emr-bootstrap-spark/blob/master/ansible/files/bootstrap/telemetry.sh#L171
[3] https://github.com/Yelp/mrjob/wiki/Accessing-Elastic-MapReduce-slave-nodes-via-SSH
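The approach suggested above (halt each executor over ssh from the master, then halt the master itself) might look something like the sketch below. It is not the project's script: `DRY_RUN`, `run`, `shutdown_workers`, and `shutdown_cluster` are hypothetical names, the worker hostnames are simply read from standard input, and it assumes the master can already ssh to the workers as the `hadoop` user (reference [3] describes one way to set that up).

```shell
#!/usr/bin/env bash
# Hypothetical cluster shutdown sketch, meant to run on the master node.
# DRY_RUN defaults to 1, so nothing is actually shut down unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Read worker hostnames from stdin, one per line, and halt each over ssh.
shutdown_workers() {
    while IFS= read -r host; do
        # Skip blank lines in the host list.
        [ -n "$host" ] || continue
        # -n keeps ssh from swallowing the rest of the host list on stdin.
        run ssh -n -o StrictHostKeyChecking=no "hadoop@$host" sudo shutdown -h now
    done
}

# Halt the workers first, then the master this script runs on.
shutdown_cluster() {
    shutdown_workers
    run sudo shutdown -h now
}
```

With a file of worker hostnames (here a hypothetical `workers.txt`), `shutdown_cluster < workers.txt` prints the commands it would run. How the list is produced is left open; parsing the output of `yarn node -list` on the master is one possibility.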

Comment 4

4 months ago
Thanks Frank. I don't have access to AWS to test this. How do I go about getting access to AWS?

Comment 5

4 months ago
Should I use my own credit card and apply for an AWS account so that I can test this out? Is this how it works, or how else do we gain access to AWS?
(In reply to Maggie from comment #5)
> Should I use my own credit card and apply for an AWS account so that I can
> test this out? Is this how it works, or how else do we gain access to AWS?

I'm not sure what your status is as a contractor. Can you talk to your manager about AWS access?

Comment 7

4 months ago
I'm just volunteering to fix this bug, as I know about ssh and scripting in Unix.
(I also had an AWS account two years ago, where I used ssh and wrote some scripts for devops testing purposes.)

Comment 8

4 months ago
I can use my personal AWS account, as long as it doesn't get charged for anything.
(In reply to Maggie from comment #8)
> I can use my personal AWS account, as long as it doesn't get charged for
> anything.

Certainly you *can* use your own, but it does cost money to launch an EMR cluster. I'm not sure that there is a straightforward way for you to work on this bug without AWS access, unfortunately. If you would like to try anyways, feel free!

Comment 10

4 months ago
I started using https://analysis.telemetry.mozilla.org/ to spin up an EMR instance.
But it is prompting for an ssh key.
Please let me know how I could add an ssh key to it.
Hi Maggie,

Good to hear you have a way to test stuff out! You can read about ssh keys at [0]. Make sure you upload your PUBLIC key to ATMO!

[0] https://mana.mozilla.org/wiki/display/SD/Publishing+you+Public+Key+to+Mozilla+Systems

Comment 12

4 months ago
I was trying to use putty-keygen to generate an ssh key.
I will try the URL above.

Comment 13

4 months ago
That worked. I added the ssh key and launched an EMR instance!

Comment 14

4 months ago
I launched EMR and I'm able to access it using ssh from the command line.

       __|  __|_  )
       _|  (     /   Amazon Linux AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-ami/2017.03-release-notes/
3 package(s) needed for security, out of 6 available
Run "sudo yum update" to apply all updates.

EEEEEEEEEEEEEEEEEEEE MMMMMMMM           MMMMMMMM RRRRRRRRRRRRRRR
E::::::::::::::::::E M:::::::M         M:::::::M R::::::::::::::R
EE:::::EEEEEEEEE:::E M::::::::M       M::::::::M R:::::RRRRRR:::::R
  E::::E       EEEEE M:::::::::M     M:::::::::M RR::::R      R::::R
  E::::E             M::::::M:::M   M:::M::::::M   R:::R      R::::R
  E:::::EEEEEEEEEE   M:::::M M:::M M:::M M:::::M   R:::RRRRRR:::::R
  E::::::::::::::E   M:::::M  M:::M:::M  M:::::M   R:::::::::::RR
  E:::::EEEEEEEEEE   M:::::M   M:::::M   M:::::M   R:::RRRRRR::::R
  E::::E             M:::::M    M:::M    M:::::M   R:::R      R::::R
  E::::E       EEEEE M:::::M     MMM     M:::::M   R:::R      R::::R
EE:::::EEEEEEEE::::E M:::::M             M:::::M   R:::R      R::::R
E::::::::::::::::::E M:::::M             M:::::M RR::::R      R::::R
EEEEEEEEEEEEEEEEEEEE MMMMMMM             MMMMMMM RRRRRRR      RRRRRR

-bash-4.2$

Comment 15

4 months ago
-bash-4.2$ emr list-clusters
-bash: emr: command not found
-bash-4.2$

Comment 16

4 months ago
I am trying to run this command and getting errors.
I'm not sure if I am executing it from the right place.
I tried it in the Git Bash console as well as in the EMR console; both raise an error.

emr --describe j-JOBFLOWID | grep MasterPublicDnsName | cut -d'"' -f4 ec2-54-218-87-10.us-west-2.compute.amazonaws.com


 emr --describe j-JOBFLOWID | grep MasterPublicDnsName | cut -d'"' -f4 hadoop@ec2-54-218-87-10.us-west-2.compute.amazonaws.com



-bash-4.2$ emr list-clusters
-bash: emr: command not found
-bash-4.2$
-bash-4.2$ emr --describe j-JOBFLOWID | grep MasterPublicDnsName | cut -d'"' -f4 hadoop@ec2-54-218-87-10.us-west-2.compute.amazonaws.com
-bash: emr: command not found
cut: hadoop@ec2-54-218-87-10.us-west-2.compute.amazonaws.com: No such file or directory
-bash-4.2$ emr --describe j-JOBFLOWID | grep MasterPublicDnsName | cut -d'"' -f4 ec2-54-218-87-10.us-west-2.compute.amazonaws.com
-bash: emr: command not found
cut: ec2-54-218-87-10.us-west-2.compute.amazonaws.com: No such file or directory
-bash-4.2$

Comment 17

4 months ago
I guess this is the right command (without the username hadoop in it):
emr --describe j-JOBFLOWID | grep MasterPublicDnsName | cut -d'"' -f4 ec2-54-218-87-10.us-west-2.compute.amazonaws.com

and it still raises an error.
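The two error messages in the sessions above seem to have separate causes. "emr: command not found" means no `emr` executable is installed on that machine. The `cut: ...: No such file or directory` message appears because the trailing hostname is passed to `cut` as a file name; presumably the hostname in the referenced wiki page is the output of the pipeline rather than an argument to it. The extraction part can be tried without any EMR tooling. In the sketch below, the sample line is an assumption about what a line containing `MasterPublicDnsName` looks like, and `extract_master_dns` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Pull the 4th double-quote-delimited field from the line that mentions
# MasterPublicDnsName. Input comes from stdin; no file argument is given to cut.
extract_master_dns() {
    grep MasterPublicDnsName | cut -d'"' -f4
}

# Assumed shape of the relevant line in the describe output.
sample='    "MasterPublicDnsName": "ec2-54-218-87-10.us-west-2.compute.amazonaws.com",'

# Prints: ec2-54-218-87-10.us-west-2.compute.amazonaws.com
printf '%s\n' "$sample" | extract_master_dns
```

Splitting on `"` gives the leading whitespace as field 1, the key as field 2, the `: ` separator as field 3, and the hostname as field 4, which is why `-f4` is used.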

Comment 18

4 months ago
thanks Frank!

Comment 19

4 months ago
My Jupyter notebook appears to be up and running at this URL: http://localhost:8888/tree

This is what I have tried so far to set this up:
bash-4.2$ aws emr --describe j-JOBFLOWID | grep MasterPublicDnsName | cut -d'"' -f4 ec2-54-218-87-10.us-west-2.compute.amazonaws.com
cut: ec2-54-218-87-10.us-west-2.compute.amazonaws.com: No such file or directory
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]

AWS region and master node name:
I was able to configure the AWS EMR region (it may have to go in a config file for aws emr).
Next I will have to use list-clusters to get the master node name / DNS.

But list-clusters raises a permission error. This is where I am now:

-bash-4.2$ aws configure set region us-east-1
-bash-4.2$
-bash-4.2$ aws emr list-clusters

An error occurred (AccessDeniedException) when calling the ListClusters operation: User: arn:aws:sts::927034868273:assumed-role/telemetry-spark-cloudformation-TelemetrySparkRole-RHL50R5U270K/i-00649beb5c420a252 is not authorized to perform: elasticmapreduce:ListClusters
-bash-4.2$
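For reference, with the AWS CLI the master's DNS name can be requested directly through `describe-cluster` and a `--query` expression, without `grep` and `cut`. The sketch below only wraps that call; `master_dns` is a hypothetical helper name and `j-JOBFLOWID` stays a placeholder. It needs credentials that are allowed to call `elasticmapreduce:DescribeCluster`, which the instance role shown in the error above may well lack, so it would likely have to be run from a machine with suitable AWS credentials.

```shell
#!/usr/bin/env bash
# Print the master node's public DNS name for the given cluster id.
# Requires the AWS CLI and permission for elasticmapreduce:DescribeCluster.
master_dns() {
    aws emr describe-cluster \
        --cluster-id "$1" \
        --query 'Cluster.MasterPublicDnsName' \
        --output text
}

# Example (placeholder cluster id):
#   master_dns j-JOBFLOWID
```

When the script already runs on the master node, its own hostname (for example from `hostname -f`) may be enough, and no EMR API call is needed.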