Which of the following is the best and cheapest solution for the given requirement?
1. You ask your development team to create a 100-node EMR cluster every day, whenever it is needed, using the AWS console.
2. You ask your development team to write a script that creates a 100-node EMR cluster using the EMR API.
3. You ask your development team to write a script that creates a 100-node EMR cluster using EMR commands (AWS CLI).
4. You ask AWS Support to provide a 100-node EMR cluster for one hour every day.
Correct Answer: 2
Explanation: This question tests a hidden requirement. Once the job on the 100-node EMR cluster completes, the cluster should be terminated so that it does not incur unnecessary charges
for the remaining 23 hours x 100 nodes. Terminating it saves a large amount of money.
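The size of that saving can be made concrete with a little arithmetic. The sketch below counts node-hours only. It assumes the daily job needs the cluster for about one hour, and the price per node-hour is a made-up placeholder, not an actual AWS rate.

```python
NODES = 100
HOURS_PER_DAY = 24
JOB_HOURS = 1  # assumption: the daily job needs the cluster for about one hour

# Hypothetical price per node-hour, for illustration only (not a real AWS rate).
ASSUMED_PRICE_PER_NODE_HOUR = 0.25


def node_hours(nodes: int, hours: int) -> int:
    """Node-hours consumed when `nodes` nodes run for `hours` hours."""
    return nodes * hours


always_on = node_hours(NODES, HOURS_PER_DAY)  # cluster left running all day
terminated = node_hours(NODES, JOB_HOURS)     # cluster terminated after the job
avoided = always_on - terminated              # the "23 hours x 100 nodes"

print(f"Left running:        {always_on} node-hours per day")
print(f"Terminated after job: {terminated} node-hours per day")
print(f"Avoided:             {avoided} node-hours per day")
print(f"Avoided cost at the assumed rate: {avoided * ASSUMED_PRICE_PER_NODE_HOUR:.2f} per day")
```

Whatever the real hourly rate is, the idle node-hours outnumber the useful ones 23 to 1, which is why automatic termination matters more than any other detail in the options.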
From the given options, you need to choose the one that fulfills the requirement with minimal effort.
Option-1: Creating the EMR cluster through the AWS console requires the development team to be involved every day to launch and terminate the cluster. Once the cluster is terminated, its nodes incur no further
charges. However, the daily manual work makes this far from an ideal solution.
Option-3: Creating the EMR cluster with AWS CLI commands likewise requires the development team to be involved every day to launch and terminate the cluster, because a cluster created this way keeps running until it is
shut down unless auto-termination is explicitly enabled. Once the cluster is terminated, its nodes incur no further charges. However, this is not an ideal solution either.
Option-4: There is no reason to involve the AWS Support team in creating an EMR cluster, and AWS Support does not offer a service that provides an EMR cluster for one hour every day. They can help you create such a
cluster, but they will not provide one on a daily basis.
Option-2: By default, clusters that you create using the console or the AWS CLI continue to run until you shut them down. To have a cluster terminate after running its steps, you need to enable auto-termination. In
contrast, clusters that you launch using the EMR API have auto-termination enabled by default.
Because a cluster launched through the EMR API has auto-termination enabled by default, it terminates on its own as soon as the job finishes, with no daily manual work.
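As an illustration of Option-2, here is a minimal sketch of such a script using boto3, the AWS SDK for Python, which calls the EMR RunJobFlow API. The cluster name, release label, instance type, S3 path and region are hypothetical placeholders, and the role names assume the default EMR roles already exist in the account. KeepJobFlowAliveWhenNoSteps is set to False explicitly so that the auto-termination behaviour is visible in the request rather than left to the default.

```python
def build_run_job_flow_request(name="daily-batch-job", node_count=100):
    """Build keyword arguments for the EMR RunJobFlow API call.

    All concrete values below are illustrative assumptions.
    """
    return {
        "Name": name,
        "ReleaseLabel": "emr-6.15.0",  # assumption: any supported release works
        "Instances": {
            "InstanceGroups": [
                {
                    "Name": "Primary",
                    "InstanceRole": "MASTER",
                    "InstanceType": "m5.xlarge",  # assumption
                    "InstanceCount": 1,
                },
                {
                    "Name": "Core",
                    "InstanceRole": "CORE",
                    "InstanceType": "m5.xlarge",  # assumption
                    "InstanceCount": node_count - 1,
                },
            ],
            # Terminate the cluster once all steps have finished.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [
            {
                "Name": "daily-job",
                "ActionOnFailure": "TERMINATE_CLUSTER",
                "HadoopJarStep": {
                    # Hypothetical location of the job's JAR file.
                    "Jar": "s3://example-bucket/jobs/daily-job.jar",
                },
            }
        ],
        "JobFlowRole": "EMR_EC2_DefaultRole",  # assumes default roles exist
        "ServiceRole": "EMR_DefaultRole",
    }


def launch_cluster(region="us-east-1"):
    """Submit the request to EMR and return the new cluster's ID."""
    import boto3  # imported here so building the request needs no AWS SDK

    emr = boto3.client("emr", region_name=region)
    response = emr.run_job_flow(**build_run_job_flow_request())
    return response["JobFlowId"]
```

Calling launch_cluster() from a daily scheduler (cron, for example) would start the cluster, run the step, and let EMR terminate the cluster when the step completes, so nobody on the development team has to start or stop anything by hand.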