perspective, and the job must be finished within an hour. Which of the options below is the best solution for the given requirement?
A. Bid the lowest price each day and create the cluster using Spot Instances.
B. To save cost, reserve instances for the maximum possible period and create the EMR cluster using these reserved nodes.
C. Use the instance fleet configuration to create the EMR cluster.
D. Always use 100 Reserved Instances and 100 Spot Instances, so that the average cost is maintained.
E. Use a combination of On-Demand and Spot Instances for core and task nodes.
1. A,B
2. B,C
3. C,D
4. C,E
5. A,E
Correct Answer : 4 Exp : The cluster must be set up so that cost is minimal, and it needs to run for only one hour each day. Reserving instances for such a short daily run would unnecessarily increase
the cost. Hence, option B cannot be correct.
You cannot count on obtaining that many instances at the lowest bid price every day. Hence, option A is not an ideal solution. Similarly, for option D, reserving 100
instances would increase the cost. Hence, option D cannot be correct.
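The cost argument against options B and D can be checked with a little arithmetic. The sketch below is illustrative only and rests on a simplified billing model that is an assumption, not taken from AWS pricing: a reservation is paid for all 24 hours of every day at a discounted rate, while On-Demand capacity is paid only for the hours used. The function names and the 60% discount are hypothetical.

```python
def break_even_discount(hours_used_per_day: float) -> float:
    """Discount (0..1) at which a reservation costs the same as On-Demand.

    Assumed model: reserved cost per day = 24 * (1 - discount) * rate,
    On-Demand cost per day = hours_used_per_day * rate.
    """
    if not 0 <= hours_used_per_day <= 24:
        raise ValueError("hours_used_per_day must be between 0 and 24")
    return 1 - hours_used_per_day / 24


def reserved_is_cheaper(discount: float, hours_used_per_day: float) -> bool:
    """True if the reservation is cheaper than On-Demand under the same model."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    # Costs are expressed in units of the On-Demand hourly rate.
    reserved_daily = 24 * (1 - discount)
    on_demand_daily = hours_used_per_day
    return reserved_daily < on_demand_daily


if __name__ == "__main__":
    # Hypothetical 60% reservation discount, job running one hour per day.
    print(break_even_discount(1))
    print(reserved_is_cheaper(0.6, 1))
```

For a job that runs one hour a day, the break-even discount is 1 - 1/24, roughly 95.8%, so under this model a reservation pays off only if it is discounted by more than that.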
The remaining options are C and E. Consider the following concept from the AWS documentation:
With the instance fleet configuration for an EMR cluster, you can choose among a variety of EC2 provisioning options, such as the target capacity for On-Demand Instances and for Spot Instances in each fleet. When the cluster
launches, EMR provisions instances until the specified targets are fulfilled. You can specify up to five EC2 instance types per fleet for EMR to use when fulfilling the targets. You can also select multiple subnets for
different Availability Zones. When Amazon EMR launches the cluster, it looks across those subnets to find the instances and purchasing options you specify.
While a cluster is running, if Amazon EC2 reclaims a Spot Instance because of a price increase, or an instance fails, Amazon EMR tries to replace the instance with any of the instance types that you specify. This
makes it easier to regain capacity during a spike in Spot pricing. Instance fleets allow you to develop a flexible and elastic resourcing strategy for each node type. For example, within specific fleets, you can have
a core of On-Demand capacity supplemented with less-expensive Spot capacity if available, and then switch to On-Demand capacity if Spot isn't available at your price.
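As a rough illustration of the fleet concept described above, the sketch below builds a plain dictionary shaped like the `Instances` argument that boto3's EMR `run_job_flow` call accepts. It makes no AWS call. The helper names, instance types, subnet IDs, and target capacities are all hypothetical choices for this example, and the five-type limit is the one stated in the text.

```python
from typing import Any, Dict, List

MAX_TYPES_PER_FLEET = 5  # per-fleet limit as stated in the text above


def build_fleet(fleet_type: str, instance_types: List[str],
                on_demand_target: int, spot_target: int) -> Dict[str, Any]:
    """Build one instance fleet definition (one fleet per node type)."""
    if fleet_type not in ("MASTER", "CORE", "TASK"):
        raise ValueError("fleet_type must be MASTER, CORE or TASK")
    if not 1 <= len(instance_types) <= MAX_TYPES_PER_FLEET:
        raise ValueError("a fleet needs between 1 and 5 instance types")
    return {
        "InstanceFleetType": fleet_type,
        "TargetOnDemandCapacity": on_demand_target,
        "TargetSpotCapacity": spot_target,
        # Each type counts as 1 unit toward the targets in this sketch.
        "InstanceTypeConfigs": [
            {"InstanceType": t, "WeightedCapacity": 1} for t in instance_types
        ],
    }


def build_instances(subnet_ids: List[str]) -> Dict[str, Any]:
    """Combine On-Demand and Spot capacity across core and task fleets."""
    return {
        # Several subnets let EMR pick the best-fit Availability Zone.
        "Ec2SubnetIds": subnet_ids,
        "InstanceFleets": [
            build_fleet("MASTER", ["m5.xlarge"], 1, 0),  # master target is one
            build_fleet("CORE", ["m5.xlarge", "m4.xlarge"], 20, 30),
            build_fleet("TASK", ["m5.xlarge", "m4.xlarge", "r5.xlarge"], 0, 50),
        ],
    }


if __name__ == "__main__":
    # Placeholder subnet IDs; real ones come from your own VPC.
    print(build_instances(["subnet-aaaa1111", "subnet-bbbb2222"]))
```

In a real launch this dictionary would be one of several arguments to `run_job_flow`; the other required settings (release label, roles, steps) are omitted here.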
Summary of Key Features:
One instance fleet, and only one, per node type (master, core, task). Up to five EC2 instance types specified for each fleet.
Amazon EMR chooses any or all of the five EC2 instance types to provision with both Spot and On-Demand purchasing options.
Establish target capacities for Spot and On-Demand Instances for the core fleet and task fleet. Use vCPU or a generic unit assigned to each EC2 instance that counts toward the targets. Amazon EMR provisions instances
until each target capacity is totally fulfilled. For the master fleet, the target is always one.
Choose one subnet (Availability Zone) or a range. Amazon EMR provisions capacity in the Availability Zone that is the best fit.
When you specify a target capacity for Spot Instances:
For each instance type, specify a maximum Spot price. Amazon EMR provisions Spot Instances if the Spot price is below the maximum Spot price. You pay the Spot price, not necessarily the maximum Spot price.
Optionally, specify a defined duration (also known as a Spot block) for each fleet. Spot Instances terminate only after the defined duration expires.
For each fleet, define a timeout period for provisioning Spot Instances. If Amazon EMR can't provision Spot capacity, you can terminate the cluster or switch to provisioning On-Demand capacity instead.
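The Spot-related settings in the list above can be sketched as plain dictionaries that follow the key names used by boto3's EMR instance fleet configuration (`WeightedCapacity`, `BidPriceAsPercentageOfOnDemandPrice`, `SpotSpecification`, `TimeoutDurationMinutes`, `TimeoutAction`, `BlockDurationMinutes`). The helper names and every numeric value are hypothetical, and the ceiling calculation is an assumed reading of "provisions instances until each target capacity is totally fulfilled".

```python
import math
from typing import Any, Dict, Optional


def instances_needed(target_capacity: int, weighted_capacity: int) -> int:
    """Instances of one type required to reach a target, given its unit weight."""
    if target_capacity < 0 or weighted_capacity <= 0:
        raise ValueError("target must be >= 0 and weight must be > 0")
    # Round up: provisioning continues until the target is fully covered.
    return math.ceil(target_capacity / weighted_capacity)


def spot_type_config(instance_type: str, weight: int,
                     max_price_pct: float) -> Dict[str, Any]:
    """Per-type config with a maximum Spot price as a % of the On-Demand price."""
    return {
        "InstanceType": instance_type,
        "WeightedCapacity": weight,
        "BidPriceAsPercentageOfOnDemandPrice": max_price_pct,
    }


def spot_launch_spec(timeout_minutes: int, fall_back_to_on_demand: bool,
                     block_minutes: Optional[int] = None) -> Dict[str, Any]:
    """Per-fleet Spot settings: provisioning timeout, timeout action, optional block."""
    spec: Dict[str, Any] = {
        "TimeoutDurationMinutes": timeout_minutes,
        "TimeoutAction": ("SWITCH_TO_ON_DEMAND" if fall_back_to_on_demand
                          else "TERMINATE_CLUSTER"),
    }
    if block_minutes is not None:
        spec["BlockDurationMinutes"] = block_minutes  # defined duration
    return {"SpotSpecification": spec}


if __name__ == "__main__":
    print(instances_needed(100, 4))
    print(spot_type_config("m5.xlarge", 4, 50.0))
    print(spot_launch_spec(10, fall_back_to_on_demand=True))
```

Choosing `SWITCH_TO_ON_DEMAND` as the timeout action suits a job with a hard one-hour deadline, since the cluster still gets capacity when Spot is unavailable at the chosen price.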
Hence, options C and E are correct.