To get information about an existing EMR cluster, we can use

PS S:\> Get-EMRCluster -ClusterId $ClusterId

The command returns an object of type Amazon.ElasticMapReduce.Model.Cluster.

The Cluster object provides the following attributes that may be useful:

  • MasterPublicDnsName. The DNS name of the master node.
  • NormalizedInstanceHours. An approximation of the cost of the cluster.
  • ReleaseLabel. The release label of Amazon EMR.
  • Status. The current status details about the cluster.
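
As a minimal sketch of reading these attributes, the snippet below gathers them into one flat object. Get-ClusterSummary is a hypothetical helper name introduced here for illustration, not part of the AWS Tools for PowerShell. It assumes the object passed in exposes the properties listed above, and that the state string is available as Status.State.

```powershell
# Hypothetical helper (not an AWS cmdlet): collect the attributes listed above
# into a single object that is easy to print or export.
function Get-ClusterSummary {
    param(
        [Parameter(Mandatory = $true)]
        $Cluster   # assumed to be the object returned by Get-EMRCluster
    )

    [PSCustomObject]@{
        MasterPublicDnsName     = $Cluster.MasterPublicDnsName
        NormalizedInstanceHours = $Cluster.NormalizedInstanceHours
        ReleaseLabel            = $Cluster.ReleaseLabel
        State                   = $Cluster.Status.State   # assumption: state lives under Status
    }
}

# Example call against a live cluster (needs configured AWS credentials and region):
# Get-ClusterSummary -Cluster (Get-EMRCluster -ClusterId $ClusterId)
```

Wrapping the property reads in a function keeps the live AWS call separate, so the same helper can be tried against a stand-in object first.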

Another cmdlet, Get-EMRInstanceGroupList, returns the cluster's instance group information:

PS S:\> Get-EMRInstanceGroupList -ClusterId $ClusterId

This cmdlet returns Amazon.ElasticMapReduce.Model.InstanceGroup objects, one per instance group in the cluster. We can then use the results to extract information such as InstanceType and RunningInstanceCount for each instance group.

The following are some attributes we may use:

  • BidPrice. The maximum Spot price you are willing to pay for EC2 instances.
  • InstanceGroupType. The type of the instance group. Valid values are MASTER, CORE, or TASK.
  • InstanceType. The EC2 instance type for all instances in the instance group.
  • RunningInstanceCount. The number of instances currently running in this instance group.
  • Status. The current status of the instance group.
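
To view these attributes side by side for every instance group, something like the following could be used. Get-InstanceGroupSummary is a hypothetical helper written for this post, not an AWS cmdlet. It assumes each object carries the properties listed above, and that the state string sits under Status.State.

```powershell
# Hypothetical helper (not an AWS cmdlet): project each instance group
# onto the handful of attributes discussed above.
function Get-InstanceGroupSummary {
    param(
        [Parameter(Mandatory = $true)]
        [object[]] $InstanceGroup   # assumed output of Get-EMRInstanceGroupList
    )

    $InstanceGroup | Select-Object InstanceGroupType, InstanceType, RunningInstanceCount,
        @{ Name = 'State'; Expression = { $_.Status.State } }   # assumption: state under Status
}

# Example call against a live cluster (needs configured AWS credentials and region):
# Get-InstanceGroupSummary -InstanceGroup (Get-EMRInstanceGroupList -ClusterId $ClusterId) | Format-Table
```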

For example, if we want to know the instance type and running instance count for the CORE node instance group, we can start with

$instanceGroup = Get-EMRInstanceGroupList -ClusterId $ClusterId
write-host $instanceGroup.InstanceType
# r5.4xlarge r5.4xlarge r5.4xlarge

write-host $instanceGroup.RunningInstanceCount
# 40 30 1

The output contains three values, one per instance group; in this example the order is TASK, CORE, and MASTER. To get the instance type and running instance count for the CORE node instance group only, we therefore use index 1:

write-host $instanceGroup.InstanceType[1]
# r5.4xlarge

write-host $instanceGroup.RunningInstanceCount[1]
# 30
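
Indexing by position relies on the groups coming back in the same order every time. As a hedged alternative, assuming the order could differ between clusters or calls, the sketch below selects a group by its InstanceGroupType instead. Get-InstanceGroupByType is a hypothetical helper name, not an AWS cmdlet.

```powershell
# Hypothetical helper (not an AWS cmdlet): pick instance groups by type
# rather than by their position in the returned list.
function Get-InstanceGroupByType {
    param(
        [Parameter(Mandatory = $true)]
        [object[]] $InstanceGroup,   # assumed output of Get-EMRInstanceGroupList

        [Parameter(Mandatory = $true)]
        [ValidateSet('MASTER', 'CORE', 'TASK')]
        [string] $Type
    )

    # Cast to string so the comparison works whether the property is a plain
    # string or an SDK constant object.
    $InstanceGroup | Where-Object { [string]$_.InstanceGroupType -eq $Type }
}

# Example call against a live cluster (needs configured AWS credentials and region):
# $core = Get-InstanceGroupByType -InstanceGroup (Get-EMRInstanceGroupList -ClusterId $ClusterId) -Type CORE
# write-host $core.InstanceType
# write-host $core.RunningInstanceCount
```

A cluster may have more than one TASK instance group, in which case the filter returns several objects rather than one.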