Question 45: You have an application that needs to consume data from various AWS data services such as EMR, RDS, Redshift, and S3. You need a single, unified metadata repository for all the data received from these sources in their different formats. Which of the following services would help meet this requirement?

1. AWS Glue Data Catalog

2. Hive MetaStore

3. AWS MySQL RDS

4. AWS Connect

Correct Answer: 1. Exp: Use the AWS Glue Data Catalog when you require a persistent metastore or a metastore shared by different clusters, services, applications, or AWS accounts.

AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it simple and cost-effective to categorize your data, clean it, enrich it, and move it reliably between various data stores. The AWS Glue Data Catalog provides a unified metadata repository across a variety of data sources and data formats, integrating with Amazon EMR as well as Amazon RDS, Amazon Redshift, Redshift Spectrum, Athena, and any application compatible with the Apache Hive metastore. AWS Glue crawlers can automatically infer schema from source data in Amazon S3 and store the associated metadata in the Data Catalog.
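
To make the Hive metastore compatibility concrete, here is a minimal sketch of the EMR configuration that points Hive (and optionally Spark SQL) at the Glue Data Catalog instead of a cluster-local metastore. The helper name glue_metastore_configurations and its include_spark parameter are hypothetical, chosen for illustration; the classification names and the factory class property are the settings EMR uses for this purpose. The resulting list could be supplied as the Configurations value when creating a cluster, which this sketch does not do.

```python
import json

# Factory class that directs a Hive-compatible client to the AWS Glue Data Catalog.
GLUE_FACTORY = (
    "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
)


def glue_metastore_configurations(include_spark=True):
    """Build EMR configuration objects that use Glue as the metastore.

    Hypothetical helper for illustration: it only builds the data
    structure and makes no AWS calls.
    """
    props = {"hive.metastore.client.factory.class": GLUE_FACTORY}
    # "hive-site" covers Hive; each entry gets its own copy of the properties.
    configs = [{"Classification": "hive-site", "Properties": dict(props)}]
    if include_spark:
        # "spark-hive-site" applies the same metastore setting to Spark SQL.
        configs.append(
            {"Classification": "spark-hive-site", "Properties": dict(props)}
        )
    return configs


if __name__ == "__main__":
    print(json.dumps(glue_metastore_configurations(), indent=2))
```

Because every cluster configured this way reads table definitions from the same catalog, the metadata outlives any single cluster, which is the "persistent, shared metastore" property the answer relies on.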
