efficiently so that the processing engine can take advantage of that storage. The data volume runs into gigabytes. Which of the following is an ideal solution for the given requirement?
1. You can use an AWS Redshift cluster.
2. You can use AWS S3.
3. You can use AWS DynamoDB.
4. You can use AWS RDS.
5. You can use EMR with HBase.
Correct Answer: 5. Explanation: Apache HBase, a Hadoop NoSQL database, offers the following benefits:
- Efficient storage of sparse data: Apache HBase provides fault-tolerant storage for large quantities of sparse data using column-based compression. It is capable of storing and processing billions of rows and millions of columns per row.
- Store for high-frequency counters: Apache HBase is suitable for tasks such as high-speed counter aggregation because of its consistent reads and writes.
- High write throughput and update rates: Apache HBase supports low-latency lookups and range scans, efficient updates and deletions of individual records, and high write throughput.
- Support for multiple Hadoop jobs: The Apache HBase data store allows data to be used by one or more Hadoop jobs on a single cluster or across multiple Hadoop clusters.
Based on this, we can say that option 5 is correct. All the other options are also storage solutions, but they do not satisfy the requirement given in the question.
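To make the benefits listed in the explanation more concrete, here is a minimal in-memory sketch in Python of a sparse, row-key-sorted table with counters and range scans. It is only a toy model and does not use the HBase API. The class name `SparseTable`, its methods, and the row keys and column names are all hypothetical and chosen for illustration.

```python
from bisect import bisect_left, insort


class SparseTable:
    """Toy model of a sparse wide-column table (hypothetical, not the HBase API)."""

    def __init__(self):
        self._rows = {}  # row key -> {column: value}; absent cells take no space
        self._keys = []  # row keys kept sorted so range scans stay cheap

    def put(self, row_key, column, value):
        # Insert or update a single cell without touching the rest of the row.
        if row_key not in self._rows:
            insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value

    def get(self, row_key, column, default=None):
        # Point lookup of one cell; missing cells return the default.
        return self._rows.get(row_key, {}).get(column, default)

    def increment(self, row_key, column, amount=1):
        # Counter update: read the current value, add to it, write it back.
        new_value = self.get(row_key, column, 0) + amount
        self.put(row_key, column, new_value)
        return new_value

    def scan(self, start_key, stop_key):
        # Range scan over start_key <= row key < stop_key, in sorted key order.
        lo = bisect_left(self._keys, start_key)
        hi = bisect_left(self._keys, stop_key)
        for key in self._keys[lo:hi]:
            yield key, dict(self._rows[key])


table = SparseTable()
table.put("user#001", "profile:name", "Ana")            # this row has no email cell
table.put("user#002", "profile:email", "b@example.com")  # this row has no name cell
table.increment("page#home", "stats:views")
table.increment("page#home", "stats:views", 4)
scanned = list(table.scan("user#", "user#~"))
```

Each row stores only the cells that were actually written, which is the idea behind sparse storage. Keeping row keys sorted is what makes a prefix-style range scan inexpensive. A real HBase cluster adds persistence, distribution across region servers, and atomic server-side increments, none of which this sketch attempts.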