Question-16: Can you run Hive queries on an HBase NoSQL database?
Answer: Yes. HBase is a NoSQL database that supports real-time read/write access to large datasets in HDFS. Hive can query HBase through its HBase storage handler, which maps a Hive table onto an HBase table so that normal Hive queries run against it.
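As a rough illustration of that mapping, the sketch below builds the HiveQL statement that exposes an HBase table to Hive. The storage handler class and the `hbase.columns.mapping` / `hbase.table.name` properties come from Hive's HBase integration; the helper function and the table, column, and column-family names (`hive_users`, `users`, `info:name`, and so on) are hypothetical and only there for the example.

```python
# Sketch: build the HiveQL DDL that maps a Hive table onto an existing HBase table.
# The helper and all table/column names used below are hypothetical.

def hbase_backed_table_ddl(hive_table, hbase_table, key_column, columns):
    """Return a CREATE EXTERNAL TABLE statement using Hive's HBase storage handler.

    key_column: (hive_column, hive_type) mapped to the HBase row key.
    columns:    list of (hive_column, hive_type, "family:qualifier") tuples.
    """
    col_defs = [f"{key_column[0]} {key_column[1]}"]
    mapping = [":key"]  # ":key" binds the first Hive column to the HBase row key
    for name, hive_type, hbase_col in columns:
        col_defs.append(f"{name} {hive_type}")
        mapping.append(hbase_col)
    # The mapping string must not contain whitespace between entries.
    return (
        f"CREATE EXTERNAL TABLE {hive_table} ({', '.join(col_defs)})\n"
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\n"
        f"WITH SERDEPROPERTIES ('hbase.columns.mapping' = '{','.join(mapping)}')\n"
        f"TBLPROPERTIES ('hbase.table.name' = '{hbase_table}')"
    )


ddl = hbase_backed_table_ddl(
    "hive_users",
    "users",
    ("user_id", "STRING"),
    [("name", "STRING", "info:name"), ("age", "INT", "info:age")],
)
print(ddl)
```

Once a statement like this has been run in Hive, the table can be queried like any other Hive table, for example with a plain SELECT on `hive_users`.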
Question-17: Can we use Amazon RDS (Relational Database Service) as the Hive Metastore database?
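Answer: Yes, we can. Amazon RDS is a managed service for relational databases such as Oracle and MySQL, and the Hive Metastore keeps its metadata in a relational database, so an RDS instance can serve as that backing database.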
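To make this concrete, here is a minimal sketch that generates the metastore connection entries for hive-site.xml. The four `javax.jdo.option.*` property names are the standard Hive Metastore JDBC settings; the RDS endpoint, database name, user, password, and the choice of a MySQL driver class are all placeholder assumptions, not values from any real setup.

```python
import xml.etree.ElementTree as ET


def metastore_site_xml(jdbc_url, driver, user, password):
    """Return a hive-site.xml fragment pointing the metastore at a JDBC database."""
    props = {
        "javax.jdo.option.ConnectionURL": jdbc_url,
        "javax.jdo.option.ConnectionDriverName": driver,
        "javax.jdo.option.ConnectionUserName": user,
        "javax.jdo.option.ConnectionPassword": password,
    }
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical endpoint and credentials; replace with the real RDS instance details.
xml_text = metastore_site_xml(
    "jdbc:mysql://example-host.rds.amazonaws.com:3306/hive_metastore",
    "com.mysql.jdbc.Driver",  # assumes a MySQL-backed RDS instance
    "hive_user",
    "change-me",
)
print(xml_text)
```

The matching JDBC driver jar also has to be available on Hive's classpath for the metastore to connect.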
Question-18: What is Impala?
Answer: Impala is a fast query engine for running interactive queries on data stored in HDFS, HBase, or AWS S3. Impala uses largely the same query syntax as Hive.
Question-19: Is Impala a replacement for Hive?
Answer: No. Impala is better seen as an additional tool for querying big data. Impala is better suited to interactive queries, while Hive is better suited to batch processing such as ETL.
Question-20: What are the advantages of Impala over other existing BI/reporting tools?
Answer: Following are the benefits of Impala:
- Its query syntax is the same familiar SQL syntax.
- It can query large volumes of data on Hadoop.
- It runs queries in a distributed fashion for high performance.
- It fits well with Hive, because Impala can read from and write to Hive tables.