Question-16: Can you run Hive queries on an HBase NoSQL database?

Answer: Yes. HBase is a NoSQL database that supports real-time read/write access to large datasets stored in HDFS. Hive can query HBase tables through its HBase storage handler, which maps a Hive table onto an HBase table.
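As a rough illustration of how an HBase table is exposed to Hive, the sketch below builds a HiveQL CREATE EXTERNAL TABLE statement that uses the HBase storage handler. The table names, column names, and the helper function itself are hypothetical and chosen only for this example; the statement would still have to be run in Hive against a cluster where the HBase table already exists.

```python
def hbase_backed_table_ddl(hive_table, hbase_table, key_column, column_mapping):
    """Build HiveQL that maps a Hive table onto an existing HBase table.

    column_mapping maps a Hive column name to (Hive type, "family:qualifier").
    All names passed in are illustrative, not taken from a real schema.
    """
    columns = [f"{key_column} STRING"]
    mapping = [":key"]  # the first Hive column maps to the HBase row key
    for name, (hive_type, hbase_column) in column_mapping.items():
        columns.append(f"{name} {hive_type}")
        mapping.append(hbase_column)

    column_list = ", ".join(columns)
    mapping_list = ",".join(mapping)
    return (
        f"CREATE EXTERNAL TABLE {hive_table} ({column_list})\n"
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\n"
        f"WITH SERDEPROPERTIES ('hbase.columns.mapping' = '{mapping_list}')\n"
        f"TBLPROPERTIES ('hbase.table.name' = '{hbase_table}')"
    )


# Hypothetical HBase table "users" with one column family "info".
ddl = hbase_backed_table_ddl(
    "hive_users",
    "users",
    "user_id",
    {"name": ("STRING", "info:name"), "age": ("INT", "info:age")},
)
print(ddl)
```

Once such a table is defined, ordinary Hive queries (for example a SELECT with a WHERE clause) can be run against it like any other Hive table.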

 

Question-17: Can we use Amazon RDS (Relational Database Service) as Hive Metastore?

Answer: Yes, we can. Amazon RDS is a service that provides managed RDBMS instances such as Oracle and MySQL, and the Hive Metastore can use such a database, reached over JDBC, as its backing store.
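To make this concrete, here is a minimal sketch that generates the metastore connection properties for hive-site.xml. The four javax.jdo.option.* property names are the standard Hive metastore connection settings; the endpoint, database name, user, and password are placeholders, and the choice of a MySQL instance with the com.mysql.cj.jdbc.Driver class is an assumption made for the example.

```python
import xml.etree.ElementTree as ET


def metastore_site_xml(jdbc_url, driver, user, password):
    """Return a hive-site.xml fragment pointing the metastore at a JDBC database."""
    settings = {
        "javax.jdo.option.ConnectionURL": jdbc_url,
        "javax.jdo.option.ConnectionDriverName": driver,
        "javax.jdo.option.ConnectionUserName": user,
        "javax.jdo.option.ConnectionPassword": password,
    }
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values: substitute the real RDS endpoint and credentials.
site_xml = metastore_site_xml(
    "jdbc:mysql://example-rds-endpoint:3306/hive_metastore",
    "com.mysql.cj.jdbc.Driver",
    "hive",
    "change-me",
)
print(site_xml)
```

The matching JDBC driver JAR also has to be available on Hive's classpath, and the RDS instance must be reachable from the cluster's network.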

 

Question-18: What is Impala?

Answer: Impala is a fast query engine for running interactive queries on data stored in HDFS, HBase, or AWS S3. Impala uses largely the same query syntax as Hive.

 

Question-19: Is Impala a replacement for Hive?

Answer: No. Impala is best seen as an additional tool for querying big data. Impala is better suited to interactive queries, while Hive is better suited to batch processing such as ETL.

 

Question-20: What are the advantages of Impala over other existing BI/Reporting tools?

Answer: The benefits of Impala are as follows:

  • Its query syntax is familiar to anyone who already knows SQL.
  • It can query large volumes of data on Hadoop.
  • Queries are distributed across the cluster for high performance.
  • It works well alongside Hive, because Impala can read from and write to Hive tables.
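
One practical detail of sharing tables with Hive is that Impala caches table metadata, so changes made through Hive may need to be announced to Impala. The sketch below picks between Impala's INVALIDATE METADATA and REFRESH statements; the helper function and the table name are hypothetical, and it assumes a setup where metadata is not synchronized automatically.

```python
def impala_sync_statement(table, created_in_hive):
    """Pick the Impala statement to run after a change made through Hive.

    created_in_hive=True  -> the table is new to Impala, so reload its metadata.
    created_in_hive=False -> the table is known, only its data files changed.
    """
    if created_in_hive:
        return f"INVALIDATE METADATA {table}"
    return f"REFRESH {table}"


# Hypothetical table name used for illustration only.
new_table_stmt = impala_sync_statement("sales.orders", created_in_hive=True)
new_data_stmt = impala_sync_statement("sales.orders", created_in_hive=False)
print(new_table_stmt)
print(new_data_stmt)
```

The resulting statement would be run from an Impala client such as impala-shell before querying the table.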