Hadoop Administrator (Cloudera) Interview Questions-3

Question-11: What is Hive Metastore?

Answer: The Hive metastore is a service that keeps Hive's metadata in a relational database (RDBMS), and Hive requires it in order to work. The backing database can be MySQL, PostgreSQL, Oracle, etc. The metastore usually stores the following information (metadata):

Name of the tables
Columns in the table
Partition information
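Hadoop-specific information, e.g. the HDFS location of each table's data files and their storage format. (The block locations themselves are tracked by the HDFS NameNode.)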
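
As a rough illustration of the kind of metadata listed above, the following Python sketch models one table entry in memory. The class and field names (TableMetadata, location and so on) and the sample table are hypothetical and chosen only for illustration; this is not the metastore's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Column:
    name: str
    type: str


@dataclass
class TableMetadata:
    # Hypothetical, simplified view of what a metastore records per table.
    name: str
    columns: List[Column]
    partition_keys: List[Column] = field(default_factory=list)
    location: str = ""  # where the table's data files live on HDFS
    partitions: Dict[str, str] = field(default_factory=dict)  # partition spec -> directory

    def add_partition(self, spec: str) -> str:
        """Register a partition and return the directory it maps to."""
        path = f"{self.location}/{spec}"
        self.partitions[spec] = path
        return path


# Illustrative table; the names and paths are made up.
sales = TableMetadata(
    name="sales",
    columns=[Column("order_id", "bigint"), Column("amount", "double")],
    partition_keys=[Column("dt", "string")],
    location="hdfs:///warehouse/sales",
)
sales.add_partition("dt=2021-04-24")
```

The point of the sketch is that the metastore describes where and how the data is stored; the data itself stays in files on HDFS.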

Question-12: Can the Hive metastore be used by other Hadoop components?

Answer: Yes. The Hive metastore holds information about the data stored on HDFS, so other Hadoop components such as Impala can use it as well. Even if you do not run Hive queries yourself, these components still rely on the metastore.

Question-13: What do you mean by Remote Mode of Metastore?

Answer: In remote mode, the metastore service runs in its own JVM process, separate from its clients. Any other process that needs to connect to the metastore, for example HiveServer2, HCatalog or Impala, communicates with it over the Thrift network API.
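
Clients are usually pointed at a remote metastore through the hive.metastore.uris property in hive-site.xml. The Python sketch below builds such a fragment; the helper name remote_metastore_site and the host name are assumptions made for illustration, and 9083 is used because it is the port commonly assigned to the metastore service.

```python
import xml.etree.ElementTree as ET


def remote_metastore_site(uris):
    """Build a hive-site.xml fragment pointing clients at remote metastores."""
    root = ET.Element("configuration")
    prop = ET.SubElement(root, "property")
    ET.SubElement(prop, "name").text = "hive.metastore.uris"
    # Multiple metastore URIs are given as a comma-separated list.
    ET.SubElement(prop, "value").text = ",".join(uris)
    return ET.tostring(root, encoding="unicode")


# Hypothetical host name, for illustration only.
xml_text = remote_metastore_site(["thrift://metastore-host.example.com:9083"])
print(xml_text)
```

In a Cloudera-managed cluster this property is normally set through the management console rather than edited by hand, so treat the fragment as a picture of the resulting configuration, not as a deployment procedure.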

Question-14: What is HiveServer2?

Answer: HiveServer2 is a server-side interface; you can think of it as a container for the Hive execution engine. For each client connection, it creates a new execution context for the Hive SQL requests submitted by that client. HiveServer2 supports both JDBC and ODBC clients, which communicate with it through the Thrift API.
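
To make the client side concrete, here is a minimal Python sketch of how a HiveServer2 JDBC URL and a matching Beeline command line are typically put together. The function names, host name and user name are hypothetical; the sketch assumes the common default port 10000 and the "default" database.

```python
def hiveserver2_jdbc_url(host, port=10000, database="default"):
    """Return a JDBC URL of the form jdbc:hive2://host:port/database."""
    return f"jdbc:hive2://{host}:{port}/{database}"


def beeline_command(url, user):
    """Return the argument list for launching Beeline against that URL."""
    return ["beeline", "-u", url, "-n", user]


# Hypothetical host and user, for illustration only.
url = hiveserver2_jdbc_url("hs2-host.example.com")
cmd = beeline_command(url, "etl_user")
print(" ".join(cmd))
```

Secured clusters usually need extra URL parameters (for example for Kerberos or TLS), which are left out of this sketch.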

Question-15: Can Hive use Apache Spark as a computation engine?

Answer: Yes. Traditionally Hive used MapReduce as its computation engine, but Spark is generally much faster than MapReduce, so many deployments configure Hive to use Spark as the computation engine instead.
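
The engine is selected with the hive.execution.engine property, which can be set per session. The Python sketch below generates the corresponding SET statement; the helper name set_engine_statement is hypothetical, and whether a given engine is actually available depends on the Hive version and distribution in use.

```python
# Values accepted by the hive.execution.engine property.
KNOWN_ENGINES = {"mr", "tez", "spark"}


def set_engine_statement(engine):
    """Return the session-level SET statement for the chosen engine."""
    engine = engine.lower()
    if engine not in KNOWN_ENGINES:
        raise ValueError(f"unknown execution engine: {engine}")
    return f"SET hive.execution.engine={engine};"


statement = set_engine_statement("spark")
print(statement)
```

The generated statement would be run in a Hive session (for example through Beeline) before the queries it should apply to.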

Details: Category: Hadoop Administrator; Last Updated: 24 April 2021
