Question-21: What are all the possible clients for Impala?
Answer: The following components can act as Impala clients, either to run queries or to administer the Impala environment:
- Hue (Web Interface for querying)
- Impala Shell
- ODBC
- JDBC
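As a minimal sketch of driving the Impala Shell client listed above from a script, the helper below only assembles an `impala-shell` command line; it does not contact a cluster. The function name `build_impala_shell_command`, the host name, and the table name are hypothetical and chosen for illustration. The `-i` (impalad host), `-q` (query) and `-d` (database) options are the ones assumed here.

```python
import shlex


def build_impala_shell_command(host, query, database=None):
    """Return an argv list that runs a single query through impala-shell."""
    # -i names the impalad to connect to, -q passes the query text.
    argv = ["impala-shell", "-i", host, "-q", query]
    if database is not None:
        # -d selects the database to use for the session.
        argv += ["-d", database]
    return argv


if __name__ == "__main__":
    # Hypothetical host and table names, for illustration only.
    cmd = build_impala_shell_command(
        "impalad-host.example.com",
        "SELECT COUNT(*) FROM web_logs",
        database="default",
    )
    # Print a shell-safe version of the command instead of executing it.
    print(shlex.join(cmd))
```

The returned list could be handed to `subprocess.run` on a machine where `impala-shell` is installed.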
Question-22: Can Impala use the Hive Metastore?
Answer: Yes. Impala can use the Hive Metastore, which holds information about the available data and tells Impala its structure: schemas, table names, column names, and so on.
Question-23: Can you give a basic overview of how queries are executed in Impala?
Answer: An Impala daemon process (impalad) typically runs on each DataNode of the HDFS cluster and is responsible for executing and coordinating queries. Any Impala daemon can receive, plan, and coordinate a query submitted by an Impala client. The query is then distributed among the Impala nodes, which act as workers and execute their parts of it in parallel.
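The coordinate-then-execute-in-parallel pattern described above can be illustrated with a toy simulation. This is not Impala code: the function names `run_fragment` and `coordinate_query` and the sample data are hypothetical, and threads stand in for worker nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def run_fragment(partition):
    """Worker step: compute a partial aggregate over one node's local data."""
    return sum(partition)


def coordinate_query(partitions):
    """Coordinator step: fan fragments out in parallel, then merge the partial results."""
    # One thread per partition stands in for one worker node per data slice.
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        partial_sums = list(pool.map(run_fragment, partitions))
    # Final merge happens on the coordinator before returning to the client.
    return sum(partial_sums)


if __name__ == "__main__":
    # Three hypothetical nodes, each holding a slice of the table's rows.
    node_data = [[1, 2, 3], [4, 5], [6]]
    print(coordinate_query(node_data))
```

Each worker only touches its own slice of the data, and the coordinator only combines small partial results, which is the property that lets this kind of execution scale with the number of nodes.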
Question-24: What is Apache Kudu?
Answer: Apache Kudu is a columnar storage manager developed for the Hadoop platform. Kudu shares the common technical properties of Hadoop ecosystem applications:
- Runs on commodity hardware
- Horizontally scalable
- Supports highly available operation
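To make the term "columnar" in the answer above concrete, here is a toy sketch of pivoting row-oriented records into per-column lists. It is only an illustration of the idea and says nothing about Kudu's actual on-disk format; the function names `to_columns` and `column_total` and the sample records are hypothetical.

```python
def to_columns(rows):
    """Pivot row-oriented records into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def column_total(columns, name):
    """Aggregate a single column without touching any other column's values."""
    return sum(columns[name])


if __name__ == "__main__":
    # Hypothetical sample records, for illustration only.
    rows = [
        {"id": 1, "city": "Pune", "amount": 10.0},
        {"id": 2, "city": "Oslo", "amount": 2.5},
        {"id": 3, "city": "Lima", "amount": 7.5},
    ]
    columns = to_columns(rows)
    print(column_total(columns, "amount"))
```

An analytic query that reads only a few columns benefits from this layout because the values it needs are stored together.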
Question-25: What are some benefits of Apache Kudu?
Answer: Some of the benefits of Kudu are:
- Fast processing of OLAP workloads
- Easy integration with MapReduce, Spark, Flume, and other Hadoop ecosystem components
- Tight integration with Impala
- Strong but flexible consistency model, e.g. consistency requirements can be chosen on a per-request basis
- Strong performance when running sequential and random workloads simultaneously
- Can be managed using Cloudera Manager
- Structured data model
- Highly available