Question-26: What kinds of applications is Kudu best suited for?
Answer: The following kinds of applications are difficult to implement with currently available Hadoop technologies, but Kudu can help:
- Reporting applications: Where new data must be immediately available to end users.
- Time-series applications: Where you need to query large amounts of historical data as well as run granular queries on an individual entity.
- Predictive models: Applications that use predictive models to make real-time decisions, with periodic refreshes of the models based on historical data.
Question-27: What is Apache Sentry?
Answer: Apache Sentry is a granular, role-based authorization module for Hadoop. It works as a pluggable authorization engine for Hadoop components. With it, we can define authorization rules that validate a user's or application's access requests for Hadoop resources.
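To make the role-based idea concrete, here is a minimal sketch of how such a check can be modeled: users belong to groups, groups are granted roles, and roles carry privileges on resources. This is a toy illustration only, not Sentry's API; the class name `ToyAuthorizer`, the `Privilege` type and the resource string format are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Privilege:
    resource: str  # hypothetical naming, e.g. "db=sales/table=orders"
    action: str    # e.g. "SELECT", "INSERT", or "ALL"


class ToyAuthorizer:
    """Toy role-based authorizer: user -> groups -> roles -> privileges."""

    def __init__(self):
        self.user_groups = {}  # user name  -> set of group names
        self.group_roles = {}  # group name -> set of role names
        self.role_privs = {}   # role name  -> set of Privilege objects

    def add_user_to_group(self, user, group):
        self.user_groups.setdefault(user, set()).add(group)

    def grant_role_to_group(self, role, group):
        self.group_roles.setdefault(group, set()).add(role)

    def grant_privilege(self, role, resource, action):
        self.role_privs.setdefault(role, set()).add(Privilege(resource, action))

    def is_allowed(self, user, resource, action):
        # Walk user -> groups -> roles -> privileges and look for a match.
        for group in self.user_groups.get(user, ()):
            for role in self.group_roles.get(group, ()):
                for priv in self.role_privs.get(role, ()):
                    if priv.resource == resource and priv.action in (action, "ALL"):
                        return True
        return False


auth = ToyAuthorizer()
auth.add_user_to_group("alice", "analysts")
auth.grant_role_to_group("read_sales", "analysts")
auth.grant_privilege("read_sales", "db=sales/table=orders", "SELECT")
```

With this setup, `auth.is_allowed("alice", "db=sales/table=orders", "SELECT")` succeeds, while an `INSERT` request by the same user, or any request by a user with no group membership, is rejected.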
Question-28: On Cloudera CDH6, which cluster managers are supported for Apache Spark?
Answer: As of CDH6, the Spark Standalone cluster manager is no longer supported, so you have to use YARN as the cluster manager. Spark 1.6 is also not supported on CDH6.
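As a small illustration of targeting YARN, the sketch below builds a `spark-submit` command line with `--master yarn`. The helper name `build_spark_submit` and the application path are hypothetical; the sketch also assumes the launcher on your gateway host is named `spark-submit`, which you should check against your own installation.

```python
def build_spark_submit(app_path, deploy_mode="cluster", app_args=()):
    """Return an argument list that submits a Spark application to YARN.

    Hypothetical helper for illustration; it only builds the command,
    it does not run it.
    """
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    return [
        "spark-submit",
        "--master", "yarn",            # YARN is the cluster manager
        "--deploy-mode", deploy_mode,  # where the driver runs
        app_path,
        *app_args,
    ]


# Example: a placeholder application path, not a real file.
cmd = build_spark_submit("my_app.py", deploy_mode="client", app_args=["2019-01-01"])
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a host where the Spark client is configured.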
Question-29: What is Cloudera Manager used for?
Answer: Cloudera Manager is an end-to-end application for managing CDH clusters. With it, we can easily deploy and centrally operate the complete CDH stack and other managed services.
Question-30: Can you please explain how Cloudera Manager, hosts and clusters are associated?
Answer: A cluster is a logical entity that contains a set of hosts and is managed by Cloudera Manager.
- Only a single version of CDH is installed on the hosts of a cluster, e.g. either CDH-5 or CDH-6.
- Services and role instances run on the hosts, e.g. an HDFS instance.
- A single host can belong to only one cluster.
- Cloudera Manager can manage more than one cluster.
- A single cluster can be associated with only one Cloudera Manager.
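The relationships listed above can be sketched as a small data model that enforces the two "only one" rules. This is an illustrative toy, not the Cloudera Manager API; the class names `ManagerModel`, `ClusterModel` and `HostModel` and the host names are assumptions made for the example.

```python
class HostModel:
    def __init__(self, name):
        self.name = name
        self.cluster = None  # a host belongs to at most one cluster
        self.roles = []      # role instances running here, e.g. "DataNode"


class ClusterModel:
    def __init__(self, name, cdh_version):
        self.name = name
        self.cdh_version = cdh_version  # one CDH version per cluster
        self.hosts = {}
        self.manager = None             # exactly one manager per cluster

    def add_host(self, host):
        if host.cluster is not None and host.cluster is not self:
            raise ValueError(f"{host.name} already belongs to {host.cluster.name}")
        host.cluster = self
        self.hosts[host.name] = host


class ManagerModel:
    def __init__(self, name):
        self.name = name
        self.clusters = {}  # one manager can manage many clusters

    def add_cluster(self, cluster):
        if cluster.manager is not None and cluster.manager is not self:
            raise ValueError(f"{cluster.name} is already managed by {cluster.manager.name}")
        cluster.manager = self
        self.clusters[cluster.name] = cluster


cm = ManagerModel("cm-1")
prod = ClusterModel("prod", cdh_version="6")
dev = ClusterModel("dev", cdh_version="5")
cm.add_cluster(prod)
cm.add_cluster(dev)

node = HostModel("node-01")
node.roles.append("DataNode")
prod.add_host(node)
```

Here one manager holds two clusters with different CDH versions, while attempting `dev.add_host(node)` or adding `prod` to a second manager raises a `ValueError`.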