Question-1: What is Cloudera Enterprise?

Answer: Cloudera Enterprise is a combined platform for machine learning, analytics, data engineering, and related workloads. It includes the following solutions:

  • Data Warehouse
  • Data Science
  • Data Engineering
  • Operational Database
  • Deployment in cloud, multi-cloud, or hybrid cloud environments

Basically, it is a combination of the following three things:

  • Open-source CDH (includes Hadoop and its ecosystem)
  • Cloudera Manager (licensed product from Cloudera)
  • Cloudera Navigator (licensed product from Cloudera)

 

 

Question-2: How does Cloudera Enterprise differ from Cloudera Altus?

Answer: Cloudera Altus provides almost the same capabilities as Cloudera Enterprise, but in public clouds such as AWS, Azure, and Google Cloud.

 

Question-3: Can you please explain the use of Cloudera Shared Data Experience (SDX)?

Answer: With Cloudera's solutions such as Cloudera Enterprise, data warehouse, data engineering, and operational database workloads can all run together on a single platform. In such cases, Cloudera Shared Data Experience (SDX) enables these diverse analytic processes to operate against a shared data catalog with common security, governance policies, and schema. Even if your entire cloud environment is terminated, all the metadata still persists.

 

Question-4: Can you please tell me which components are used as part of the Cloudera Data Warehouse solution?

Answer: Currently, the following 5 major components make up the Cloudera Data Warehouse solution:

  • Apache Impala: You can run SQL/BI analytics on data stored in any of the following:
    • AWS S3
    • Microsoft Azure Data Lake 
    • HDFS
    • Apache Kudu
  • Hive on Spark: This helps in creating faster ETL/ELT solutions for BI and reporting.
  • Hue (SQL development workbench): It can support thousands of SQL developers at a time.
  • Workload XM: It is used to analyze current workloads, perform query analysis, and optimize cluster resources.
  • Cloudera Navigator

 

Question-5: As part of the Cloudera Enterprise Data Science solution, which underlying products are mainly used, or which does it run on?

Answer: Currently, the following 3 major components are used:

  • Cloudera Data Science Workbench: CDSW provides on-demand access to runtimes for R, Python, and Scala, along with integration with the Spark framework on CDH. For deep learning, it also supports GPU-accelerated computing, so data scientists can use frameworks such as TensorFlow, MXNet, and Keras.
  • Apache Spark: Using Spark, you can run in-memory data processing.
  • Cloudera Fast Forward Labs: Using this, you can design and execute your enterprise machine learning strategy.