HadoopExam Learning Resources


BigData | DataScience | IOT | Cloud | DevOps | ITRisk | AI | BlockChain



    25000+ learners upgraded or switched careers

All certification preparation material is for renowned vendors such as Cloudera, MapR, EMC, Databricks, SAS, Datastax, Oracle, and NetApp, whose certifications carry more value, reliability, and recognition in the industry than certifications from training institutes.
Note: You can choose more than one product from the list below to have a custom package created; send an email to hadoopexam@gmail.com to get a discount.

Do you know?
Premium Training Courses: HadoopExam focuses on in-depth learning through hands-on sessions: you set up the environment, then execute the solutions yourself. Below are the available trainings, and we keep adding new ones. These trainings are used and subscribed to by developers, testers, administrators, enterprises (to train their teams), and trainers globally. They are well organized, with step-by-step solutions, so you can complete them in less time at your convenience and revisit them as required.

Read Spark SQL Fundamental and Cookbook: https://sites.google.com/training4exam.com/spark-sql-2-x-fundamentals/

Table of Contents

Chapter-1: Spark SQL Introduction
  • Introduction
  • Sample program using both SQL Query and API
Chapter-2: Catalyst Optimizer
  • Catalyst optimizer Introduction
  • Objectives of Catalyst optimizer
  • Catalyst Library
  • Internal Representation
  • Catalyst Tree
  • Four phases of Catalyst optimization
  • Analysis phase
  • Logical optimization
  • Physical planning
  • Code Generation
Chapter 3: Project Tungsten
  • Introduction to Project Tungsten
  • Code generation
  • CPU Bound operations
Chapter-4: Setting up Spark Environment
  • Compare free vs paid version
  • Part-2: Installing Ubuntu Linux on VMware
  • Part-3: Setting up Spark 2.0 environment on Ubuntu
Chapter-5: SparkSQL Schema
  • Schema Inference
  • Explicitly assigning schema
  • Schema Inference using reflection
  • Explicitly creating schema using StructType and StructFields
Chapter 6: SparkSQL abstractions & Other Objects
  • About SparkSession
  • Submitting Spark applications
  • SparkConf object
  • Providing custom rules and optimization technique
  • SparkSQL Row (Catalyst Row) object
  • Resilient Distributed Dataset
  • DataFrame
  • Dataset
  • DataFrame to Dataset conversion
  • Dataset and Type-safety
  • Dataset and Catalyst optimizer
  • Dataset and compile time type safety
  • Working with Dataset
  • Transient
  • Spark Case classes
  • Dataset vs RDD operations
  • Converting an RDD to Dataset
  • Local Datasets
  • Dataset and Project Tungsten
  • Dataset and Encoder
Chapter 7: DataFrameReader and DataFrameWriter
  • Assigning a schema while reading the data
  • Handling corrupted records in CSV/JSON files
  • Reading a text file as a whole
  • Setting the time zone for the data
  • Reading data from a JDBC data source
  • Filtering data at the source
  • Reading SparkSQL table as DataFrame
  • DataFrameWriter
  • Partitioning and bucketing
  • Bucketing
  • Data Compressions
  • Columns in Dataset
Chapter 8: SparkSQL and Hive Support
  • Spark SQL and Hive Query Support
  • Hive Metastore
  • Hive Support in SparkSQL
  • Hive Query support using SparkSQL
Chapter 9: SparkSQL and JSON
  • Read JSON data in Spark
  • Example of loading multiple JSON files
  • Explicitly assigning schema to loaded JSON Data
  • Loading JSON data and using SQL queries
  • Infer the schema from Data
  • SparkSQL using JSON data full example
Chapter 10: SparkSQL and Encoders
  • Implicit Objects
  • Encoders (Serialization and De-serialization)
  • Creating Encoders
  • Hands on Exercise for SparkSQL Encoders
Chapter 11: Caching and Check-pointing
  • Dataset and Caching
  • SparkSQL and Caching
  • Check-pointing in SparkSQL
  • Types of Checkpoints
  • Caching (disk only) vs check-pointing
  • Performance Improvements
  • Other important points about checkpointing
Chapter-12: Dataset and Joins
  • Joins Introduction
  • Broadcast Join
  • SparkSQL and Hint
Chapter-13: RelationalGroupedDataset
  • RelationalGroupedDataset
  • Multi Dimension aggregations
  • Dataset Aggregation API
  • Hands on Exercises for Multi-Dimensional Operator
Chapter-14: SparkSQL Functions
  • Spark SQL Functions
  • Standard or User Defined Functions
  • UDF: User Defined Functions
  • Exercise for User Defined Function and User Defined Aggregate Functions
  • Aggregate functions
  • Collection functions
  • About explode function
  • Date and Time Functions
  • Window Aggregate Functions
  • Non-aggregate functions
  • Sorting functions
  • String functions
  • More Window Functions Example
  • Examples of rank and dense_rank functions (Window function)
  • NTILE (Window) function
  • Cumulative Distribution
Chapter-15: Dataset Actions and Transformations
  • Dataset Partitioning
  • About coalesce operator of Dataset
  • Dataset typed transformations
  • Actions on the Dataset
Chapter-16: Spark Certifications
  • Databricks Certifications
  • How to prepare for Databricks Spark Certifications
  • Cloudera Hadoop and Spark Developer Certifications
  • Hortonworks Spark Certification preparation material
  • MapR Spark Certifications


All Premium Training Access Annual Subscription (you get early access to trainings under development and early-edition books): used by more than 20000 subscribers

Spark Professional Training | Spark SQL Hands Training | Apache NiFi (Hortonworks DataFlow) Training | Hadoop Professional Training | Cloudera Hadoop Admin Training Course-1 | HBase Professional Training | SAS Base Certification Hands On Training

Oozie Professional Training | AWS Solution Architect Associate Training | Free Core Java 1Z0-808 Training | Scala Professional Training | Python Professional Training | Book: AWS Solution Architect Associate: Little Guide | NiFi CookBook By HadoopExam
