HadoopExam Learning Resources

 

25000+ learners have upgraded or switched careers | Testimonials

All certification preparation material is for renowned vendors such as Cloudera, MapR, EMC, Databricks, SAS, Datastax, Oracle, NetApp, etc., whose certifications carry more value, reliability and recognition in the industry than any training institute's own certification.
Note: You can choose more than one product from below to have a custom package created; send an email to hadoopexam@gmail.com to get a discount.


Do you know?
Hadoop Annual Subscription

PySpark 2.x (Using Python) Professional Training with Hands-On Sessions:
  • In total 21+ modules and 8+ hours
  • Hands-on exercises
  • Core concepts and fundamentals are covered
  • Each video comes with handwritten PDFs and covers the important concepts
Access Training from this Link

Video Demo Sessions


 
   
Python is a must-know language for Data Scientists, Business Analysts and Data Engineers. Hence, Apache had to develop and support its most active project in the Python programming language as well, and so PySpark was developed. Professionals who work in Python do not have to learn a new programming language to work with this most active framework. We have been receiving many requests for PySpark training because of our very successful Spark training in Scala, so we decided to work on PySpark as well and provide even better quality training. Hence, we are very happy to provide this PySpark core training to all the professionals who were waiting for it to be launched.
Hence, if you are already working in Python, don't miss this training; it is a must-learn technology for you. We deliver the training in an easy-to-follow way while keeping equal focus on the core concepts. Please subscribe to this training now. If you are looking for an annual subscription to all the trainings and books available with HadoopExam.com, then visit this page (Annual Subscription). We have more than 50,000 subscribers across all our products, which proves the quality of the material. This training is most useful for the following professionals:
  • Developer
  • Data Engineer
  • Data Analysts
  • Data Scientists 
In this training we focus on both fundamental concepts and hands-on exercises. We start with the concepts and then set up the environment using Ubuntu Linux (on VMware on Windows); the same instructions can be followed for Mac OS as well. Once the environment is created, we cover many hands-on exercises, which will make you an expert in PySpark. As of now the total training length is 6+ hours. You can watch the demo sessions above to check the quality of the training. As you know, the Spark 2.x certification has also been released by Databricks, and they have separate certification material for Python professionals.

Below is the best combo for this certification (Pack3TRNDBSparkPython):

  1. Apache PySpark (Python) Professional Training (Core Spark and Fundamentals)
  2. PySpark (Python) Structured Streaming Professional Training with Hands-On
  3. PR000022 : Databricks Certified Associate Developer for Apache Spark 3.x - Python Certification

PySpark: Hands-On Professional Training + PySpark Structured Streaming + PR000022: PySpark Databricks Certification Spark 3

Regular Price : $400
Offer Price: $209.00 (Save Flat 50%) + Additional $41 discount for next 3 days = $159


Note: If you have trouble paying by credit card, please create a PayPal account and then pay.
India Bank Transfer
Regular Price: 16949 INR
Offer Price: 8475 INR (save flat 50%) + 18% GST = 9999 INR; additional discount (2000 INR) for the next 3 days = 7999 INR only
 
Click Below ICICI Bank Acct. Detail
 
Indian Credit and Debit Card (PayuMoney)

Subscribe to the PySpark (Python) Professional Training

Contact Us After Buying To Get Full Access 

admin@hadoopexam.com
hadoopexam@gmail.com
Phone : 022-42669636
Mobile : +91-8879712614

Regular Price : $179
Offer Price: $89.00 (Save Flat 50%) + Additional $14 discount for next 3 days = $75

Note: If you have trouble paying by credit card, please create a PayPal account and then pay.
India Bank Transfer
Regular Price: 9999 INR
Offer Price: 4237 INR (save flat 50%) + 18% GST = 4999 INR + additional 900 INR discount for the next 3 days = 3599 INR only
 
Click Below ICICI Bank Acct. Detail
 
Indian Credit and Debit Card (PayuMoney)

Most Subscribed Annual Package (this training is part of our Annual Subscription as well)

Hadoop Annual Subscription


Syllabus Covered as Part of This Training (Become a Spark 2.x SQL Expert in around 8+ hours of training)

Module-1: Apache Spark Introduction (PDF Download & Available Length 22 Minutes)
  •   Spark v/s MapReduce
  • Why should Hadoop be used?
  •   HDFS and YARN Intro
Module-2 : Spark and Hadoop Performance Difference (PDF Download & Available Length 24 Minutes)
  • Introduction to Iterative algorithm
  • Multiple Reasons behind Spark High Performance
  • RDD : Native Spark API intro
Module-3 : Spark Architecture (PDF Download & Available Length 17 Minutes )
  • Cluster: Group of Computers
  • Spark Application Components
      • Driver
      • Executors
      • Cluster Manager
  • SparkSession
  • Transformation
      • Lazy Evaluation
      • Narrow Transformation
      • Wide Transformation
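A minimal PySpark sketch of the Module-3 ideas (the app name and sample numbers are illustrative only, not taken from the course videos):

  from pyspark.sql import SparkSession

  # The SparkSession is the entry point of every Spark application.
  spark = SparkSession.builder.appName("ArchitectureDemo").getOrCreate()

  # Narrow transformation: nothing runs yet because of lazy evaluation.
  df = spark.range(1, 1000)
  doubled = df.selectExpr("id * 2 AS doubled")

  # Wide transformation: groupBy requires a shuffle across executors.
  counts = doubled.groupBy((doubled["doubled"] % 10).alias("bucket")).count()

  # Action: only now does the driver schedule work on the executors
  # obtained from the cluster manager.
  counts.show()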
Module-4: Apache Spark DataFrame Introduction (PDF Download & Available Length 21 Minutes)
  • DataFrame
  • DataFrame v/s Dataset
  • Sample API for DataFrame
  • Language Independent Catalyst Optimizer
Module-5A: Install VMware Workstation Player (PDF Download & Available Length 8 Minutes & HandsOn)
Module-5B: Install Ubuntu Linux in VMware Player (PDF Download & Available Length 23 Minutes & HandsOn)
  • Install Ubuntu Image
  • Install SSH server
  • Install Putty and connect to Linux OS
Module-5C: Install Apache Spark (PDF Download & Available Length 17 Minutes & HandsOn)
  • Install Apache Spark
  • Start spark-shell
  • Start pyspark
Module-6: Apache Spark Understanding RDD (PDF Download & Available Length 18 Minutes  )
  • About RDD
  • RDD V/s DataFrame v/s Dataset
  • RDD and Custom Partitioner concept
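A small hands-on flavour of the Module-6 RDD and custom-partitioner concepts (the pair data and the modulo partitioner are purely illustrative):

  from pyspark.sql import SparkSession

  spark = SparkSession.builder.appName("RDDDemo").getOrCreate()
  sc = spark.sparkContext

  # Low-level RDD API: a pair RDD of (customer_id, amount).
  orders = sc.parallelize([(101, 20.0), (205, 15.5), (101, 7.25), (330, 99.0)])

  # Custom partitioner: we decide which partition each key goes to.
  def by_customer(key):
      return key % 4

  partitioned = orders.partitionBy(4, by_customer)

  # reduceByKey can now reuse that partitioning instead of reshuffling.
  totals = partitioned.reduceByKey(lambda a, b: a + b)
  print(totals.collect())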
Module-7: Introduction to the Apache Spark Catalyst Optimizer (PDF Download & Available Length 38 Minutes)
  • What is the Catalyst optimizer?
  • Concepts of Trees and Rules
  • Various Phases of the Catalyst optimizer
      • Analysis
      • Logical optimization
      • Physical planning
      • Code generation
  • Scala features and concepts used by Catalyst
  • Predicate Pushdown
  • Constant Folding
  • Physical operators
  • Projection Pruning
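To see Catalyst at work yourself, a one-off illustrative sketch: explain(True) prints the parsed, analyzed, optimized logical and physical plans, where effects such as constant folding become visible (the sample DataFrame is an assumption, not course data):

  from pyspark.sql import SparkSession
  from pyspark.sql import functions as F

  spark = SparkSession.builder.appName("CatalystDemo").getOrCreate()

  df = spark.range(0, 100).withColumn("double", F.col("id") * 2)

  # Catalyst folds lit(10) + lit(5) into 15 during logical optimization;
  # on file sources such as Parquet the filter would also be pushed down.
  filtered = df.filter(F.col("id") > F.lit(10) + F.lit(5)).select("id")

  # Print the parsed, analyzed, optimized and physical plans.
  filtered.explain(True)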
Module-8 : Apache Spark DataFrame & Dataset API (PDF Download & Available Length 19 Minutes )
  • Direct Acyclic Graph
  • DataFrame v/s Dataset
  • Explicit Schema for DataFrame/Dataset
  • Columns in DataFrame
  • Execution Path and Execution steps
  • Runtime Optimizations
Module-9: Working with Structured API (PDF Download & Available Length 35 Minutes & HandsOn )
  • Schema, StructType and StructFields
  • Manual Schema Assignment
  • Creating and selecting columns
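A minimal sketch of the Module-9 topics (the names and ages are invented sample rows):

  from pyspark.sql import SparkSession
  from pyspark.sql.types import StructType, StructField, StringType, IntegerType
  from pyspark.sql import functions as F

  spark = SparkSession.builder.appName("SchemaDemo").getOrCreate()

  # Manual schema assignment instead of letting Spark infer it.
  schema = StructType([
      StructField("name", StringType(), True),
      StructField("age",  IntegerType(), True),
  ])

  df = spark.createDataFrame([("Amit", 31), ("Sara", 28)], schema)

  # Creating and selecting columns.
  df2 = df.withColumn("age_next_year", F.col("age") + 1)
  df2.select("name", "age_next_year").show()
  df2.printSchema()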
Module-10 : Working with Structured API (PDF Download & Available Length 19 Minutes & HandsOn )
  • Creating Rows
  • expr and selectExpr
  • Basics of Literals
  • Hands on Exercise
  • Unique Rows
  • Explicit Assign Schema
  • Working with columns and Rows
  • Sorting Data
  • Union of Rows
  • Limit
  • Repartition and Coalesce
  • Collecting Rows on the Driver
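The Module-10 operations in one short, illustrative sketch (the city/temperature rows are invented sample data):

  from pyspark.sql import SparkSession, Row
  from pyspark.sql.functions import lit

  spark = SparkSession.builder.appName("StructuredAPIDemo").getOrCreate()

  # Creating Rows explicitly.
  rows = [Row(city="Mumbai", temp=33), Row(city="Pune", temp=29), Row(city="Mumbai", temp=33)]
  df = spark.createDataFrame(rows)

  unique = df.distinct()                                            # unique rows
  plus   = unique.selectExpr("city", "temp + 1 AS temp_plus_one")   # expr / selectExpr
  tagged = plus.withColumn("source", lit("demo"))                   # literal column
  both   = tagged.union(tagged).sort("city").limit(5)               # union, sort, limit

  # repartition (full shuffle) vs coalesce (narrow), then collect rows to the driver.
  print(both.repartition(4).coalesce(2).collect())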
Module-11 : Working with Spark DataTypes and User Defined Function (PDF Download & Available Length 29 Minutes & HandsOn )
  • Spark has its own DataTypes
  • Boolean Expression (True/False)
  • Serially Define the filter
  • Working with Numerical Data
Module-12 : Working with Spark DataTypes and User Defined Function (PDF Download & Available Length 19 Minutes & HandsOn )
  • Working with Character Data
  • Using Regular Expressions
  • Dates and Timestamps
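A combined, illustrative sketch for Modules 11 and 12 (the column names, order data and date format are assumptions, not course data):

  from pyspark.sql import SparkSession
  from pyspark.sql import functions as F

  spark = SparkSession.builder.appName("TypesDemo").getOrCreate()

  df = spark.createDataFrame(
      [("ORD-001", "mumbai", 250.0, "2023-01-15"),
       ("ORD-002", "pune",   999.5, "2023-02-20")],
      ["order_id", "city", "amount", "order_date"])

  # Boolean expressions and serially chained filters on numeric data.
  big = df.filter(F.col("amount") > 300).filter(F.col("city") != "delhi")

  # Character data and regular expressions.
  clean = (big.withColumn("city", F.initcap("city"))
              .withColumn("order_no", F.regexp_extract("order_id", r"ORD-(\d+)", 1)))

  # Dates and timestamps.
  dated = (clean.withColumn("order_date", F.to_date("order_date", "yyyy-MM-dd"))
                .withColumn("loaded_at", F.current_timestamp()))
  dated.show(truncate=False)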
Module-13 : Working with Spark DataTypes and User Defined Function (PDF Download & Available Length 22 Minutes & HandsOn )
  • Struct Data Type
  • Array Data Types
  • Explode Example
  • Map Types
  • User Defined Function
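An illustrative sketch of the Module-13 complex types and a simple UDF (the skills data is made up):

  from pyspark.sql import SparkSession
  from pyspark.sql import functions as F
  from pyspark.sql.types import StringType

  spark = SparkSession.builder.appName("ComplexTypesDemo").getOrCreate()

  df = spark.createDataFrame([("Amit", ["spark", "python"]), ("Sara", ["sql"])],
                             ["name", "skills"])

  # Struct and map columns.
  shaped = (df.withColumn("info", F.struct("name", "skills"))
              .withColumn("meta", F.create_map(F.lit("source"), F.lit("demo"))))

  # explode turns every array element into its own row.
  exploded = shaped.select("name", F.explode("skills").alias("skill"))

  # A user defined function; built-in functions are faster, use UDFs only when needed.
  shout = F.udf(lambda s: s.upper(), StringType())
  exploded.select("name", shout("skill").alias("skill_uc")).show()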
Module-14 : DataFrame Grouping and Aggregations (PDF Download & Available Length 33 Minutes & HandsOn )
  • DataFrame and GroupBy operation
  • Understanding RelationalGroupedDataset
  • Basic Aggregation Operation
  • Working with Complex DataTypes
Module-15 : DataFrame Grouping and Aggregations (PDF Download & Available Length 26 Minutes & HandsOn )
  • Understanding Window Functions
  • Grouping Sets function
  • Pivot
  • Cubes
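The grouping, window, pivot and cube ideas of Modules 14 and 15 in one illustrative sketch (the sales rows are sample data):

  from pyspark.sql import SparkSession
  from pyspark.sql import functions as F
  from pyspark.sql.window import Window

  spark = SparkSession.builder.appName("AggDemo").getOrCreate()

  df = spark.createDataFrame(
      [("mumbai", "2023", 100.0), ("mumbai", "2024", 150.0), ("pune", "2024", 90.0)],
      ["city", "year", "sales"])

  # groupBy returns a grouped dataset; agg runs the aggregations.
  df.groupBy("city").agg(F.sum("sales").alias("total"),
                         F.avg("sales").alias("avg")).show()

  # Window function: rank rows within each city by sales.
  w = Window.partitionBy("city").orderBy(F.col("sales").desc())
  df.withColumn("rank", F.rank().over(w)).show()

  # Pivot and cube for multi-dimensional summaries.
  df.groupBy("city").pivot("year").sum("sales").show()
  df.cube("city", "year").agg(F.sum("sales")).show()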
Module-16: PySpark and Joins (PDF Download & Available Length 28 Minutes & HandsOn )
  • Inner Join
  • Left Outer Join
  • Right Outer Join
  • Full Outer Join
Module-17: PySpark and Joins (PDF Download & Available Length 13 Minutes & HandsOn )
  • Left Semi Join
  • Left Anti Join
  • Shuffle Join
  • Broadcast Join
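All the join types from Modules 16 and 17 in one illustrative sketch (the orders/customers tables are invented):

  from pyspark.sql import SparkSession
  from pyspark.sql.functions import broadcast

  spark = SparkSession.builder.appName("JoinsDemo").getOrCreate()

  orders = spark.createDataFrame([(1, 101), (2, 102), (3, 999)], ["order_id", "cust_id"])
  custs  = spark.createDataFrame([(101, "Amit"), (102, "Sara")], ["cust_id", "name"])

  orders.join(custs, "cust_id", "inner").show()        # inner join
  orders.join(custs, "cust_id", "left_outer").show()   # left outer join
  orders.join(custs, "cust_id", "full_outer").show()   # full outer join
  orders.join(custs, "cust_id", "left_semi").show()    # orders that have a match
  orders.join(custs, "cust_id", "left_anti").show()    # orders without a match

  # Broadcast join: ship the small table to every executor to avoid a shuffle.
  orders.join(broadcast(custs), "cust_id").show()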
Module-18A : Understand RC and ORC File Types (PDF Download & Available Length 9 Minutes  )

Module-18B: Read and Write Data + File Formats (PDF Download & Available Length 23 Minutes & HandsOn)
  • Understanding the DataFrameReader
  • Various data read modes
      • Permissive, DropMalformed, FailFast
  • Working with the DataFrameWriter
  • Save modes
      • Append, Overwrite, Ignore, ErrorIfExists
  • HandsOn Exercises
  • String, Date and Timestamp
  • Working with Fields separator
  • Generating and working with the file formats (Read and Write as well)
  • ORC File
  • Parquet File
  • Json
  • Csv
  • Text
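A sketch of the DataFrameReader/DataFrameWriter flow (the /tmp paths and the options shown are placeholders, not paths used in the course):

  from pyspark.sql import SparkSession

  spark = SparkSession.builder.appName("ReadWriteDemo").getOrCreate()

  # DataFrameReader with an explicit read mode (PERMISSIVE / DROPMALFORMED / FAILFAST).
  df = (spark.read
        .option("header", "true")
        .option("mode", "DROPMALFORMED")
        .csv("/tmp/input/orders.csv"))

  # DataFrameWriter with a save mode (append / overwrite / ignore / errorifexists).
  df.write.mode("overwrite").parquet("/tmp/output/orders_parquet")

  # The same data can also be written as ORC or JSON.
  df.write.mode("ignore").orc("/tmp/output/orders_orc")
  df.write.mode("append").json("/tmp/output/orders_json")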
Module-19: Spark App on the Cluster (PDF Download & Available Length 21 Minutes)
  • Spark Driver Process
  • Spark Executors
  • Cluster Manager
  • Various Execution Modes
      • Cluster Mode
      • Client Mode
      • Local Mode
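A small sketch of local mode, with the cluster and client deploy modes noted in comments (the master URL shown is an assumption for local practice only):

  from pyspark.sql import SparkSession

  # Local mode: driver and executors run inside a single JVM on this machine.
  spark = (SparkSession.builder
           .appName("ClusterDemo")
           .master("local[*]")
           .getOrCreate())

  print(spark.range(10).count())
  spark.stop()

  # On a real cluster the same script is usually launched with spark-submit:
  # --deploy-mode cluster runs the driver on the cluster, while
  # --deploy-mode client keeps the driver on the submitting machine.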
Module-20: Spark App on the Cluster (PDF Download & Available Length 15 Minutes )
  • Submit application and its Flow
  • Spark App Understanding in Depth
  • Application, Job, Stage and Task
  • Spark Shuffle
  • Tasks
  • Pipeline
Module-21: Spark Advanced: Data Partitioning (PDF Download & Available Length 26 Minutes)
  • What is partitioning and why?
  • Data partitioning example using Join (Hash Partitioning)
  • Understanding partitioning using an example: getting recommendations for a customer
  • Understanding the partitioning code using Spark-Scala
  • Operations which create a partitioned RDD
  • Operations which benefit from partitioning
  • Operations that affect partitioning
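An illustrative DataFrame-level partitioning sketch (the sample tables and the /tmp path are assumptions; the module itself also covers the RDD-level view):

  from pyspark.sql import SparkSession

  spark = SparkSession.builder.appName("PartitioningDemo").getOrCreate()

  orders = spark.createDataFrame([(101, 20.0), (205, 15.5), (101, 7.25)],
                                 ["cust_id", "amount"])
  custs  = spark.createDataFrame([(101, "Amit"), (205, "Sara")],
                                 ["cust_id", "name"])

  # Hash-partition both sides on the join key so matching rows sit together,
  # which can cut down the shuffle work the join has to do.
  o = orders.repartition(8, "cust_id")
  c = custs.repartition(8, "cust_id")
  o.join(c, "cust_id").show()

  # Partitioned output on disk: one directory per cust_id value.
  o.write.mode("overwrite").partitionBy("cust_id").parquet("/tmp/orders_by_cust")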

* Please read the FAQ section carefully.

Note: You can choose more than one product from below to have a custom package created; send an email to hadoopexam@gmail.com to get a discount.

Click to view what learners say about us: Testimonials

We have training subscribers from TCS, IBM, Infosys, Accenture, Apple, Hewitt, Oracle, NetApp, Capgemini, etc.



