Why do people say to learn the Spark API by heart for the CRT020 exam?
Answer: As we suggested before, we have seen learners fail to complete the assessment exam on time because they spend too much time in the documentation. We again suggest that you memorize as much of the API as possible, especially the parts that are used frequently, such as Row, DataFrame, select, filter, distinct, foreach, take, persist, format, load, StructType, StructField, etc.
Also memorize how to set properties such as “spark.sql.shuffle.partitions” and whether each one is set on the SparkSession or the SparkContext. HadoopExam will soon be creating quick reference or revision notes for the same.