I have been working with Spark for many years and know the RDD API well. What should I use in the exam?
Answer: Many of our learners are confused about the RDD API being part of the syllabus. The Spark community recommends that programmers on Spark 2.x or later avoid RDDs in their programs and use the Spark SQL API or the DataFrame/Dataset API instead. Why?
Yes, that is true: if you are already on Spark 2.x, you should avoid RDDs in your program as far as possible.
Use the RDD API only when it is absolutely necessary. Broadcast variables and accumulators are one such case: they are created through the SparkContext, which belongs to the low-level API.