www.HadoopExam.com

HadoopExam Learning Resources

What is RCFile? Apache Hive

RCFile (Record Columnar File) is a file format used to store data efficiently. It has several advantages when working with Apache Hive:

1. Efficient Data Storage
2. Fast query retrieval
3. Less I/O

RCFile partitions your data horizontally into groups of rows, and then vertically by column within each row group. So when fetching data, Hive does not have to read every row in full: a query that needs only some columns can skip the rest.
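
The row-and-column partitioning described above can be sketched in a few lines of Python. This is only a simplified in-memory illustration, not RCFile's actual on-disk format. The function names to_row_groups and read_column, the sample rows, and the group size are all assumptions made for this example.

```python
from typing import Dict, List

Row = Dict[str, object]
RowGroup = Dict[str, List[object]]


def to_row_groups(rows: List[Row], group_size: int) -> List[RowGroup]:
    """Split rows horizontally into groups, then store each group column by column."""
    groups: List[RowGroup] = []
    for start in range(0, len(rows), group_size):
        chunk = rows[start:start + group_size]
        # Vertical partitioning: one list of values per column within this group.
        columns = {name: [row[name] for row in chunk] for name in chunk[0]}
        groups.append(columns)
    return groups


def read_column(groups: List[RowGroup], name: str) -> List[object]:
    """Collect one column, touching only that column's values in each row group."""
    values: List[object] = []
    for columns in groups:
        values.extend(columns[name])
    return values


# Hypothetical sample data, for illustration only.
rows = [
    {"id": 1, "name": "alice", "city": "Pune"},
    {"id": 2, "name": "bob", "city": "Delhi"},
    {"id": 3, "name": "carol", "city": "Mumbai"},
]
groups = to_row_groups(rows, group_size=2)
names = read_column(groups, "name")
```

Here read_column never looks at the id or city values, which is the idea behind the reduced I/O: a query on a single column reads only that column's data inside each row group.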

Please watch the video below (part of Hadoop Training / Hadoop Tutorial) to understand this in detail.

What is SerDe? Hadoop Training, Apache Hive Training

Apache Hive uses a SerDe (and a FileFormat) to read data from and write data to tables. SerDe is short for Serializer/Deserializer.

An important concept behind Hive is that it DOES NOT own the format in which data is stored in the Hadoop Distributed File System (HDFS). Users can write files to HDFS with whatever tools or mechanisms they choose, and then use Hive to correctly "parse" that file format so that Hive can work with the data.

So when data is selected from an Apache Hive table, the SerDe.deserialize() method is called, and when data is inserted, the SerDe.serialize() method is called.
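
The two directions can be illustrated with a toy example. This is a minimal sketch in Python, not Hive's actual Java SerDe interface: the class name DelimitedSerDe, its constructor arguments, and the comma delimiter are all hypothetical choices made for this illustration.

```python
from typing import Dict, List


class DelimitedSerDe:
    """Hypothetical toy SerDe for delimiter-separated text lines (not Hive's API)."""

    def __init__(self, columns: List[str], delimiter: str = ",") -> None:
        self.columns = columns
        self.delimiter = delimiter

    def deserialize(self, line: str) -> Dict[str, str]:
        """Read path (SELECT): turn one stored line into a row of named fields."""
        return dict(zip(self.columns, line.split(self.delimiter)))

    def serialize(self, row: Dict[str, object]) -> str:
        """Write path (INSERT): turn a row of named fields into one stored line."""
        return self.delimiter.join(str(row[name]) for name in self.columns)


serde = DelimitedSerDe(columns=["id", "name"])
row = serde.deserialize("1,alice")                  # what a SELECT would see
line = serde.serialize({"id": 2, "name": "bob"})    # what an INSERT would store
```

The stored file stays in whatever format it was written in. Only the SerDe knows how to map that format to rows and back, which is why Hive does not need to own the storage format.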

Please watch the video below (part of Hadoop Training / Hadoop Tutorial) to understand this in detail.