Question-81: Can Cassandra return data that is already marked with a tombstone?

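Answer: Yes, it is possible because of the distributed nature of the database. A tombstone may not have been propagated to every node that holds a replica of the record, for example because a replica was unreachable when the delete was issued. A read served by such a replica at a low consistency level can still return the data as live. If the tombstone is then purged once “gc_grace_seconds” has expired, before a repair has delivered it to the stale replica, the deleted data can reappear permanently.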
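
The scenario can be illustrated with a small simulation. This is only a sketch in plain Python, not Cassandra code: the Cell class, the read function and the replica dictionaries are hypothetical names made up for illustration, and a tombstone is modelled simply as a cell whose value is None. It assumes last-write-wins resolution by timestamp among the replicas a read contacts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    value: Optional[str]  # None models a tombstone (illustrative convention)
    timestamp: int        # write time; the newest timestamp wins


def read(replicas, key, contacted):
    """Return the newest value among the contacted replicas (last write wins)."""
    cells = [replicas[i][key] for i in contacted if key in replicas[i]]
    if not cells:
        return None
    return max(cells, key=lambda c: c.timestamp).value


# Three replicas all hold the original write.
replicas = [{"k": Cell("v1", timestamp=1)} for _ in range(3)]

# The delete reaches replicas 0 and 1 only; replica 2 was unreachable.
for i in (0, 1):
    replicas[i]["k"] = Cell(None, timestamp=2)

# A read that contacts only the stale replica still sees live data.
stale_read = read(replicas, "k", contacted=[2])

# A read that contacts two replicas always includes a tombstone here.
quorum_read = read(replicas, "k", contacted=[1, 2])
```

In this model the single-replica read returns the old value while the two-replica read returns nothing, which is why a higher read consistency level and regular repair reduce the chance of seeing deleted data.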

Question-82: What is the difference between Anti-entropy repair and NodeSync repair?

Answer: Anti-entropy repair is used for routine maintenance and is initiated manually with the nodetool repair command. NodeSync, a DataStax Enterprise feature, runs continuously in the background and repairs replicas automatically.

Question-83: While running Anti-entropy repair, what data structures are used to represent the data?

Answer: Anti-entropy repair is done using Merkle trees. A Merkle tree is a binary hash tree whose leaves are hashes of individual key values and whose inner nodes are hashes of their children. Anti-entropy is the process of comparing the data in all replicas and updating each replica to the newest version:

  • First, build a Merkle tree for each replica.
  • Then, compare the Merkle trees to discover the differences.
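
The two steps above can be sketched in Python. This is a minimal illustration, not Cassandra's implementation: the function names build_merkle and diff_leaves are hypothetical, SHA-256 is chosen arbitrarily, and both replicas are assumed to hold the same keys in the same order so that the two trees have the same shape.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_merkle(rows):
    """Build tree levels, leaves first, from a list of (key, value) pairs."""
    leaves = [_h(f"{k}={v}".encode()) for k, v in rows]
    # Pad to a power of two so every inner node has exactly two children.
    size = 1
    while size < len(leaves):
        size *= 2
    leaves += [_h(b"")] * (size - len(leaves))
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([_h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def diff_leaves(a, b):
    """Return indices of differing leaves, descending only into mismatched subtrees."""
    assert len(a) == len(b), "trees must have the same depth"
    top = len(a) - 1
    mismatched = [0] if a[top][0] != b[top][0] else []
    for depth in range(top - 1, -1, -1):
        mismatched = [
            child
            for node in mismatched
            for child in (2 * node, 2 * node + 1)
            if a[depth][child] != b[depth][child]
        ]
    return mismatched


replica_a = [("k1", "a"), ("k2", "b"), ("k3", "c"), ("k4", "d")]
replica_b = [("k1", "a"), ("k2", "b"), ("k3", "stale"), ("k4", "d")]

tree_a = build_merkle(replica_a)
tree_b = build_merkle(replica_b)
out_of_sync = diff_leaves(tree_a, tree_b)  # leaf positions that need repair
```

If the root hashes match, the replicas are identical and no further comparison is needed. Otherwise only the subtrees whose hashes differ are explored, so the amount of data exchanged is proportional to the differences rather than to the full data set.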

Question-84: What is chunk cache in Cassandra?

Answer: To improve read performance, Cassandra divides SSTables into chunks and keeps these chunks in memory. This is known as the chunk cache. Its size is configured in the cassandra.yaml file through the “file_cache_size_in_mb” parameter.

Question-85: What is the difference between OS page cache and chunk cache?

Answer: The OS page cache is sized dynamically by the operating system and can grow and shrink based on the available memory. The chunk cache, however, is configured statically in the cassandra.yaml file.

You should not set the chunk cache size too small or too large. If you want to change the chunk cache size, you have to restart the Cassandra node.