Question 9: Fill in the blanks

Each time the memory structure is full, the data is written to disk in an SSTable data file. All writes are automatically partitioned and replicated throughout the cluster. Cassandra periodically consolidates SSTables using a process called __________, discarding obsolete data marked for deletion with a ___________.

To ensure all data across the cluster stays consistent, various repair mechanisms are employed.

  1. Compaction, tombstone
  2. Tombstone, compaction
  3. Serialization, compaction
  4. Repair, compaction
  5. Tombstone, repair

Correct Answer: 1

Explanation: Cassandra is designed to handle big data workloads across multiple nodes with no single point of failure. Its architecture is based on the understanding that system and hardware failures can and do occur. Cassandra addresses the problem of failures by employing a peer-to-peer distributed system across homogeneous nodes, where data is distributed among all nodes in the cluster. Each node frequently exchanges state information about itself and other nodes across the cluster using a peer-to-peer gossip communication protocol. A sequentially written commit log on each node captures write activity to ensure data durability. Data is then indexed and written to an in-memory structure called a memtable, which resembles a write-back cache. Each time the memory structure is full, the data is written to disk in an SSTable data file. All writes are automatically partitioned and replicated throughout the cluster.
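
As a rough illustration of the write path described above, the following toy Python sketch models a commit log, a memtable, and flushes to immutable SSTables on a single node. The class name `ToyNode`, the `memtable_limit` parameter, and the in-memory lists standing in for disk files are assumptions made for illustration; they are not Cassandra's actual API or storage format, and partitioning and replication are left out.

```python
class ToyNode:
    """Toy model of one node's write path (not Cassandra's real implementation)."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit  # hypothetical flush threshold (entry count)
        self.commit_log = []  # append-only record of writes, kept for durability
        self.memtable = {}    # in-memory structure holding recent writes
        self.sstables = []    # flushed, immutable tables (oldest first)

    def write(self, key, value):
        # 1. Append to the commit log first so the write could be replayed after a crash.
        self.commit_log.append((key, value))
        # 2. Apply the write to the memtable.
        self.memtable[key] = value
        # 3. When the memtable is full, flush it to a new SSTable.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # An SSTable is sorted by key and never modified after it is written.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        # The flushed writes now live in the SSTable, so their log entries can go.
        self.commit_log.clear()


node = ToyNode(memtable_limit=2)
node.write("b", 1)
node.write("a", 2)  # memtable reaches the limit and is flushed
node.write("c", 3)  # stays in the memtable until the next flush
```

After these three writes, the node holds one SSTable containing the sorted entries for `a` and `b`, while `c` is still only in the memtable and the commit log.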

Cassandra periodically consolidates SSTables using a process called compaction, discarding obsolete data marked for deletion with a tombstone. To ensure all data across the cluster stays consistent, various repair mechanisms are employed.
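
To make the idea of compaction concrete, here is a minimal Python sketch that merges several SSTables into one and drops entries whose latest value is a tombstone. The `TOMBSTONE` marker, the `compact` function, and the use of plain dictionaries as SSTables are hypothetical simplifications. Real Cassandra compares write timestamps and keeps tombstones for a grace period (`gc_grace_seconds`) before purging them, which this sketch ignores.

```python
TOMBSTONE = object()  # hypothetical marker standing in for a deletion tombstone


def compact(sstables):
    """Merge SSTables (ordered oldest to newest) into a single table.

    Newer values overwrite older ones, and keys whose latest value is a
    tombstone are dropped from the result.
    """
    merged = {}
    for table in sstables:  # later tables are newer, so their values win
        merged.update(table)
    # Keep keys in sorted order and discard anything marked as deleted.
    return {k: merged[k] for k in sorted(merged) if merged[k] is not TOMBSTONE}


old = {"a": 1, "b": 2}
new = {"a": 10, "b": TOMBSTONE, "c": 3}
result = compact([old, new])
```

Here `result` contains only `a` (with its newer value) and `c`; the key `b` disappears because its most recent entry is a tombstone.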