Question-18: Which of the following apply when you design your table?

  1. One of the nodes in the cluster should hold all the data from all the remaining nodes in the cluster.
  2. Each node in the cluster should hold a roughly equal amount of data.
  3. The partition key should come first when defining the primary key.
  4. When reading data, you should try to read from as many partitions as possible.

Ans: 2, 3

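Exp: Each node in the cluster should hold a roughly equal amount of data, so that the cluster remains balanced. When defining the primary key, make sure the partition key comes first: the first column (or the first parenthesized group of columns, for a composite partition key) is the partition key, and any remaining columns are clustering columns.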
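To see why the choice of partition key decides how evenly data spreads, here is a minimal Python sketch. It is a simulation, not Cassandra itself: `node_for` and `rows_per_node` are hypothetical helpers, MD5 stands in for the cluster's real partitioner, and a simple modulo replaces the token ring. The only point is that rows with the same partition key always land on the same node.

```python
import hashlib


def node_for(partition_key: str, node_count: int) -> int:
    """Map a partition key to a node index (illustrative placement only)."""
    # Assumption: MD5 plus modulo is a stand-in for the real partitioner and token ring.
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def rows_per_node(partition_keys, node_count: int) -> dict:
    """Count how many rows land on each node, one row per key in the input."""
    counts = {node: 0 for node in range(node_count)}
    for key in partition_keys:
        counts[node_for(key, node_count)] += 1
    return counts


# Hypothetical data: 10,000 rows keyed by a unique user id vs. by one country code.
balanced = rows_per_node([f"user-{i}" for i in range(10_000)], node_count=4)
skewed = rows_per_node(["IN"] * 10_000, node_count=4)

print("unique user ids :", balanced)
print("single country  :", skewed)
```

With many distinct key values the four counts come out close to each other, while a single shared key value puts every row on one node and leaves the other three empty.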

A partition is a group of rows that share the same partition key. When you issue a read query, it should read rows from as few partitions as possible.

Each partition may reside on a different node in the cluster, and the coordinator node generally needs to issue a separate command to a separate node for each partition you request. This adds overhead and latency. Even on a single-node cluster, reading data across partitions is expensive.
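The cost of a read can be estimated by counting how many distinct partitions it touches. The sketch below is a plain Python illustration under stated assumptions: `partitions_touched` is a hypothetical helper, and the rows are made-up `(sensor_id, day)` pairs. It compares the same query against two candidate partition keys.

```python
from collections import defaultdict


def partitions_touched(requested_rows, partition_key_of):
    """Group the requested rows by partition key; each group is one partition to read."""
    groups = defaultdict(list)
    for row in requested_rows:
        groups[partition_key_of(row)].append(row)
    return dict(groups)


# Hypothetical query: one week of readings for a single sensor.
wanted = [("sensor-7", f"2024-01-0{day}") for day in range(1, 8)]

# Design A: partition by sensor id. Design B: partition by day.
by_sensor = partitions_touched(wanted, lambda row: row[0])
by_day = partitions_touched(wanted, lambda row: row[1])

print(len(by_sensor), "partition(s) when partitioned by sensor id")
print(len(by_day), "partition(s) when partitioned by day")
```

Partitioning by sensor id serves this query from 1 partition, while partitioning by day spreads it over 7, so the coordinator may have to contact up to seven nodes for the same result.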