Question-1: What is the simplest way to find trends and patterns in numerical data?

Answer: Start by calculating the average value of the data. The average gives you the most representative value of the dataset. For example, the ages of 5 people are given below.

                31           29           37           33           36

The average value is 166/5 = 33.2

Hence, you can say that the people in this dataset have an average age of around 33 years. The average is one of the ways to summarize the data and get a sense of the trend in it.

 

Question-2: What are the different ways to calculate an average value?

Answer: There are 3 ways to find the average value: mean, median and mode.
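
As a rough illustration, all three averages can be computed with Python's standard statistics module. The sample list here is hypothetical (it does not come from the text) and is chosen so that one value repeats, since the mode is only informative when a value repeats.

```python
import statistics

# Hypothetical sample, an assumption made only for illustration
values = [2, 3, 3, 5, 7]

mean_value = statistics.mean(values)      # sum of the values / number of values
median_value = statistics.median(values)  # middle value once the list is sorted
mode_value = statistics.mode(values)      # most frequently occurring value

print(mean_value, median_value, mode_value)
```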

 

Question-3: How do you calculate the mean?

Answer: The mean is one of the most common ways to calculate an average value, as below

                Mean = Sum of all elements / Number of elements

= (31 + 29 + 37 + 33 + 36)/5

= 166/5 = 33.2 (This is the mean value)

Hence, the mean gives us a clue about the typical age of the people in the group.

µ = ∑X/n, where ∑X is the sum of all the values and n is the number of values.
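
The formula can be sketched in a few lines of Python. This is a minimal illustration using the five ages from Question-1; the function name `mean` and the variable names are assumptions chosen for this example.

```python
def mean(values):
    """Return the arithmetic mean: sum of the values divided by their count."""
    if not values:
        raise ValueError("mean of an empty list is undefined")
    return sum(values) / len(values)


ages = [31, 29, 37, 33, 36]  # the five ages from the example
total = sum(ages)            # 166
mu = mean(ages)              # 166 / 5 = 33.2

print(total, mu)
```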

Question-4: Let’s say HadoopExam Inc has 5 employees with salaries as below.

                2000, 2000, 7000, 15000 & 2000

And you calculate the mean salary, which is 28000/5 = 5600. Does this correctly represent the average salary?

Answer: No. The 15,000 salary is far away from the other salaries and pulls the mean upward, so 5600 does not represent the typical salary: three of the five employees earn only 2000. There has to be another way to calculate the average salary.
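
The effect of the 15,000 salary can be checked with a short sketch. The median, one of the three averages named in Question-2, is shown here as one possible alternative; the text at this point does not say which measure to prefer, so treat that choice as an assumption.

```python
import statistics

salaries = [2000, 2000, 7000, 15000, 2000]  # salaries from Question-4

mean_salary = statistics.mean(salaries)      # 28000 / 5 = 5600
median_salary = statistics.median(salaries)  # middle of the sorted list = 2000

# Mean without the single 15,000 salary, to show how far it shifts the result
without_top = [s for s in salaries if s != 15000]
mean_without_top = statistics.mean(without_top)  # 13000 / 4 = 3250

print(mean_salary, median_salary, mean_without_top)
```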

Question-5: What do you mean by outliers in numerical data?

Answer: Outliers are extreme values that do not fit in with the bulk of the data. For example, below are the ages of people in a group.

                20, 35, 36, 39, 40, 38, 39, 65

Here, both 20 & 65 are extreme values that are not near the other values, and they are called outliers. Here the mean value is

                Mean = µ = 312/8 = 39

Extreme values can be any high or low values that stand out from the rest of the data.
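
To see how much these two values matter, the sketch below compares the mean of all eight ages with the mean of the six remaining ages. The outliers are simply the two values named in the answer, not the result of a detection rule, and the variable names are assumptions for illustration.

```python
ages = [20, 35, 36, 39, 40, 38, 39, 65]
outliers = {20, 65}  # the two values the answer identifies as outliers

mean_all = sum(ages) / len(ages)   # 312 / 8 = 39.0

core = [a for a in ages if a not in outliers]
mean_core = sum(core) / len(core)  # 227 / 6, roughly 37.83

print(mean_all, round(mean_core, 2))
```

In this particular dataset the low and the high outlier partly cancel each other, so the mean moves only slightly when they are removed.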