www.HadoopExam.com

HadoopExam Learning Resources

Question 1 : A data scientist is asked to implement an article recommendation feature for an online magazine. The magazine does not want to use client tracking technologies such as cookies or reading history. Therefore, only the style and subject matter of the current article are available for making recommendations. All of the magazine's articles are stored in a database in a format suitable for analytics. Which method should the data scientist try first?
1.  K Means Clustering
2.  Naive Bayesian
3.  Logistic Regression
4.  Association Rules

Correct Answer : 1

Exp : Only the content of the articles is available, with no labels and no user behavior, so an unsupervised method that groups similar articles is the natural first choice. Naive Bayes and logistic regression are classifiers that need labeled training data, and association rules need transaction-style records such as reading histories, which the magazine will not collect. With k-means, articles in the same cluster as the current article can be offered as recommendations.

k-means uses an iterative algorithm that minimizes the sum of distances from each object to its cluster centroid, over all clusters. The algorithm moves objects between clusters until the sum cannot be decreased further. The result is a set of clusters that are as compact and well-separated as possible. Implementations typically let you control the details of the minimization through optional input parameters, including the initial values of the cluster centroids and the maximum number of iterations. Clustering is primarily an exploratory technique for discovering hidden structure in the data, possibly as a prelude to more focused analysis or decision processes. Some specific applications of k-means are image processing, medical, and customer segmentation. Clustering is often used as a lead-in to classification: once the clusters are identified, labels can be applied to each cluster to classify each group based on its characteristics. Marketing and sales groups use k-means to better identify customers who have similar behaviors and spending patterns.
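The iteration described above can be sketched in a few lines of Python. This is only a minimal illustration, not the exam's reference solution: the two-dimensional feature vectors are made-up stand-ins for style and subject-matter scores, the function names (`kmeans`, `recommend`, `squared_distance`) are hypothetical, and squared Euclidean distance is assumed as the distance measure.

```python
from math import fsum


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return fsum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centroids, max_iter=100):
    """Assign points to the nearest centroid and recompute centroids
    until the assignments stop changing or max_iter is reached."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its closest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # no object moved, so the sum cannot decrease further
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[k] = tuple(
                    fsum(dim) / len(members) for dim in zip(*members)
                )
    return labels, centroids


def recommend(current, labels):
    """Indices of the other articles in the current article's cluster."""
    return [i for i, lab in enumerate(labels)
            if lab == labels[current] and i != current]


# Hypothetical (style, subject) scores for five articles.
articles = [
    (0.90, 0.10),
    (0.80, 0.20),
    (0.85, 0.15),
    (0.10, 0.90),
    (0.20, 0.80),
]

labels, centroids = kmeans(articles, initial_centroids=[articles[0], articles[3]])
print(recommend(0, labels))  # prints [1, 2]
```

In a real system the feature vectors would come from the article database (for example, term weights), and the number of clusters and the initial centroids would need to be chosen with some care, since k-means can converge to different results from different starting points.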

Comments   

+1 # Cleared EMC20-007 Data Science Certification - Jitendra Swami 2015-04-29 16:43
Team, thank you very much for the questions. I have cleared my EMC data science certification. Perfect material, with so many questions. It was very good for building my confidence. I will definitely recommend it to my friends.
+1 # RE: Cleared EMC20-007 Data Science Certification - HadoopExam Learning 2015-04-29 16:46
Many congratulations, Jitendra. We hope you will tell your friends about our quality material. Please keep visiting www.HadoopExam.com; we add new materials every day, which will help you learn more.