Big Data, Analytics, and the Data Scientist Role (5%) : 3 Questions

  • Define and describe the characteristics of Big Data
  • Describe the business drivers for Big Data analytics and data science
  • Describe the Data Scientist role and related skills  

Data Analytics Lifecycle (8%) : 5 Questions

  • Describe the data analytics lifecycle purpose and sequence of phases
  • Discovery: Describe details of this phase, including activities and associated roles
  • Data preparation: Describe details of this phase, including activities and associated roles
  • Model planning: Describe details of this phase, including activities and associated roles
  • Model building: Describe details of this phase, including activities and associated roles

Initial Analysis of the Data (15%) : 9 Questions 

  • Explain how basic R commands are used to initially explore and analyze the data
  • Describe and provide examples of the most important statistical measures and effective visualizations of data
  • Describe the theory, process, and analysis of results for hypothesis testing and its use in evaluating a model 

Advanced Analytics - Theory, Application, and Interpretation of Results for Eight Methods (40%) : 24 Questions 

  • K-means clustering
  • Association rules
  • Linear regression
  • Logistic regression
  • Naïve Bayesian classifiers
  • Decision trees
  • Time series analysis
  • Text analytics

Advanced Analytics for Big Data - Technology and Tools (22%) : 14 Questions

  • Describe the technological challenges posed by Big Data
  • Describe the nature and use of MapReduce and Apache Hadoop
  • Describe the Hadoop ecosystem and related product use cases
  • Describe in-database analytics and SQL essentials
  • Describe advanced SQL methods: window functions, ordered aggregates, and MADlib  

Operationalizing an Analytics Project and Data Visualization Techniques (10%) : 6 Questions

  • Describe best practices for communicating findings and operationalizing an analytics project
  • Describe best practices for building project presentations for specific audiences
  • Describe best practices for planning and creating effective data visualizations