INFO70279
Statistics for Data Science
Sheridan
 
 

Land Acknowledgement

Sheridan College resides on land that has been, and still is, the traditional territory of several Indigenous nations, including the Anishinaabe, the Haudenosaunee Confederacy, the Wendat, and the Mississaugas of the Credit First Nation. We recognize this territory is covered by the Dish with One Spoon treaty and the Two Row Wampum treaty, which emphasize the importance of joint stewardship, peace, and respectful relationships.

As an institution of higher learning, Sheridan embraces the critical role that education must play in facilitating real transformational change. We continue our collective efforts to recognize Canada's colonial history and to take steps toward meaningful Truth and Reconciliation.


Section I: Administrative Information
  Total hours: 42.0
Credit Value: 3.0
Credit Value Notes: TBD
Effective: Fall 2021
Prerequisites: N/A
Corequisites: N/A
Equivalents: N/A
Pre/Co/Equiv Notes: N/A

Program(s): Data Analyst
Program Coordinator(s): N/A
Course Leader or Contact: N/A
Version: 20210907_00
Status: Approved (APPR)

Section I Notes: Access to course materials and assignments will be available on Sheridan's Learning and Teaching Environment (SLATE). Students will need reliable access to a computer and the internet.

 
 
Section II: Course Details

Detailed Description
Students explore fundamental statistical concepts including working with different types of data, using sampling to make inferences, employing probability to draw conclusions, and performing hypothesis testing to validate results. Students will implement simple and multivariable linear regression and, more importantly, will distinguish the use of these models in statistics from their use in machine learning. A/B testing will also be introduced in this course. Students work through these concepts using R.

Program Context

 
Program: Data Analyst
Program Coordinator(s): N/A
This course is part of the Data Analyst micro-credential.


Course Critical Performance and Learning Outcomes

  Critical Performance:
By the end of the course, students will have demonstrated the ability to use various statistical techniques to better understand complex data science problems.
 
Learning Outcomes:

To achieve the critical performance, students will have demonstrated the ability to:

  1. Describe and summarize data using descriptive statistical analysis in R.
  2. Perform hypothesis testing and explain the calculation of probability for a given dataset.
  3. Recognize the link between probability distributions and statistical decision making.
  4. Apply simple and multiple linear regression models and interpret their parameters.
  5. Explain the implementation principles and significance of A/B testing in e-commerce.

Evaluation Plan
Students demonstrate their learning in the following ways:

 Evaluation Plan: ONLINE
 Assignment 1: 20.0%
 Assignment 2: 20.0%
 Assignment 3: 20.0%
 Assignment 4: 20.0%
 Assignment 5: 20.0%
 Total: 100.0%

Evaluation Notes and Academic Missed Work Procedure:
TEST AND ASSIGNMENT PROTOCOL
The following protocol applies to every course offered by Continuing and Professional Studies.

  1. Students are responsible for staying abreast of test dates and times, as well as due dates and any special instructions for submitting assignments and projects as supplied to the class by the instructor.
  2. Students must write all tests at the specified date and time. Missed tests, in-class/online activities, assignments and presentations are awarded a mark of zero. The penalty for late submission of written assignments is a loss of 10% per day for up to five business days (excluding Sundays and statutory holidays), after which a grade of zero is assigned. Business days include any day that the college is open for business, whether the student has scheduled classes that day or not. An extension or make-up opportunity may be approved by the instructor at his or her discretion.

Provincial Context
The course meets the following Ministry of Colleges and Universities requirements:


 

Essential Employability Skills
Essential Employability Skills emphasized in the course:

  • Communication Skills - Respond to written, spoken, or visual messages in a manner that ensures effective communication.
  • Critical Thinking & Problem Solving Skills - Use a variety of thinking skills to anticipate and solve problems.
  • Information Management Skills - Analyze, evaluate, and apply relevant information from a variety of sources.
  • Information Management - Locate, select, organize and document information using appropriate technology and information systems.
  • Personal Skills - Manage the use of time and other resources to complete projects.
  • Personal Skills - Take responsibility for one's own actions, decisions, and consequences.

Prior Learning Assessment and Recognition
PLAR Contact (if course is PLAR-eligible) - Office of the Registrar
Students may apply to receive credit by demonstrating achievement of the course learning outcomes through previous relevant work/life experience, service, self-study and training on the job. This course is eligible for challenge through the following method(s):

  • Challenge Exam
    Notes:  Challenge exam is required

 
 
Section III: Topical Outline
Some details of this outline may change as a result of circumstances such as weather cancellations, College and student activities, and class timetabling.
Instruction Mode: Online
Professor: N/A
Resource(s):
 Type                Description
 Required Textbook   Introduction to Probability and Statistics, William Mendenhall; Robert J. Beaver; Barbara M. Beaver, Cengage Learning, 15th Edition, 2020. Digital ISBN: 9780357044308. Print ISBNs: 9781337554428, 1337554421.
 Required Software   RStudio Open-Source

Applicable student group(s): Students in the online class in Continuing and Professional Studies.
Course Details:

Module 1: Descriptive Statistics Methods

  • Identify types of variables and scales of measurement
  • Describe and display categorical data
  • Display and summarize quantitative data
  • Calculate measures of location and dispersion

 

Module 2: Data Analysis Using Software

  • Set up the R environment
  • Explain variables in R
  • Use R code for vectors, matrices, factors, lists and data frames
  • Use R for manipulating datasets
  • Use R for descriptive statistics
  • Use R for creating charts
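
As a minimal sketch of the Module 2 topics above, the following base R snippet builds a small data frame and computes a few descriptive statistics. The data values and the names `students`, `program` and `score` are made up for illustration and are not course material.

```r
# Hypothetical data frame with one factor and one numeric column (made-up values)
students <- data.frame(
  program = factor(c("A", "B", "A", "B", "A", "B", "A", "B")),
  score   = c(4, 8, 6, 5, 3, 9, 7, 6)
)

# Measures of location
mean_score   <- mean(students$score)    # arithmetic mean
median_score <- median(students$score)  # middle value of the sorted scores

# Measures of dispersion
var_score <- var(students$score)  # sample variance (n - 1 denominator)
sd_score  <- sd(students$score)   # sample standard deviation

# Manipulating the dataset: counts and mean score for each program
program_counts  <- table(students$program)
mean_by_program <- tapply(students$score, students$program, mean)

print(mean_by_program)
```

Calling `summary(students)` gives a quick overview of every column, and `barplot(program_counts)` would draw a basic chart of the category counts.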

Evaluation: Assignment 1: 20%

Practice: Lab 1

 

Module 3: Probability Theory and Real-World Applications

  • Explain the role of probability in statistics
  • Explain events and the sample space
  • Calculate probabilities using simple events
  • Calculate probabilities for unions and complements, independence, conditional probability
  • Apply multiplication rule
  • Explain probability distributions
  • Identify discrete random variables and their probability distributions
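
The probability rules listed above can be illustrated with a short base R sketch. The die-roll sample space, the events `A` and `B`, and the helper `prob()` are assumptions chosen for this example only.

```r
# Sample space for one roll of a fair six-sided die (illustrative example)
sample_space <- 1:6
A <- c(2, 4, 6)  # event: the roll is even
B <- c(4, 5, 6)  # event: the roll is greater than 3

# Hypothetical helper: probability of an event when all outcomes are equally likely
prob <- function(event) length(event) / length(sample_space)

p_A <- prob(A)
p_B <- prob(B)

p_A_and_B   <- prob(intersect(A, B))  # intersection of A and B
p_A_or_B    <- prob(union(A, B))      # union of A and B
p_not_A     <- 1 - p_A                # complement of A
p_A_given_B <- p_A_and_B / p_B        # conditional probability P(A | B)

# A and B are independent only if P(A and B) equals P(A) * P(B)
independent <- isTRUE(all.equal(p_A_and_B, p_A * p_B))

# Discrete random variable: P(exactly 2 heads in 5 fair coin tosses)
p_two_heads <- dbinom(2, size = 5, prob = 0.5)
```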

 

Module 4: Probability and Normal Distribution

  • Describe probability distributions for continuous random variables
  • Identify the properties of the normal curve
  • Describe the normal probability distribution
  • Calculate the tabulated areas of the normal probability distribution
  • Define the standard normal random variable
  • Evaluate probabilities for a general normal random variable
  • Use R code for normal distributions
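
A minimal sketch of how the normal distribution topics above might look in base R. The mean of 100 and standard deviation of 15 are assumed values chosen only for illustration.

```r
# Assumed parameters of a general normal random variable X (illustrative)
mu    <- 100
sigma <- 15

# P(X <= 115), computed directly from the general normal distribution
p_direct <- pnorm(115, mean = mu, sd = sigma)

# The same probability via the standard normal variable z = (x - mu) / sigma
z <- (115 - mu) / sigma
p_standard <- pnorm(z)

# P(85 <= X <= 115): area within one standard deviation of the mean
p_within <- pnorm(115, mu, sigma) - pnorm(85, mu, sigma)

# Value of X below which 95% of the distribution lies
x_95 <- qnorm(0.95, mean = mu, sd = sigma)
```

Here `pnorm()` plays the role of the tabulated areas in the textbook, and `qnorm()` performs the reverse lookup from an area to a value.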

 

Module 5: Sampling Distribution Techniques

  • Explain the statistics of sampling distributions 
  • Describe the central limit theorem 
  • Describe the sampling distribution of the sample mean
  • Describe interval estimation
  • Calculate large-sample confidence interval for a population mean
  • Interpret the confidence interval 
  • Calculate one-sided confidence bound 
  • Calculate sample size 
  • Use R code for calculating confidence intervals
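
The interval estimation steps above can be sketched in base R as follows. The simulated sample, the 95% confidence level, the assumed standard deviation `sigma_guess` and the target margin of error `E` are all illustrative assumptions.

```r
# Simulated large sample (n = 50), used only to make the sketch runnable
set.seed(42)
x <- rnorm(50, mean = 20, sd = 4)

n    <- length(x)
xbar <- mean(x)  # point estimate of the population mean
s    <- sd(x)    # sample standard deviation

# Large-sample confidence interval: xbar +/- z * s / sqrt(n)
conf_level <- 0.95
z_crit <- qnorm(1 - (1 - conf_level) / 2)  # two-sided critical value
margin <- z_crit * s / sqrt(n)
ci <- c(lower = xbar - margin, upper = xbar + margin)

# One-sided upper confidence bound uses the one-sided critical value
upper_bound <- xbar + qnorm(conf_level) * s / sqrt(n)

# Sample size needed for margin of error E, given an assumed sigma
sigma_guess <- 4
E <- 1
n_required <- ceiling((z_crit * sigma_guess / E)^2)

print(ci)
```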

 

Module 6: Hypothesis Testing

  • Formulate a hypothesis and apply testing of hypotheses on population parameters for large sample size
  • Select appropriate statistical test of hypothesis (z-test)
  • Evaluate one-tailed and two-tailed large-sample tests about a population mean
  • Explain critical value approach and p-value approach for hypothesis testing
  • Assess two types of errors
  • Evaluate the difference between two means  
  • Apply testing of hypotheses on population parameters for small sample
  • Select appropriate statistical test of hypothesis (t-test)
  • Evaluate a sample test about a population mean for sample size less than 30
  • Calculate the p-value using the t distribution
  • Evaluate a small sample test of hypothesis for the difference between two population means
  • Use R for hypothesis testing
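
As a hedged illustration of the z-test and t-test topics above, the sketch below works a large-sample test from summary statistics and a small-sample test with `t.test()`. All numbers, including the hypothesized means and the 0.05 significance level, are made up for the example.

```r
# Large-sample z test from assumed summary statistics
n    <- 64  # sample size
xbar <- 52  # sample mean
s    <- 8   # sample standard deviation
mu0  <- 50  # hypothesized population mean under H0

z_stat <- (xbar - mu0) / (s / sqrt(n))

# p-value approach: two-tailed and upper-tailed alternatives
p_two_tailed   <- 2 * pnorm(-abs(z_stat))
p_upper_tailed <- pnorm(z_stat, lower.tail = FALSE)

# Critical value approach for the two-tailed test at alpha = 0.05
alpha <- 0.05
reject_h0 <- abs(z_stat) > qnorm(1 - alpha / 2)

# Small-sample t test (n < 30) on made-up measurements, H0: mu = 5
small_sample <- c(5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.5)
t_result <- t.test(small_sample, mu = 5, alternative = "two.sided")

# The same statistic and p-value computed by hand from the t distribution
t_by_hand <- (mean(small_sample) - 5) /
  (sd(small_sample) / sqrt(length(small_sample)))
p_by_hand <- 2 * pt(-abs(t_by_hand), df = length(small_sample) - 1)
```

For the difference between two population means, `t.test()` also accepts two samples, for example `t.test(x, y)`.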

Evaluation: Assignment 2: 20%

Practice: Lab 2

 

Module 7: Correlation and Regression

  • Apply descriptive statistical methods to data
  • Use statistical analysis software to explore and analyze data
  • Use probability theory to evaluate the probability of real-world events
  • Evaluate the probability of real-world events involving the normal distribution
  • Apply sampling distribution tools and estimation techniques
  • Apply a hypothesis test to data analysis problems
  • Interpret correlation coefficient and regression line equations 
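
To make the last topic above concrete, here is a minimal base R sketch that computes a correlation coefficient and fits a simple regression line. The vectors `x` and `y` are made-up data.

```r
# Made-up paired observations
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Pearson correlation coefficient: strength and direction of the linear relationship
r <- cor(x, y)

# Simple linear regression of y on x
fit <- lm(y ~ x)
intercept <- unname(coef(fit)["(Intercept)"])  # predicted y when x = 0
slope     <- unname(coef(fit)["x"])            # change in predicted y per unit of x

# Predicted value of y at a new x using the fitted line
y_hat <- predict(fit, newdata = data.frame(x = 6))
```

In simple regression, the squared correlation `r^2` equals the R-squared value reported by `summary(fit)`.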

Evaluation: Assignment 3: 20%

Practice: Lab 3

 

Module 8: Statistical Experiments: A/B Testing

  • Examine the importance of A/B testing in e-commerce and the principles of its implementation
  • Understand the challenges of multivariate testing
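
One common way to analyze a simple A/B test is a two-sample test of proportions. The sketch below uses base R's `prop.test()`. The visitor and conversion counts are invented, and the outline does not say this is the method the course uses.

```r
# Made-up results for two versions of a web page
conversions <- c(A = 120, B = 150)    # visitors who converted
visitors    <- c(A = 2000, B = 2000)  # visitors shown each version

# Observed conversion rate for each version
rates <- conversions / visitors

# Two-sample test of equal proportions, without continuity correction
ab_test <- prop.test(x = conversions, n = visitors, correct = FALSE)
p_value <- ab_test$p.value

# The equivalent pooled two-proportion z statistic, computed by hand
p_pooled <- sum(conversions) / sum(visitors)
se <- sqrt(p_pooled * (1 - p_pooled) * sum(1 / visitors))
z_manual <- unname((rates["B"] - rates["A"]) / se)

print(rates)
print(p_value)
```

The p-value is compared with a significance level chosen before the experiment runs. Testing many page variants at once multiplies the number of comparisons, which is one source of the multivariate testing challenges named above.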

Evaluation: Assignment 4: 20%

 

Module 9: Multiple Linear Regression Models

  • Understand the subtle differences between using multiple regression models in statistics versus using them in machine learning 
  • Articulate the assumptions of multiple linear regression
  • Interpret the parameters of a multiple regression model
  • Evaluate the performance of regression models
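
A minimal sketch of fitting and evaluating a multiple linear regression in base R. It uses R's built-in `mtcars` dataset, which is an assumption made for illustration and not a dataset named in this outline.

```r
# Model fuel economy (mpg) from weight (wt) and horsepower (hp)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Each slope is the expected change in mpg for a one-unit increase in that
# predictor, holding the other predictor constant
coefs <- coef(fit)

# Statistical view: inference on the parameters and overall fit
model_summary <- summary(fit)
r_squared     <- model_summary$r.squared
adj_r_squared <- model_summary$adj.r.squared

# Prediction-oriented view: root mean squared error of the fitted values
rmse <- sqrt(mean(residuals(fit)^2))

print(coefs)
```

This RMSE is computed on the same data used to fit the model. A machine learning workflow would usually measure it on held-out data, which is one of the differences in emphasis the module describes. Residual plots from `plot(fit)` are a common way to check the model assumptions.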

Evaluation: Assignment 5: 20%

Practice: Lab 4

 

 

 



Sheridan Policies

It is recommended that students read the following policies in relation to course outlines:

  • Academic Integrity
  • Copyright
  • Intellectual Property
  • Respectful Behaviour
  • Accessible Learning
All Sheridan policies can be viewed on the Sheridan policy website.

Appropriate use of generative Artificial Intelligence tools: In alignment with Sheridan's Academic Integrity Policy, students should consult with their professors and/or refer to evaluation instructions regarding the appropriate use, or prohibition, of generative Artificial Intelligence (AI) tools for coursework. Turnitin AI detection software may be used by faculty members to screen assignment submissions or exams for unauthorized use of artificial intelligence.

Course Outline Changes: The information contained in this Course Outline including but not limited to faculty and program information and course description is subject to change without notice. Nothing in this Course Outline should be viewed as a representation, offer and/or warranty. Students are responsible for reading the Important Notice and Disclaimer which applies to Programs and Courses.



Copyright © Sheridan College. All rights reserved.