INFO70280
Data Cleansing
Sheridan
 
 
Section I: Administrative Information
  Total hours: 42.0
Credit Value: 3.0
Credit Value Notes: TBD
Effective: Spring/Summer 2021
Prerequisites: INFO70279
Corequisites: N/A
Equivalents: N/A
Pre/Co/Equiv Notes: N/A

Program(s): Data Analyst
Program Coordinator(s): N/A
Course Leader or Contact: N/A
Version: 20210517_01
Status: Approved (APPR)

Section I Notes: Access to course materials and assignments will be available on Sheridan's Learning and Teaching Environment (SLATE). Students will need reliable access to a computer and the internet.

 
 
Section II: Course Details

Detailed Description
Students explore the steps of data analysis and transformation, data cleansing, and data modeling needed to build Business Intelligence (BI) dashboards in Power BI. Students use the recommended tools, Microsoft SQL Server Management Studio and Power BI, as well as the open-source language R, to build and track business KPIs and manage data tasks. Using the sample dataset, students develop their own BI dashboard, which could be used as a potential resource in a job interview.

Program Context

 
Data Analyst Program Coordinator(s): N/A
This course is part of the Data Analyst micro-credential


Course Critical Performance and Learning Outcomes

  Critical Performance:
By the end of this course, students will have the ability to clean, model, and analyze relational sample data using SQL and R, visualize the results in Power BI dashboards, and create critical business KPIs and measures using DAX scripts.
 
Learning Outcomes:

To achieve the critical performance, students will have demonstrated the ability to:

  1. Explain the function of different data processing systems such as OLTP, ROLAP, MOLAP and Hybrids.
  2. Construct SQL queries for both basic and advanced levels of Merge/Append, Aggregations and Analytic queries.
  3. Develop stored procedures using SQL Server.
  4. Transform ("cleanse") raw data using SQL and R.
  5. Group data and perform data normalization.
  6. Build a standard data model using the star and snowflake schemas.
  7. Create BI dashboards and performance measures in Power BI using DAX scripts.

Evaluation Plan
Students demonstrate their learning in the following ways:

 Evaluation Plan: ONLINE
 Quizzes (2 x 5%)     10.0%
 Assignment 1         35.0%
 Assignment 2         35.0%
 Assignment 3         20.0%
 Total               100.0%

Evaluation Notes and Academic Missed Work Procedure:
TEST AND ASSIGNMENT PROTOCOL
The following protocol applies to every course offered by Continuing and Professional Studies.
  1. Students are responsible for staying abreast of test dates and times, as well as due dates and any special instructions for submitting assignments and projects as supplied to the class by the instructor.
  2. Students must write all tests at the specified date and time. Missed tests, in-class/online activities, assignments and presentations are awarded a mark of zero. The penalty for late submission of written assignments is a loss of 10% per day for up to five business days (excluding Sundays and statutory holidays), after which, a grade of zero is assigned. Business days include any day that the college is open for business, whether the student has scheduled classes that day or not. An extension or make-up opportunity may be approved by the instructor at his or her discretion.
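
As an illustration only, the late-submission rule above can be sketched in Python. The function name `late_adjusted_mark` is hypothetical, and the sketch assumes that "10% per day" means 10 percentage points deducted from a mark out of 100 for each late business day; the instructor's reading of the protocol governs. Counting business days (excluding Sundays and statutory holidays) is left to the caller.

```python
def late_adjusted_mark(raw_mark: float, business_days_late: int) -> float:
    """Return a mark out of 100 after applying the late penalty.

    Assumption (not stated in the outline): the penalty is 10 percentage
    points per late business day, not 10% of the earned mark.
    """
    if business_days_late < 0:
        raise ValueError("business_days_late must be zero or positive")
    # More than five business days late: a grade of zero is assigned.
    if business_days_late > 5:
        return 0.0
    penalty = 10.0 * business_days_late
    # A mark cannot drop below zero.
    return max(0.0, raw_mark - penalty)


if __name__ == "__main__":
    # An assignment marked 80/100 and submitted two business days late.
    print(late_adjusted_mark(80.0, 2))
```

Under this assumption, an assignment marked 80 and submitted two business days late would receive 60, and the same assignment submitted six business days late would receive zero.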

Provincial Context
The course meets the following Ministry of Colleges and Universities requirements:


 

Essential Employability Skills
Essential Employability Skills emphasized in the course:

  • Communication Skills - Respond to written, spoken, or visual messages in a manner that ensures effective communication.
  • Critical Thinking & Problem Solving Skills - Use a variety of thinking skills to anticipate and solve problems.
  • Information Management Skills - Analyze, evaluate, and apply relevant information from a variety of sources.
  • Information Management Skills - Locate, select, organize and document information using appropriate technology and information systems.
  • Personal Skills - Manage the use of time and other resources to complete projects.
  • Personal Skills - Take responsibility for one's own actions, decisions, and consequences.

Prior Learning Assessment and Recognition
PLAR Contact (if course is PLAR-eligible) - Office of the Registrar

  • Not Eligible for PLAR

 
 
Section III: Topical Outline
Some details of this outline may change as a result of circumstances such as weather cancellations, College and student activities, and class timetabling.
Instruction Mode: Online
Professor: N/A
Resource(s):
            Type    Description
 Optional   Other   Recommended Reading: The Definitive Guide to DAX: Business intelligence with Microsoft Excel, SQL Server Analysis Services, and Power BI

Applicable student group(s): Students in the online class in Continuing and Professional Studies.
Course Details:

Module 1: Introduction to Relational Databases 

  • OLTP vs. OLAP and their variations
  • Relational engines and their frameworks 

               

Module 2: SQL Scripting: Tables and Relational Databases 

  • SQL Scripting to create tables
  • Populating and updating tables
  • Changing the structure of a table
     

Module 3: SQL Scripting: JOINS, Stored Procedures and Analytics

  • Types of Table JOINS and UNIONS
  • Building Stored Procedures
  • SQL Analytics – Rank, Row Number, Lead, Lag, First Value and Last Value

(Quiz #1: 5%)

(Assignment #1: 35%)

 

Module 4: Entity Relationship Diagram (ERD)

  • Normalization of Data
  • Building entity relationships using key constraints (PK/FK)
  • Introduction to star and snowflake schemas 
  • Different Cardinalities and Dynamic Cardinalities

 

Module 5: Business Intelligence Dashboard / Data Visualisation

  • Building a Dashboard using Power BI Desktop
  • Introduction to DAX (Data Analysis Expressions)
  • Reading and transforming data from a local server and from the web

 

Module 6: Building KPIs and Measures

  • Building DAX scripts
  • Performance analysis of scripts
  • Most common functions in DAX

(Assignment #2: 35%)     

 

Module 7: Data Cleansing using R 

  • Types of data quality issues/checklists
  • Methods to handle data quality problems
  • Substitution and Imputation in R

(Assignment #3: 20%)

(Quiz #2: 5%)

 



Sheridan Policies

All Sheridan policies can be viewed on the Sheridan policy website.

Academic Integrity: The principle of academic integrity requires that all work submitted for evaluation and course credit be the original, unassisted work of the student. Cheating or plagiarism including borrowing, copying, purchasing or collaborating on work, except for group projects arranged and approved by the professor, or otherwise submitting work that is not the student's own, violates this principle and will not be tolerated. Students who have any questions regarding whether or not specific circumstances involve a breach of academic integrity are advised to review the Academic Integrity Policy and procedure and/or discuss them with the professor.

Copyright: A majority of the course lectures and materials provided in class and posted in SLATE are protected by copyright. Use of these materials must comply with the Acceptable Use Policy, Use of Copyright Protected Work Policy and Student Code of Conduct. Students may use, copy and share these materials for learning and/or research purposes provided that the use complies with fair dealing or an exception in the Copyright Act. Permission from the rights holder would be necessary otherwise. Please note that it is prohibited to reproduce and/or post a work that is not your own on third-party commercial websites including but not limited to Course Hero or OneNote. It is also prohibited to reproduce and/or post a work that is not your own or your own work with the intent to assist others in cheating on third-party commercial websites including but not limited to Course Hero or OneNote.

Intellectual Property: Sheridan's Intellectual Property Policy generally applies such that students own their own work. Please be advised that students working with external research and/or industry collaborators may be asked to sign agreements that waive or modify their IP rights. Please refer to Sheridan's IP Policy and Procedure.

Respectful Behaviour: Sheridan is committed to providing a learning environment that supports academic achievement by respecting the dignity, self-esteem and fair treatment of every person engaged in the learning process. Behaviour which is inconsistent with this principle will not be tolerated. Details of Sheridan's policy on Harassment and Discrimination, Academic Integrity and other academic policies are available on the Sheridan policy website.

Accessible Learning: Accessible Learning coordinates academic accommodations for students with disabilities. For more information or to register, please see the Accessible Learning website. (Statement added September 2016)

Course Outline Changes: The information contained in this Course Outline including but not limited to faculty and program information and course description is subject to change without notice. Any changes to course curriculum and/or assessment shall adhere to approved Sheridan protocol. Nothing in this Course Outline should be viewed as a representation, offer and/or warranty. Students are responsible for reading the Important Notice and Disclaimer which applies to Programs and Courses.



Copyright © Sheridan College. All rights reserved.