Course Description
Extremely Hands-On… Incredibly Practical… Unbelievably Real!
This is not one of those fluffy classes where everything works out just the way it should and your training is smooth sailing. This course throws you into the deep end.
In this course you WILL experience firsthand all of the PAIN a Data Scientist goes through on a daily basis. Corrupt data, anomalies, irregularities – you name it!
This course will give you a full overview of the Data Science journey. Upon completing this course you will know:
How to clean and prepare your data for analysis
How to perform basic visualisation of your data
How to model your data
How to curve-fit your data
And finally, how to present your findings and wow the audience
This course will give you so many practical exercises that the real world will seem like a piece of cake when you graduate from this class. This course has homework exercises that are so thought-provoking and challenging that you will want to cry… But you won’t give up! You will crush it. In this course you will develop a good understanding of the following tools:
SQL
SSIS
Tableau
Gretl
This course has pre-planned pathways. Using these pathways you can navigate the course and combine sections into YOUR OWN journey that will get you the skills that YOU need.
Or you can do the whole course and set yourself up for an incredible career in Data Science.
The choice is yours. Join the class and start learning today!
See you inside,
Sincerely,
Kirill Eremenko
What are the requirements?
Only a passion for success
All software used in this course is available either for free or as a demo version
What am I going to get from this course?
Successfully perform all steps in a complex Data Science project
Create Basic Tableau Visualisations
Perform Data Mining in Tableau
Understand how to apply the Chi-Squared statistical test
Apply Ordinary Least Squares method to Create Linear Regressions
Assess R-Squared for all types of models
Assess the Adjusted R-Squared for all types of models
Create a Simple Linear Regression (SLR)
Create a Multiple Linear Regression (MLR)
Create Dummy Variables
Interpret coefficients of an MLR
Read statistical software output for created models
Use Backward Elimination, Forward Selection, and Bidirectional Elimination methods to create statistical models
Create a Logistic Regression
Intuitively understand a Logistic Regression
Operate with False Positives and False Negatives and know the difference
Read a Confusion Matrix
Create a Robust Geodemographic Segmentation Model
Transform independent variables for modelling purposes
Derive new independent variables for modelling purposes
Check for multicollinearity using VIF and the correlation matrix
Understand the intuition of multicollinearity
Apply the Cumulative Accuracy Profile (CAP) to assess models
Build the CAP curve in Excel
Use Training and Test data to build robust models
Derive insights from the CAP curve
Understand the Odds Ratio
Derive business insights from the coefficients of a logistic regression
Understand what model deterioration actually looks like
Apply three levels of model maintenance to prevent model deterioration
Install and navigate SQL Server
Install and navigate Microsoft Visual Studio Shell
Clean data and look for anomalies
Use SQL Server Integration Services (SSIS) to upload data into a database
Create Conditional Splits in SSIS
Deal with Text Qualifier errors in RAW data
Create Scripts in SQL
Apply SQL to Data Science projects
Create stored procedures in SQL
Present Data Science projects to stakeholders
What is the target audience?
Anybody with an interest in Data Science
Anybody who wants to improve their data mining skills
Anybody who wants to improve their statistical modelling skills
Anybody who wants to improve their data preparation skills
Anybody who wants to improve their Data Science presentation skills
Section 1: Get Excited  

Lecture 1 
Welcome to Data Science A-Z™

04:41  
Section 2: What is Data Science?  
Lecture 2 
Intro (what you will learn in this section)

00:44  
Lecture 3 
Profession of the future

06:58  
Lecture 4 
Areas of Data Science

05:58  
Lecture 5 
IMPORTANT: Course Pathways

05:52  
Section 3: ————————— Part 1: Visualisation —————————  
Lecture 6 
Welcome to Part 1

01:57  
Section 4: Introduction to Tableau  
Lecture 7 
Intro (what you will learn in this section)

00:28  
Lecture 8 
Installing Tableau Desktop and Tableau Public (FREE)

06:08  
Lecture 9 
Challenge description + view data in file

02:32  
Lecture 10 
Connecting Tableau to a Data file – CSV file

05:17  
Lecture 11 
Navigating Tableau – Measures and Dimensions

08:42  
Lecture 12 
Creating a calculated field

06:14  
Lecture 13 
Adding colours

07:37  
Lecture 14 
Adding labels and formatting

11:00  
Lecture 15 
Exporting your worksheet

07:40  
Lecture 16 
Section Recap

03:34  
Quiz 1 
Tableau Basics

5 questions  
Section 5: How to use Tableau for Data Mining  
Lecture 17 
Intro (what you will learn in this section)

00:44  
Lecture 18 
Get the Dataset + Project Overview

07:12  
Lecture 19 
Connecting Tableau to an Excel File

03:56  
Lecture 20 
How to visualise an ad-hoc A/B test in Tableau

06:29  
Lecture 21 
Working with Aliases

04:05  
Lecture 22 
Adding a Reference Line

04:53  
Lecture 23 
Looking for anomalies

08:35  
Lecture 24 
Handy trick to validate your approach / data

09:13  
Lecture 25 
Section Recap

05:04  
Section 6: Advanced Data Mining With Tableau  
Lecture 26 
Intro (what you will learn in this section)

00:44  
Lecture 27 
Creating bins & Visualizing distributions

09:55  
Lecture 28 
Creating a classification test for a numeric variable

04:25  
Lecture 29 
Combining two charts and working with them in Tableau

08:31  
Lecture 30 
Validating Tableau Data Mining with a Chi-Squared test

10:29  
Lecture 31 
Chi-Squared test when there are more than 2 categories

08:15  
Lecture 32 
Visualising Balance and Estimated Salary distribution

11:04  
Lecture 33 
Bonus: Chi-Squared Test (Stats Tutorial)

19:12  
Lecture 34 
Bonus: Chi-Squared Test Part 2 (Stats Tutorial)

09:10  
Lecture 35 
Section Recap

05:44  
Lecture 36 
Part Completed

01:38  
Section 7: ————————— Part 2: Modelling —————————  
Lecture 37 
Welcome to Part 2

03:54  
Section 8: Stats Refresher  
Lecture 38 
Intro (what you will learn in this section)

00:29  
Lecture 39 
Types of variables: Categorical vs Numeric

05:26  
Lecture 40 
Types of regressions

08:09  
Lecture 41 
Ordinary Least Squares

03:11  
Lecture 42 
R-squared

05:11  
Lecture 43 
Adjusted R-squared

09:56  
Section 9: Simple Linear Regression  
Lecture 44 
Intro (what you will learn in this section)

00:37  
Lecture 45 
Introduction to Gretl

02:34  
Lecture 46 
Get the dataset

04:03  
Lecture 47 
Import data and run descriptive statistics

04:25  
Lecture 48 
Reading Linear Regression Output

06:48  
Lecture 49 
Plotting and analysing the graph

04:22  
Section 10: Multiple Linear Regression  
Lecture 50 
Intro (what you will learn in this section)

01:15  
Lecture 51 
Caveat: assumptions of a linear regression

01:47  
Lecture 52 
Get the dataset

04:12  
Lecture 53 
Dummy Variables

08:05  
Lecture 54 
Dummy Variable Trap

02:10  
Lecture 55 
Ways to build a model: BACKWARD, FORWARD, STEPWISE

15:41  
Lecture 56 
Backward Elimination – Practice time

16:08  
Lecture 57 
Using Adjusted R-squared to create Robust models

10:17  
Lecture 58 
Interpreting coefficients of MLR

12:47  
Lecture 59 
Section Recap

04:15  
Section 11: Logistic Regression  
Lecture 60 
Intro (what you will learn in this section)

01:34  
Lecture 61 
Get the dataset

04:13  
Lecture 62 
Binary outcome: Yes/No-Type Business Problems

09:09  
Lecture 63 
Logistic regression intuition

17:03  
Lecture 64 
Your first logistic regression

08:04  
Lecture 65 
False Positives and False Negatives

08:01  
Lecture 66 
Confusion Matrix

04:57  
Lecture 67 
Interpreting coefficients of a logistic regression

10:03  
Section 12: Building a robust geodemographic segmentation model  
Lecture 68 
Intro (what you will learn in this section)

01:01  
Lecture 69 
Get the dataset

07:32  
Lecture 70 
What is geodemographic segmentation?

05:05  
Lecture 71 
Let’s build the model – first iteration

08:26  
Lecture 72 
Let’s build the model – backward elimination: STEP-BY-STEP

11:11  
Lecture 73 
Transforming independent variables

10:09  
Lecture 74 
Creating derived variables

06:09  
Lecture 75 
Checking for multicollinearity using VIF

08:11  
Lecture 76 
Correlation Matrix and Multicollinearity Intuition

08:20  
Lecture 77 
Model is Ready and Section Recap

06:27  
Section 13: Assessing your model  
Lecture 78 
Intro (what you will learn in this section)

00:37  
Lecture 79 
Accuracy paradox

02:11  
Lecture 80 
Cumulative Accuracy Profile (CAP)

11:16  
Lecture 81 
How to build a CAP curve in Excel

14:47  
Lecture 82 
Assessing your model using the CAP curve

07:11  
Lecture 83 
Get my CAP curve template

06:20  
Lecture 84 
How to use test data to prevent overfitting your model

03:34  
Lecture 85 
Applying the model to test data

08:09  
Lecture 86 
Comparing training performance and test performance

11:33  
Lecture 87 
Section Recap

03:33  
Section 14: Drawing insights from your model  
Lecture 88 
Intro (what you will learn in this section)

00:34  
Lecture 89 
Power insights from your CAP

13:52  
Lecture 90 
Coefficients of a Logistic Regression – Plan of Attack (advanced topic)

03:47  
Lecture 91 
Odds ratio (advanced topic)

08:29  
Lecture 92 
Odds Ratio vs Coefficients in a Logistic Regression (advanced topic)

07:08  
Lecture 93 
Deriving insights from your coefficients (advanced topic)

13:15  
Lecture 94 
Section Recap

03:26  
Section 15: Model maintenance  
Lecture 95 
Intro (what you will learn in this section)

00:37  
Lecture 96 
What does model deterioration look like?

04:36  
Lecture 97 
Why do models deteriorate?

15:26  
Lecture 98 
Three levels of maintenance for deployed models

08:21  
Lecture 99 
Section Recap

01:38  
Section 16: ————————— Part 3: Data Preparation —————————  
Lecture 100 
Welcome to Part 3

02:24  
Section 17: Business Intelligence (BI) Tools  
Lecture 101 
Intro (what you will learn in this section)

00:23  
Lecture 102 
Working with Data

01:15  
Lecture 103 
What is a Data Warehouse? What is a Database?

03:28  
Lecture 104 
Setting up Microsoft SQL Server 2014 for practice

08:05  
Lecture 105 
Important: Practice Database

09:44  
Lecture 106 
ETL for Data Science – what is Extract Transform Load (ETL)?

02:01  
Lecture 107 
Microsoft BI Tools: What is SSDT-BI and what are SSIS/SSAS/SSRS?

04:04  
Lecture 108 
Installing SSDT with MSVS Shell

04:24  
Section 18: ETL Phase 1: Data Wrangling before the Load  
Lecture 109 
Intro (what you will learn in this section)

00:48  
Lecture 110 
Preparing your folder structure for your Data Science project

02:20  
Lecture 111 
Download the dataset for this section

01:27  
Lecture 112 
Two things you HAVE to do before the load

04:56  
Lecture 113 
Notepad++

01:00  
Lecture 114 
EditPad Lite

01:11  
Section 19: ETL Phase 2: Step-by-step guide to uploading data using SSIS
Lecture 115 
Intro (what you will learn in this section)

00:50  
Lecture 116 
Starting and navigating an SSIS Project

01:46  
Lecture 117 
Creating a flat file source task and OLE DB destination

01:53  
Lecture 118 
Setting up your flat file source connection

06:08  
Lecture 119 
Setting up your database connection and creating a RAW table

07:43  
Lecture 120 
Run the Upload & Disable

02:39  
Lecture 121 
Due Diligence: Upload Quality Assurance

02:02  
Section 20: Handling errors during ETL (Phases 1 & 2)  
Lecture 122 
Intro (what you will learn in this section)

00:50  
Lecture 123 
Download the dataset for this section

00:46  
Lecture 124 
How Excel can mess up your data

03:46  
Lecture 125 
Bulletproof Blueprint for Data Wrangling before the Load

07:13  
Lecture 126 
SSIS Error: Text qualifier not specified

07:15  
Lecture 127 
What do you do when your source file is corrupt? (Part 1)

18:01  
Lecture 128 
What do you do when your source file is corrupt? (Part 2)

06:09  
Lecture 129 
SSIS Error: Data truncation

15:56  
Lecture 130 
Handy trick for finding anomalies in SQL

03:45  
Lecture 131 
Automating Error Handling in SSIS: Conditional Split

08:20  
Lecture 132 
Automating Error Handling in SSIS: Conditional Split (Level 2)

09:03  
Lecture 133 
How to analyze the error files

16:40  
Lecture 134 
Due Diligence: the one thing you HAVE to do every time

04:57  
Lecture 135 
Types of Errors in SSIS

04:00  
Lecture 136 
Summary

19:06  
Lecture 137 
Homework

03:39  
Section 21: SQL Programming for Data Science  
Lecture 138 
Intro (what you will learn in this section)

00:31  
Lecture 139 
Download the dataset for this section

00:38  
Lecture 140 
Getting To Know MS SQL Management Studio

02:14  
Lecture 141 
Shortcut to upload the data

04:20  
Lecture 142 
SELECT * Statement

05:52  
Lecture 143 
Using the WHERE clause to filter data

05:50  
Lecture 144 
How to use Wildcards / Regular Expressions in SQL (% and _)

04:38  
Lecture 145 
Comments in SQL

02:43  
Lecture 146 
Order By

05:49  
Lecture 147 
Data Types in SQL

07:54  
Lecture 148 
Implicit Data Conversion in SQL

03:35  
Lecture 149 
Using Cast() vs Convert()

03:51  
Lecture 150 
Working with NULLs

05:03  
Lecture 151 
Understanding how LEFT, RIGHT, INNER, and OUTER joins work

06:18  
Lecture 152 
Joins with duplicate values

02:32  
Lecture 153 
Joining on multiple fields

05:21  
Lecture 154 
Practicing Joins

05:00  
Section 22: ETL Phase 3: Data Wrangling after the load  
Lecture 155 
Intro (what you will learn in this section)

00:57  
Lecture 156 
RAW, WRK, DRV tables

05:54  
Lecture 157 
Download the dataset for this section

01:32  
Lecture 158 
Create your first Stored Proc in SQL

03:30  
Lecture 159 
Executing Stored Procedures

02:49  
Lecture 160 
Modifying Stored Procedures

08:25  
Lecture 161 
Create table

09:30  
Lecture 162 
Insert INTO

05:42  
Lecture 163 
Check if table exists + drop table + Truncate

05:59  
Lecture 164 
Intermediate Recap – Procs

04:16  
Lecture 165 
Create the proc for the second file

11:36  
Lecture 166 
Adding leading zeros

07:29  
Lecture 167 
Converting data on the fly

10:21  
Lecture 168 
How to create a proc template

07:52  
Lecture 169 
Archiving Procs

04:38  
Lecture 170 
What you can do with these tables going forward [drv files etc.]

13:50  
Section 23: Handling errors during ETL (Phase 3)  
Lecture 171 
Intro (what you will learn in this section)

00:53  
Lecture 172 
Download the dataset for this section

00:46  
Lecture 173 
Upload the data to RAW table

11:02  
Lecture 174 
Create Stored Proc

05:09  
Lecture 175 
How to deal with errors using the isnumeric() function

07:45  
Lecture 176 
How to deal with errors using the len() function

07:36  
Lecture 177 
How to deal with errors using the isdate() function

07:40  
Lecture 178 
Additional Quality Assurance check: Balance

03:51  
Lecture 179 
Additional Quality Assurance check: ZipCode

03:17  
Lecture 180 
Additional Quality Assurance check: Birthday

04:08  
Lecture 181 
Part Completed

09:52  
Lecture 182 
ETL Error Handling “Vehicle Service” Project

07:45  
Section 24: ————————— Part 4: Communication —————————  
Lecture 183 
Welcome to Part 4

01:31  
Section 25: Working with people  
Lecture 184 
Intro (what you will learn in this section)

00:44  
Lecture 185 
Cross-departmental Work

04:13  
Lecture 186 
Come to me with a Business Problem

02:10  
Lecture 187 
Setting expectations and pre-project communication

03:30  
Lecture 188 
Go and sit with them

05:20  
Lecture 189 
The art of saying “No”

05:24  
Lecture 190 
Sometimes you have to go to the top

02:37  
Lecture 191 
Building a data culture

05:07  
Section 26: Presenting for Data Scientists  
Lecture 192 
Intro (what you will learn in this section)

01:42  
Lecture 193 
Case study

02:00  
Lecture 194 
Analysing the intro

03:33  
Lecture 195 
Intro dissection – recap

09:26  
Lecture 196 
REAL Data Science Presentation Walkthrough – Make Your Audience Say “WOW”

16:29  
Lecture 197 
My brainstorming method

03:03  
Lecture 198 
How to present to executives

05:27  
Lecture 199 
The truth is not always pretty

02:45  
Lecture 200 
Passion and the Wow-factor

01:59  
Lecture 201 
Bonus: my full presentation, LIVE 2015

16:20  
Lecture 202 
Bonus: links to other examples of good storytelling

Article  
Section 27: Homework Solutions  
Lecture 203 
Advanced Data Mining with Tableau: Visualising Credit Score & Tenure

05:44  
Lecture 204 
Advanced Data Mining with Tableau: ChiSquared Test for Country

04:18  
Lecture 205 
ETL Error Handling (Phases 1 and 2)

19:51  
Lecture 206 
ETL Error Handling “Vehicle Service” Project (Part 1 of 3)

19:09  
Lecture 207 
ETL Error Handling “Vehicle Service” Project (Part 2 of 3)

10:41  
Lecture 208 
ETL Error Handling “Vehicle Service” Project (Part 3 of 3)

14:34  
Section 28: Special Bonuses for Course Students  
Lecture 209 
** NEW BONUS For Students: EXCLUSIVE Discount!! **

Article 
Instructor Biography
Kirill Eremenko, Data Scientist & Forex Systems Expert
My name is Kirill Eremenko and I am super-psyched that you are reading this!
I teach courses in two distinct Business areas on Udemy: Data Science and Forex Trading. I want you to be confident that I can deliver the best training there is, so below is some of my background in both these fields.
Data Science
Professionally, I am a Data Science management consultant with over five years of experience in finance, retail, transport and other industries. I was trained by the best analytics mentors at Deloitte Australia and today I leverage Big Data to drive business strategy, revamp customer experience and revolutionize existing operational processes.
From my courses you will straight away notice how I combine my real-life experience and academic background in Physics and Mathematics to deliver professional step-by-step coaching in the space of Data Science. I am also passionate about public speaking, and regularly present on Big Data at leading Australian universities and industry events.
Forex Trading
Since 2007 I have been actively involved in the Forex market as a trader, and I have also run programming courses in MQL4. Forex trading is something I really enjoy, because the Forex market can give you financial and, more importantly, personal freedom.
In my other life I am a Data Scientist – I study numbers to analyze patterns in business processes and human behaviour… Sound familiar? Yep! Coincidentally, I am a big fan of Algorithmic Trading 🙂 EAs, Forex Robots, Indicators, Scripts, MQL4, even Java programming for Forex – Love It All!
Summary
To sum up, I am absolutely and utterly passionate about both Data Science and Forex Trading and I am looking forward to sharing my passion and knowledge with you!