Data Science and Machine Learning with Python – Hands On!

S$130.00

Course Description

Data Scientists enjoy one of the top-paying jobs, with an average salary of $120,000 according to Glassdoor and Indeed. That’s just the average! And it’s not just about money – it’s interesting work too!

If you’ve got some programming or scripting experience, this course will teach you the techniques used by real data scientists in the tech industry – and prepare you for a move into this hot career path. This comprehensive course includes 68 lectures spanning almost 9 hours of video, and most topics include hands-on Python code examples you can use for reference and for practice. I’ll draw on my 9 years of experience at Amazon and IMDb to guide you through what matters, and what doesn’t.

The topics in this course come from an analysis of real requirements in data scientist job listings from the biggest tech employers. We’ll cover the machine learning and data mining techniques real employers are looking for, including:

Regression analysis
K-Means Clustering
Principal Component Analysis
Train/Test and cross validation
Bayesian Methods
Decision Trees and Random Forests
Multivariate Regression
Multi-Level Models
Support Vector Machines
Reinforcement Learning
Collaborative Filtering
K-Nearest Neighbor
Bias/Variance Tradeoff
Ensemble Learning
Term Frequency / Inverse Document Frequency
Experimental Design and A/B Tests

…and much more! There’s also an entire section on machine learning with Apache Spark, which lets you scale up these techniques to “big data” analyzed on a computing cluster.

If you’re new to Python, don’t worry – the course starts with a crash course. If you’ve done some programming before, you should pick it up quickly. This course shows you how to get set up on Microsoft Windows-based PCs; the sample code will also run on macOS or Linux desktop systems, but I can’t provide OS-specific support for them.

Each concept is introduced in plain English, avoiding confusing mathematical notation and jargon. It’s then demonstrated using Python code you can experiment with and build upon, along with notes you can keep for future reference.

If you’re a programmer looking to switch into an exciting new career track, or a data analyst looking to make the transition into the tech industry, this course will teach you the basic techniques used by real-world industry data scientists. I think you’ll enjoy it!
What are the requirements?
You’ll need a desktop computer (Windows, Mac, or Linux) capable of running Enthought Canopy 1.6.2 or newer. The course will walk you through installing the necessary free software.
Some prior coding or scripting experience is required.
At least high school level math skills will be required.
This course walks through getting set up on a Microsoft Windows-based desktop PC. While the code in this course will run on other operating systems, we cannot provide OS-specific support for them.
What am I going to get from this course?
Extract meaning from large data sets using a wide variety of machine learning, data mining, and data science techniques with the Python programming language.
Perform machine learning on “big data” using Apache Spark and its MLlib package.
Design experiments and interpret the results of A/B tests.
Visualize clustering and regression analysis in Python using matplotlib.
Produce automated recommendations of products or content with collaborative filtering techniques.
Apply best practices in cleaning and preparing your data prior to analysis.
What is the target audience?
Software developers or programmers who want to transition into the lucrative data science career path will learn a lot from this course.
Data analysts in finance or other non-tech industries who want to transition into the tech industry can use this course to learn how to analyze data using code instead of tools. However, you’ll need some prior experience in coding or scripting to be successful.
If you have no prior coding or scripting experience, you should NOT take this course – yet. Go take an introductory Python course first.

Curriculum

Section 1: Getting Started
Introduction
02:44
[Activity] Getting What You Need
02:37
[Activity] Installing Enthought Canopy
06:19
Python Basics, Part 1
15:58
[Activity] Python Basics, Part 2
09:41
Running Python Scripts
03:55
Section 2: Statistics and Probability Refresher, and Python Practice
Types of Data
06:58
Mean, Median, Mode
05:26
[Activity] Using mean, median, and mode in Python
08:30
[Activity] Variation and Standard Deviation
11:12
Probability Density Function; Probability Mass Function
03:27
Common Data Distributions
07:45
[Activity] Percentiles and Moments
12:33
[Activity] A Crash Course in matplotlib
13:46
[Activity] Covariance and Correlation
11:31
[Exercise] Conditional Probability
11:03
Exercise Solution: Conditional Probability of Purchase by Age
02:18
Bayes’ Theorem
05:23
Section 3: Predictive Models
[Activity] Linear Regression
11:01
[Activity] Polynomial Regression
08:04
[Activity] Multivariate Regression, and Predicting Car Prices
08:06
Multi-Level Models
04:36
Section 4: Machine Learning with Python
Supervised vs. Unsupervised Learning, and Train/Test
08:57
[Activity] Using Train/Test to Prevent Overfitting a Polynomial Regression
05:47
Bayesian Methods: Concepts
03:59
[Activity] Implementing a Spam Classifier with Naive Bayes
08:05
K-Means Clustering
07:23
[Activity] Clustering people based on income and age
05:14
Measuring Entropy
03:09
[Activity] Install GraphViz
Article
Decision Trees: Concepts
08:43
[Activity] Decision Trees: Predicting Hiring Decisions
09:47
Ensemble Learning
05:59
Support Vector Machines (SVM) Overview
04:27
[Activity] Using SVM to cluster people using scikit-learn
05:36
Section 5: Recommender Systems
User-Based Collaborative Filtering
07:57
Item-Based Collaborative Filtering
08:15
[Activity] Finding Movie Similarities
09:08
[Activity] Improving the Results of Movie Similarities
07:59
[Activity] Making Movie Recommendations to People
10:22
[Exercise] Improve the recommender’s results
05:29
Section 6: More Data Mining and Machine Learning Techniques
K-Nearest-Neighbors: Concepts
03:44
[Activity] Using KNN to predict a rating for a movie
12:29
Dimensionality Reduction; Principal Component Analysis
05:44
[Activity] PCA Example with the Iris data set
09:05
Data Warehousing Overview: ETL and ELT
09:05
Reinforcement Learning
12:44
Section 7: Dealing with Real-World Data
Bias/Variance Tradeoff
06:15
[Activity] K-Fold Cross-Validation to avoid overfitting
10:55
Data Cleaning and Normalization
07:10
[Activity] Cleaning web log data
10:56
Normalizing numerical data
03:22
[Activity] Detecting outliers
07:00
Section 8: Apache Spark: Machine Learning on Big Data
[Activity] Installing Spark – Part 1
07:02
[Activity] Installing Spark – Part 2
13:29
Spark Introduction
09:10
Spark and the Resilient Distributed Dataset (RDD)
11:42
Introducing MLLib
05:09
[Activity] Decision Trees in Spark
16:00
[Activity] K-Means Clustering in Spark
11:07
TF / IDF
06:44
[Activity] Searching Wikipedia with Spark
08:11
[Activity] Using the Spark 2.0 DataFrame API for MLLib
07:57
Section 9: Experimental Design
A/B Testing Concepts
08:23
T-Tests and P-Values
05:59
[Activity] Hands-on With T-Tests
06:04
Determining How Long to Run an Experiment
03:24
A/B Test Gotchas
09:26
Section 10: You made it!
More to Explore
02:59
Don’t Forget to Leave a Rating!
Article
Bonus Lecture: Discounts on my Spark and MapReduce courses!
01:28

Instructor Biography
Frank Kane, Data Miner and Software Engineer
Frank Kane spent 9 years at Amazon and IMDb, developing and managing the technology that automatically delivers product and movie recommendations to hundreds of millions of customers. Frank holds 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, Frank left to start his own successful company, Sundog Software, which focuses on virtual reality environment technology and on teaching others about big data analysis.
