Introduction To Data Science
Use the R Programming Language to execute data science projects and become a data scientist. Implement business solutions, using machine learning and predictive analytics.
The R language provides a way to tackle day-to-day data science tasks, and this course will teach you how to apply the R programming language and useful statistical techniques to everyday business situations.
With this course, you’ll be able to use the visualizations, statistical models, and data manipulation tools that modern data scientists rely upon daily to recognize trends and suggest courses of action.
Understand Data Science to Be a More Effective Data Analyst
● Use R and RStudio
● Master Modeling and Machine Learning
● Load, Visualize, and Interpret Data
Use R to Analyze Data and Come Up with Valuable Business Solutions
This course is designed for those who are analytically minded and are familiar with basic statistics and programming or scripting. Some familiarity with R is strongly recommended; otherwise, you can learn R as you go.
You’ll learn applied predictive modeling methods, as well as how to explore and visualize data, how to use and understand common machine learning algorithms in R, and how to relate machine learning methods to business problems.
All of these skills will combine to give you the ability to explore data, ask the right questions, execute predictive models, and communicate your informed recommendations and solutions to company leaders.
Contents and Overview
This course begins with a walk-through of a template data science project before diving into the R statistical programming language.
You will be guided through modeling and machine learning. You’ll use machine learning methods to create algorithms for a business, and you’ll validate and evaluate models.
You’ll learn how to load data into R and learn how to interpret and visualize the data while dealing with variables and missing values. You’ll be taught how to come to sound conclusions about your data, despite some real-world challenges.
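As a rough illustration of the kind of workflow described above, here is a minimal sketch in base R. It is not taken from the course materials: the built-in mtcars data set, the artificially removed values, the median fill, and the choice of a linear regression with an 8-row holdout are all assumptions made for the example.

```r
# Illustrative sketch only: built-in data and base R functions,
# not code from the course itself.
set.seed(2024)

# Start from the built-in mtcars data and blank out a few values
# to imitate the gaps found in real-world data (an assumption).
d <- mtcars
d$hp[c(3, 11, 20)] <- NA

# Look at the shape of the data and count missing values per column.
str(d)
print(colSums(is.na(d)))

# One simple treatment: record which rows were missing, then fill
# the gaps with the median of the observed values.
d$hp_missing <- is.na(d$hp)
d$hp[d$hp_missing] <- median(d$hp, na.rm = TRUE)

# Hold out 8 rows so the model is evaluated on data it has not seen.
is_test <- seq_len(nrow(d)) %in% sample(nrow(d), 8)
train <- d[!is_test, ]
test <- d[is_test, ]

# Fit a linear regression predicting fuel economy from weight and horsepower.
model <- lm(mpg ~ wt + hp, data = train)

# Evaluate on the held-out rows with root mean squared error.
pred <- predict(model, newdata = test)
rmse <- sqrt(mean((test$mpg - pred)^2))
print(rmse)
```

Median filling with an indicator column is only one of several reasonable treatments for missing values; the right choice depends on why the values are missing.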
By the end of this course, you’ll be a better data analyst because you’ll have an understanding of applied predictive modeling methods, and you’ll know how to use existing machine learning methods in R. This will allow you to work with team members on a data science project, find problems, and come up with solutions.
You’ll complete this course with the confidence to correctly analyze data from a variety of sources, while sharing conclusions that will make a business more competitive and successful.
What are the requirements?
You should be familiar with basic scripting or programming, and basic statistics.
Familiarity with R and RStudio is a plus. We will teach you how to get started with both, but you should install them on your computer before starting this course.
What am I going to get from this course?
Start and execute the steps of a data science project, from project definition to model evaluation.
Use machine learning techniques to build effective predictive models.
Learn how to find and correct common problems in real-world data.
What is the target audience?
The course is for analytically minded students who are looking for an introduction to applied predictive modeling methods, and who want to learn about what goes into successful data science projects. The course will teach students how to use existing machine learning methods in R, but will not teach them how to implement these algorithms from scratch. Students should be familiar with basic statistics and basic scripting/programming. Some familiarity with R is helpful; otherwise, students should be willing to learn R as they go. We will direct you to ready-to-go implementations and additional references throughout the course.
Section 1: Course Overview
Walk-through of a data science project
Starting with R and data
Section 2: Modeling and Machine Learning
Mapping Business to Machine Learning Tasks
Your Feedback is Valuable
Naive Bayes: background
Naive Bayes: practice
Linear Regression: background
Linear Regression: practice
Logistic Regression: background
Logistic Regression: practice
Decision Trees and Random Forest: background
Random Forest: practice
Generalized Additive Models
Support Vector Machines
Regularization for Linear and Logistic Regression
Section 3: Data
Loading Data in R
The Shape of Data
Dealing with Categorical Variables
Useful Data Transformations
Section 4: Moving On
Nina Zumel, Data Scientist, Win-Vector LLC
Nina Zumel, PhD, has over 10 years of experience in research, machine learning, and data science. She is a co-author of the popular book Practical Data Science with R, co-author of the EMC data scientist certification program, and blogs often on statistics, data science, and data visualization.
John Mount, Data Scientist, Win-Vector LLC
I am a principal with the data science consulting firm Win-Vector LLC. Win-Vector LLC specializes in data science research, implementation, and training. I have over 10 years of experience in research, teaching, machine learning, and data science.
I am co-author of the popular book Practical Data Science with R, and I blog often on mathematics, programming, machine learning, and optimization on the Win-Vector blog.
My professional experience includes managing a data science group for Shopping dot com (an eBay company), working in price optimization for Rapt (acquired by Microsoft), and applying machine learning at web scale for Kosmix (acquired by Walmart online). My original fields of study were mathematics (AB UC Berkeley) and computer science (Ph.D. Carnegie Mellon) with a heavy emphasis on probability theory.
- Lectures 0
- Quizzes 0
- Duration 50 hours
- Skill level All levels
- Language English
- Students 2548
- Assessments Self