mrkr188/Data-Science-MOOCs

Introduction

Data Science is an exciting field, and Data Scientists need a broad skill set. To strengthen my understanding of various concepts and to get more hands-on experience, I have pursued many Data Science related MOOCs. This directory contains descriptions of the courses I pursued, along with the certificates I earned.

I divide the essential skills into the following 4 categories and pursue courses in the areas where I need improvement.

Maths

  • Probability and Statistics
  • Linear Algebra
  • Calculus

Programming Skills

  • Python, R programming languages
  • Data Structures and Algorithms

Systems

  • Relational Databases, SQL
  • Hadoop Ecosystem - HDFS, MapReduce, Pig, Hive
  • NoSQL systems - HBase, MongoDB, Cassandra
  • Cloud Computing concepts - Amazon Web Services

Data Science Core

  • Data Wrangling, Cleaning, Manipulation and Exploratory Data Analytics
  • Machine Learning
  • Information Retrieval
  • Data Mining
  • Data Visualization

MOOCs Description and Certificates

Data Science Specialization

Specialization Link

The Data Scientist’s Toolbox | Certificate
R Programming | Certificate
Getting and Cleaning Data | Certificate
Exploratory Data Analysis | Certificate
Reproducible Research | Certificate
Statistical Inference | Certificate
Regression Models | Certificate
Practical Machine Learning | Certificate
Developing Data Products | Certificate
Data Science Capstone | Currently pursuing

  • This Specialization covers the concepts and tools you'll need throughout the entire data science pipeline, from asking the right kinds of questions to making inferences and publishing results.
  • Lots of practice in R programming with packages such as dplyr, data.table, Hmisc, reshape2, ggplot2, lattice, caret, randomForest, tm, rmarkdown, shiny, sqldf, RSQLite, stringr, lubridate
  • Final Capstone Project to apply the skills learned by building a data product using real-world data.

The Analytics Edge | Course Link
Certificate

  • An applied understanding of many different analytics methods, including linear regression, logistic regression, CART, clustering, data visualization and mathematical optimization. Uses the R programming language.
  • Lots of real-world analytics projects serving as examples and exercises.

Statistical Learning | Course Link
Certificate

  • Introductory Machine Learning course with a focus on supervised learning methods like regression and classification. Uses the R programming language.
  • Companion book An Introduction to Statistical Learning

Introduction to Computer Science and Programming Using Python | Course Link
Certificate

  • Object Oriented Programming, Data Structures, Testing and debugging, Basic Algorithms.
  • Challenging programming assignments

Algorithms: Design and Analysis | Course link
Certificate

  • Standard course covering the fundamental concepts of algorithm design.
  • Challenging exercises.

Design of Computer Programs | Course link

  • Covers new concepts, patterns, and methods that will expand coding abilities. Uses the Python programming language.
  • Very challenging exercises

Machine Learning | Course link

  • A broad introduction to machine learning, data mining, and statistical pattern recognition with emphasis on practical application of techniques. Exercises in MATLAB.

Currently pursuing courses

Mining Massive Datasets | Course link

  • Covers several topics like MapReduce, Link Analysis, Locality-Sensitive Hashing, Data Stream Mining, Recommender Systems, Dimensionality Reduction, Clustering, Support-Vector Machines, Decision Trees
  • Companion book Mining of Massive Datasets
  • Very challenging exercises.

Planning to do

Cloud Computing Applications | Course link

  • Basic concepts underlying cloud services
  • Data Analytics using cloud services (Amazon Web Services). Covers many concepts underlying systems like Hadoop (HDFS, Pig, Hive), YARN, NoSQL databases (HBase, Cassandra), Spark, GraphX, Mahout

Some good books

Advanced R | link
An Introduction to Statistical Learning | link
The Elements of Statistical Learning | link
Mining of Massive Datasets | link
Hadoop: The Definitive Guide | [Not free]
