ZettaMine Labs Pvt. Ltd., 63/A, Rd Number 13, Giani Zail Singh Nagar, Film Nagar, Hyderabad, 500096

9121192119

info@nitwai.com

Sat-Sun: 9AM - 6PM

Data Mining with Python

About The Workshops

Cluster Analysis, Classification and Regression, SVM, SVC, SVR, Dimensionality Reduction, Apache Spark, Network Mining, Text Mining, Natural Language Processing, Count Vectorizer, TF-IDF and more.

First Timers

Sorry, this course is not appropriate for newbies. Please take our introductory Python course (Python Is Easy) first, and then come back to this class after you have some experience.

Junior Engineers

You’ll get the most out of this course if you’re already comfortable with Python. The version of Python (2 or 3) doesn’t matter, as the instructor will use both, and the tooling shown allows either.

Senior Engineers

This course is a good fit as long as you don’t already have experience with Data Mining. If you’re a Data Scientist, much of it may be review for you. Check the syllabus below to make sure we’re covering topics that interest you.

Course Curriculum

38 Lectures, 7 Homeworks, 3 Large Projects

  • Introduction
  • Section Overview
  • Cleaning Data – Part A
  • Cleaning Data – Part B
  • Cleaning Data – Part C
  • What are Statistics? – Part A
  • What are Statistics? – Part B
  • Practical Examples of Data Mining
  • Sample Datasets
  • Section Review
  • Homework #1: Set Up Your Workstation
  • Cluster Analysis – Part A
  • Cluster Analysis – Part B
  • Classification and Regression – Part A
  • Classification and Regression – Part B
  • Support Vector Machine – Part A
  • Support Vector Machine – Part B
  • Association, Correlation and Covariance – Part A
  • Association, Correlation and Covariance – Part B
  • Dimensionality Reduction
  • Homework #2: Correlation
  • Apache Spark – Part A
  • Apache Spark – Part B
  • Apache Spark – Part C
  • Homework #3: Apache Spark
  • Map vs FlatMap – Part A
  • Map vs FlatMap – Part B
  • Spark-ML
  • Transformers, Estimators, and Pipelines
  • Homework #4: Map and FlatMap
  • Project #1: Spark-ML
  • Text Mining – Part A
  • Text Mining – Part B
  • Network Mining
  • Python Matrix Libraries
  • Homework #5: Matrices
  • Mining a SQL Database – Part A
  • Mining a SQL Database – Part B
  • Homework #6: SQL Databases
  • Project #2: Thinking About Fake News
  • Key Concepts & Text Cleaning
  • Count Vectorizer, TF-IDF
  • Examples with Spam Data
  • Tweaking the Spam Data Model
  • Pipelining with Spam Data
  • Summary Challenge
  • Homework #7: StackOverflow Dataset
  • Project #3 (Final Exam): Fake News Detection