Online
12 weeks
IITM Pravartak Technologies Foundation, Technology Innovation Hub (TIH) of IIT Madras, and GITAA
UG & PG students, Industry Professionals
Introduction to data science
Why Python for data science
Setting working directory
Creating and saving script files
Executing pieces of code
Clearing the environment and console
Removing variables from environment
Commenting script files
Creating variables in Python and naming conventions
Arithmetic operators
Logical operators
Data types and related conversions
Strings
Lists
Arrays
Tuples
Dictionary
Sets
Range
ndarray
Descriptive statistics
- Measures of central tendency
- Measures of spread
- Distribution of mean and variance
- Sampling basics
- Notion of probability
Reading files
- Comma separated value files
- Tab-delimited files
- Excel files
Exploratory data analysis
Data preparation and preprocessing
Scatter Plot
Bar Plot
Histogram
Box plot
Pair plot
If-else-if family
For loop
For loop with "if break"
While loop
Introduction to hypothesis testing
Performance of hypothesis tests
Test for mean (one sample)
Test for differences in means (two sample test)
Test for differences in variances (F test)
Eigenvalues & eigenvectors
- Singular Value Decomposition
Understanding independence of variables
Understanding relationships between variables
Basics of optimization - objective function, constraints, decision variables
Types of optimization problems
Statement of first order KKT necessary conditions
Basic concepts in multi-objective optimization
Introduction to optimization viewpoint in predictive modelling and machine learning
Correlation
Basics of regression
Ordinary least squares
Model building
Model assessment and improvement
Diagnostics
Multiple linear regression (model building & assessment)
Random forest & Decision tree
Classification
- Logistic regression
- K nearest neighbours
Clustering
- K means
Dimensionality reduction methods
- Principal component analysis and its variants
Support vector machine
Neural networks