Data analytics (DA) is the process of examining data sets in order to draw conclusions about the information they contain, increasingly with the aid of specialized systems and software. Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making.
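
As a rough illustration of the inspecting, cleaning, transforming, and modeling steps named above, here is a minimal sketch using only the Python standard library. The data set (`raw`, pairs of hours studied and exam score) and the helper `predict` are hypothetical and exist only for this example; a straight-line fit is just one possible modeling choice.

```python
from statistics import mean

# Hypothetical raw records: (hours studied, exam score); None marks a missing score.
raw = [(1, 52), (2, 55), (3, None), (4, 61), (5, 64), (2, 55)]

# Inspect: count how many records have a missing score.
n_missing = sum(1 for _, score in raw if score is None)

# Clean: drop records with missing scores, then drop exact duplicates (order preserved).
clean = list(dict.fromkeys(r for r in raw if r[1] is not None))

# Transform: split the records into two columns.
hours = [h for h, _ in clean]
scores = [s for _, s in clean]

# Model: fit a least-squares line, score = intercept + slope * hours.
h_bar, s_bar = mean(hours), mean(scores)
slope = (
    sum((h - h_bar) * (s - s_bar) for h, s in zip(hours, scores))
    / sum((h - h_bar) ** 2 for h in hours)
)
intercept = s_bar - slope * h_bar


def predict(h):
    """Predict a score from hours studied using the fitted line."""
    return intercept + slope * h


print(f"missing: {n_missing}, kept: {len(clean)}")
print(f"score = {intercept:.1f} + {slope:.1f} * hours")
```

The fitted line can then support a decision, for example by estimating the score for a value of `hours` that was not observed, via `predict`.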

Topics to be covered in the workshop

Basics of Data Science

  • AI vs ML vs DL vs Data Science
  • Data Science Scope, Applications
  • Data Science Introduction
  • Predictive vs Descriptive Data Analysis
  • Data Science vs Data Analytics
  • Regression & Classification Problems
  • What makes a Data Science Expert?
  • The art of making stories from Data
  • Use Cases and Case Studies 

(This part will take 3 hours and refreshes Python basics, assuming participants have a basic idea of at least one programming language.)

Introduction to Python Programming 

  • What is Python?
  • Installing Anaconda
  • Understanding the Spyder Integrated Development Environment (IDE)
  • Python basics and string manipulation
  • lists, tuples, dictionaries, variables
  • Control Structures – if statement, for loop and while loop
  • Single line loops
  • Writing user defined functions
  • Object oriented programming
  • Working with Classes & Inheritance

Statistics for Data

  • Measures of Central Tendency – Mean, Mode and Median
  • Grouped and Ungrouped Data
  • Measures of Spread – IQR, Variance and Standard Deviation
  • Covariance
  • Correlation
  • Kurtosis, Skewness 

Analyzing the Categorical Data

  • Proportion Test
  • Chi Square Test
  • Fisher’s Exact Test
  • Mantel-Haenszel Test

Analyzing the Continuous Data

  • One Sample T-Test
  • Two Independent Samples Tests
  • Paired T-test
  • Wilcoxon Test
  • ANOVA
  • Kruskal-Wallis Test

Probability Theory

  • Events and their Probabilities
  • Rules of Probability
  • Conditional Probability and Independence
  • Distribution of a Random Variable
  • Bayes' Theorem
  • Moment Generating Functions
  • Central Limit Theorem
  • Expectation & Variance
  • Standard Distributions – Bernoulli, Binomial & Multinomial

Data Structure & Data Manipulation in Python

  • Intro to NumPy Arrays
  • Creating ndarrays
  • Indexing
  • Data Processing using Arrays
  • Mathematical computing basics
  • Basic statistics
  • File Input and Output
  • Getting Started with Pandas
  • Data Acquisition (Import & Export)
  • Indexing
  • Selection and Filtering
  • Sorting & Summarizing
  • Descriptive Statistics
  • Combining and Merging Data Frames
  • Removing Duplicates
  • Discretization and Binning
  • String Manipulation 

Visualization in Python, Case Studies

  • Introduction to Visualization
  • Visualization Importance
  • Visualization Rules
  • Working with Python visualization libraries
  • Matplotlib
  • Creating Line Plots, Bar Charts, Pie Charts, Histograms, Scatter Plots 

Working with Seaborn

  • Data Visualization using Seaborn
  • Basic Plots, color palettes
  • Plotting categorical data
  • Visualizing linear relationship
  • Plotting on data-aware grids
  • HeatMap, Histogram, Barplot, Factor plot
  • Density Plot, Joint Distribution Plot 

Linear Regression

  • Regression Problem Analysis
  • Mathematical modelling of Regression Model
  • Gradient Descent Algorithm
  • Programming Process Flow
  • Use cases
  • Regression Table
  • Heteroscedasticity
  • Model Specification
  • L1 & L2 Regularization 

Linear Regression – Case Study & Project

  • Programming Using Python
  • Building simple Univariate Linear Regression Model
  • Multivariate Regression Model
  • Apply Data Transformations
  • Identify Multicollinearity in Data
  • Treatment on Data
  • Identify Heteroscedasticity
  • Modelling of Data
  • Variable Significance Identification
  • Model Significance Test
  • Bifurcate Data into Training / Testing Data set
  • Build Model on Training Data Set
  • Predict using Testing Data Set
  • Validate the Model Performance
  • Project 1: Boston Housing Prices Prediction
  • Project 2: Cancer Detection Predictive Analysis
  • Best Fit Line and Linear Regression 

Logistic Regression

  • Variable and Model Significance
  • Maximum Likelihood Concept
  • Log Odds and Interpretation
  • Regression Table
  • Null vs Residual Deviance
  • Problem Analysis
  • Cost Function Formation
  • Mathematical Modelling
  • Use Cases 

Case Study & Project

  • Model Parameter Significance Evaluation
  • Drawing the ROC Curve
  • Estimating the Classification Model Hit Ratio
  • Isolating the Classifier for Optimum Results
  • Project 3: Digit Recognition using Logistic Regression 

Decision Trees with Case Study

  • Forming a Decision Tree
  • Components of Decision Tree
  • Mathematics of Decision Tree
  • Decision Tree Evaluation
  • Practical Examples & Case Study
  • Project 4: Intrusion Detection 

Random Forests

  • Random Forest Mathematics
  • Examples & use cases using Random Forests 

K-NN Algorithm – Applications & Case Studies

  • Understanding the KNN
  • Distance metrics
  • Case Study on KNN 

Support Vector Machine

  • Concept and Working Principle
  • Mathematical Modelling
  • Optimization Function Formation
  • The Kernel Method and Nonlinear Hyperplanes
  • Use Cases
  • Programming SVM using Python
  • Project 5: Character Recognition using SVM
  • Project 6: Regression Problem using SVM
  • Project 7: Wisconsin Cancer Detection using SVM


Clustering

  • Hierarchical Clustering
  • K Means Clustering
  • Use Cases for K Means Clustering
  • Programming for K Means using Python
  • Image Color Quantization using K Means Clustering Technique
  • Cluster Size Optimization vs Definition Optimization
  • Projects & Case Studies 

Principal Component Analysis

  • Dimensionality Reduction, Data Compression
  • Curse of dimensionality
  • Multicollinearity
  • Factor Analysis
  • Concept and Mathematical modelling
  • Use Cases
  • Programming using Python

Eligibility for Applying: Any college can opt for this program. Students and faculty members from B.E./B.Tech/M.Tech/M.E./MCA/BCA who meet the requirements mentioned below are preferred. If you wish to associate with us and organise this training at your institute, please read the following requirements and proceed accordingly:

  • The engineering college/institution should have a seminar room or lecture hall with a minimum seating capacity of 40.
  • A good-quality LCD projector, adequate for comfortable viewing.
  • Public address system (1 cordless mic).
  • Power backup and 220 V AC power points.
  • Hospitality for our visiting delegation during training program.
  • Wi-Fi connectivity for participants and trainer.

Certification Policy:

  • Certificate of Merit for all the workshop participants from Innovians Technologies.
  • Certificate of Coordination for the coordinators of the campus workshops

Duration: 2 Weeks (5 Days a Week) - The workshop runs for two consecutive weeks, with a 6-7 hour session each day.

Fees: Rs. 5000/- per participant (minimum 40 participants).
