Data Analytics | Advanced Fast Track Summer Training on Data Analytics
Data analytics (DA) is the process of examining data sets in order to draw conclusions about the information they contain, increasingly with the aid of specialized systems and software. Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making.
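As a minimal sketch of the inspecting, cleaning, and modeling steps described above, the following uses only the Python standard library on a small made-up data set. The sample values, their meaning (hours studied vs. exam score), and the function names are illustrative assumptions, not workshop material.

```python
from statistics import mean

# Hypothetical raw records: (hours_studied, exam_score); None marks a missing value.
raw = [(1, 52), (2, 54), (None, 60), (3, 56), (4, None), (5, 60)]

def clean(records):
    """Drop records that contain a missing value."""
    return [r for r in records if None not in r]

def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in points)
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Inspect: count the incomplete records.
missing = sum(1 for r in raw if None in r)

# Clean the data, then model it with a straight line.
data = clean(raw)
slope, intercept = fit_line(data)

# Support a decision: estimate the score for 6 hours of study.
predicted = slope * 6 + intercept
print(f"dropped {missing} incomplete records; predicted score: {predicted:.1f}")
```

In practice the workshop topics listed below (NumPy, Pandas, regression models) would replace these hand-written helpers; the sketch only shows how the steps fit together.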
Topics to be covered in the Workshop
Basics of Data Science
- AI vs ML vs DL vs Data Science
- Data Science Scope, Applications
- Data Science Introduction
- Predictive vs Descriptive Data Analysis
- Data Science vs Data Analytics
- Regression & Classification Problems
- What makes a Data Science Expert?
- The art of making stories from Data
- Use Cases and Case Studies
(This part takes 3 hours and refreshes Python basics, assuming participants have a basic idea of at least one programming language.)
Introduction to Python Programming
- What is Python?
- Installing Anaconda
- Understanding the Spyder Integrated Development Environment (IDE)
- Python basics and string manipulation
- lists, tuples, dictionaries, variables
- Control Structures – if statement, for loop and while loop
- Single line loops
- Writing user defined functions
- Object oriented programming
- Working with Classes & Inheritance
Statistics for Data
- Measures of Central Tendency – Mean, Median and Mode
- Grouped and Ungrouped Data
- Measures of Spread – IQR, Variance and Standard Deviation
- Covariance
- Correlation
- Kurtosis, Skewness
Analyzing the Categorical Data
- Proportion Test
- Chi Square Test
- Fisher’s Exact Test
- Mantel-Haenszel Test
Analyzing the Continuous Data
- One Sample T-Test
- Two Independent Samples Tests
- Paired T-test
- Wilcoxon Test
- ANOVA
- Kruskal-Wallis Test
Probability Theory
- Events and their Probabilities
- Rules of Probability
- Conditional Probability and Independence
- Distribution of a Random Variable
- Bayes' Theorem
- Moment Generating Functions
- Central Limit Theorem
- Expectation & Variance
- Standard Distributions – Bernoulli, Binomial & Multinomial
Data Structure & Data Manipulation in Python
- Intro to NumPy Arrays
- Creating ndarrays
- Indexing
- Data Processing using Arrays
- Mathematical computing basics
- Basic statistics
- File Input and Output
- Getting Started with Pandas
- Data Acquisition (Import & Export)
- Indexing
- Selection and Filtering
- Sorting & Summarizing
- Descriptive Statistics
- Combining and Merging Data Frames
- Removing Duplicates
- Discretization and Binning
- String Manipulation
Visualization in Python, Case Studies
- Introduction to Visualization
- Visualization Importance
- Visualization Rules
- Working with Python visualization libraries
- Matplotlib
- Creating Line Plots, Bar Charts, Pie Charts, Histograms, Scatter Plots
Working with Seaborn
- Data Visualization using Seaborn
- Basic Plots, color palettes
- Plotting categorical data
- Visualizing linear relationship
- Plotting on data-aware grids
- HeatMap, Histogram, Barplot, Factor plot
- Density Plot, Joint Distribution Plot
Linear Regression
- Regression Problem Analysis
- Mathematical modelling of Regression Model
- Gradient Descent Algorithm
- Programming Process Flow
- Use cases
- Regression Table
- Heteroscedasticity
- Model Specification
- L1 & L2 Regularization
Linear Regression – Case Study & Project
- Programming Using Python
- Building simple Univariate Linear Regression Model
- Multivariate Regression Model
- Apply Data Transformations
- Identify and Treat Multicollinearity in Data
- Identify Heteroscedasticity
- Modelling of Data
- Variable Significance Identification
- Model Significance Test
- Bifurcate Data into Training / Testing Data set
- Build Model on Training Data Set
- Predict using Testing Data Set
- Validate the Model Performance
- Project 1: Boston Housing Prices Prediction
- Project 2: Cancer Detection Predictive Analysis
- Best Fit Line and Linear Regression
Logistic Regression
- Variable and Model Significance
- Maximum Likelihood Concept
- Log Odds and Interpretation
- Regression Table
- Null vs Residual Deviance
- Problem Analysis
- Cost Function Formation
- Mathematical Modelling
- Use Cases
Case Study & Project
- Model Parameter Significance Evaluation
- Drawing the ROC Curve
- Estimating the Classification Model Hit Ratio
- Isolating the Classifier for Optimum Results
- Project 3: Digit Recognition using Logistic Regression
Decision Trees with Case Study
- Forming a Decision Tree
- Components of Decision Tree
- Mathematics of Decision Tree
- Decision Tree Evaluation
- Practical Examples & Case Study
- Project 4: Intrusion Detection
Random Forests
- Random Forest Mathematics
- Examples & use cases using Random Forests
K-NN Algorithm – Applications & Case Studies
- Understanding the KNN
- Distance metrics
- Case Study on KNN
Support Vector Machine
- Concept and Working Principle
- Mathematical Modelling
- Optimization Function Formation
- The Kernel Method and Nonlinear Hyperplanes
- Use Cases
- Programming SVM using Python
- Project 5: Character Recognition using SVM
- Project 6: Regression Problem using SVM
- Project 7: Wisconsin Cancer Detection using SVM
Clustering
- Hierarchical Clustering
- K Means Clustering
- Use Cases for K Means Clustering
- Programming for K Means using Python
- Image Color Quantization using K Means Clustering Technique
- Cluster Size Optimization vs Definition Optimization
- Projects & Case Studies
Principal Component Analysis
- Dimensionality Reduction, Data Compression
- Curse of dimensionality
- Multicollinearity
- Factor Analysis
- Concept and Mathematical modelling
- Use Cases
- Programming using Python
Eligibility for Applying: Any college can opt for this program. Students and faculty members from B.E./B.Tech/M.Tech/M.E./MCA/BCA programs with the above-mentioned academic requirements are preferred. We are looking for engineering colleges of national repute to host our Summer Training Program on their campus. If you wish to associate with us and make your college a center, please read the request guidelines below and process the request accordingly:
- The engineering college/institution should have a seminar room or lecture hall with a minimum seating capacity of 30.
- A good-quality LCD projector, adequate for comfortable viewing.
- Public address system (1 cordless mic).
- Power backup and 220 V AC power points.
- Hospitality for our visiting delegation during summer training.
- Wi-Fi connectivity for participants and the trainer.
- A minimum of 30 students will be required to participate in the Summer Training Program.
Certification:
- All courses under Summer Training Program (STP) are certified by Anwesha, IIT Patna & Innovians Technologies.
- All trainees will receive an Industrial Training Certificate from Anwesha, IIT Patna & Innovians Technologies.
- All participants will get an Internship Letter cum Recommendation on the company letterhead of Innovians Technologies.
Duration: 2 Weeks / 4 Weeks* (6 Hours Per Day)
In the case of the 4-week training (2 weeks of classroom training + 2 weeks of project work), participants receive the same 2 weeks of classroom training as in the 2-week program, followed by 2 weeks to complete one project or research work on their course topic, which they can do from home. The project work is for those who need a 4-week training certificate under their college/university training criteria; otherwise, the college can request the 2-week program. The classroom training is the same for the 2-week and 4-week Summer Training.