Workshop on Big Data Analysis & Hadoop
Big data is a popular term describing the exponential growth and availability of data, both structured and unstructured. Big data may prove as important to business – and to society – as the Internet has become.
Hadoop is 100% open source and pioneered a fundamentally new way of storing and processing data. Instead of relying on expensive, proprietary hardware and separate systems to store and process data, Hadoop enables distributed parallel processing of huge amounts of data across inexpensive, industry-standard servers that both store and process the data, and it can scale without limits. With Hadoop, no data is too big. In today's hyper-connected world, where more data is created every day, Hadoop's breakthrough advantages mean that businesses and organizations can now find value in data that was until recently considered useless. Students will work on a real-life Big Data Analytics project and gain hands-on experience.
Topics to be covered in BIG DATA Workshop
Session 1: Big Data
- How Big is Big Data?
- Definition, with Real-Time Examples
- How Big Data is Generated in Real Time
- Uses of Big Data: How Industry is Utilizing It
- Traditional Data Processing Technologies
- The Future of Big Data
Session 2: Hadoop
- Why Hadoop?
- What is Hadoop?
- Hadoop vs. RDBMS, Hadoop vs. Big Data
- Brief history of Hadoop
- Apache Hadoop Architecture
- Problems with traditional large-scale systems
- Requirements for a new approach
- Anatomy of a Hadoop cluster
- Hadoop Setup and Installation
Session 3: Hadoop Ecosystem
- Brief introduction to the Hadoop ecosystem (MapReduce, HDFS, Hive, Pig, HBase)
Session 4: HDFS
- Concepts & Architecture
- Data Flow (File Read , File Write)
- Fault Tolerance
- Shell Commands
- Java-Based API
- Data Flow Archives
- Coherency
- Data Integrity
- Role of Secondary NameNode
- HDFS Programming Basics
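The fault-tolerance idea covered in this session — HDFS splits each file into blocks and replicates every block on several DataNodes, so a single node failure loses no data — can be illustrated with a small toy model in Python. This is a conceptual sketch only, not real HDFS; the `replication` parameter mirrors the HDFS setting of the same name, and the node names are made up.

```python
from itertools import cycle

def place_blocks(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin style."""
    placement = {}
    ring = cycle(range(len(nodes)))
    for block in range(num_blocks):
        start = next(ring)
        placement[block] = [nodes[(start + r) % len(nodes)]
                            for r in range(replication)]
    return placement

def readable_after_failure(placement, failed_node):
    """A file survives a node failure if every block still has a live replica."""
    return all(any(node != failed_node for node in replicas)
               for replicas in placement.values())

placement = place_blocks(num_blocks=6, nodes=["dn1", "dn2", "dn3", "dn4"])
print(readable_after_failure(placement, "dn2"))  # prints True
```

With the default replication factor of 3, every block keeps at least two live replicas after any single DataNode fails, which is exactly why HDFS tolerates routine hardware failure.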
Session 5: MapReduce
- Theory
- MapReduce Architecture
- Data Flow (Map – Shuffle - Reduce)
- The mapred vs. mapreduce APIs (old vs. new Java packages)
- MapReduce Programming Basics
- Programming [ Mapper, Reducer, Combiner, Partitioner ]
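The Map – Shuffle – Reduce data flow above can be sketched in plain Python with the classic word-count example. This is a conceptual model of the three phases, not code that runs on a Hadoop cluster, where each phase would execute in parallel across many machines.

```python
from collections import defaultdict

def mapper(line):
    """Map phase: emit a (word, 1) pair for each word in a line."""
    for word in line.split():
        yield (word, 1)

def shuffle(pairs):
    """Shuffle phase: group all values by key, as Hadoop does
    between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reducer(word, counts):
    """Reduce phase: sum the counts for one word."""
    return (word, sum(counts))

lines = ["big data is big", "hadoop processes big data"]
mapped = [pair for line in lines for pair in mapper(line)]
grouped = shuffle(mapped)
result = dict(reducer(word, counts) for word, counts in grouped.items())
print(result)  # {'big': 3, 'data': 2, 'is': 1, 'hadoop': 1, 'processes': 1}
```

A Combiner would apply the reducer logic locally on each mapper's output before the shuffle, and a Partitioner would decide which reducer receives each key.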
Session 6: Hive & Pig
- Architecture
- Installation
- Configuration
- Hive vs RDBMS
- Tables
- DDL & DML
- Partitioning & Bucketing
- Hive Web Interface
- Why Pig
- Use case of Pig
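Hive's partitioning (one directory per partition-column value) and bucketing (rows hashed into a fixed number of files per partition) can be illustrated with a small Python sketch. This is conceptual only — Hive uses its own hash function, and the column names here (`date`, `user_id`) are made up for the example.

```python
def assign(rows, partition_col, bucket_col, num_buckets):
    """Place each row under partition=<value> / bucket number,
    mimicking how Hive lays out a partitioned, bucketed table on HDFS."""
    layout = {}
    for row in rows:
        partition = f"{partition_col}={row[partition_col]}"
        bucket = hash(row[bucket_col]) % num_buckets  # Hive's hash differs
        layout.setdefault(partition, {}).setdefault(bucket, []).append(row)
    return layout

rows = [
    {"date": "2024-01-01", "user_id": 1, "amount": 10},
    {"date": "2024-01-01", "user_id": 2, "amount": 20},
    {"date": "2024-01-02", "user_id": 1, "amount": 30},
]
layout = assign(rows, partition_col="date", bucket_col="user_id", num_buckets=4)
print(sorted(layout))  # ['date=2024-01-01', 'date=2024-01-02']
```

Queries filtered on the partition column can then skip whole directories (partition pruning), and bucketing gives Hive a predictable way to co-locate rows for joins and sampling.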
Session 7: HBase
- RDBMS vs. NoSQL
- HBase Introduction
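HBase's data model — a sorted map from row key to column families to column qualifiers to values, rather than the fixed rows and columns of an RDBMS — can be sketched as nested dictionaries in Python. This is a toy model for intuition, not the real HBase client API; the table and family names are invented for the example.

```python
class ToyHBaseTable:
    """Toy model of HBase's row_key -> family -> qualifier -> value map."""

    def __init__(self, column_families):
        # Column families are fixed at table-creation time, as in HBase.
        self.column_families = set(column_families)
        self.rows = {}

    def put(self, row_key, family, qualifier, value):
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        self.rows.setdefault(row_key, {}).setdefault(family, {})[qualifier] = value

    def get(self, row_key, family, qualifier):
        return self.rows.get(row_key, {}).get(family, {}).get(qualifier)

    def scan(self):
        """HBase stores rows sorted by row key; scans return them in order."""
        return sorted(self.rows)

table = ToyHBaseTable(column_families=["info", "stats"])
table.put("user#002", "info", "name", "Bob")
table.put("user#001", "info", "name", "Alice")
table.put("user#001", "stats", "logins", 7)
print(table.scan())  # ['user#001', 'user#002']
```

Unlike an RDBMS schema, each row may populate a different set of qualifiers within its families, which is what makes the model suit sparse, wide datasets.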
Duration: The workshop runs over two consecutive days, with an eight-hour session each day (sixteen hours in total), divided between theory and hands-on sessions.
Certification Policy:
- Certificate of Participation for all the workshop participants.
- At the end of the workshop, a small competition will be organized among the participating students, and the winners will be awarded a 'Certificate of Excellence'.
- Certificate of Coordination for the coordinators of the campus workshops.
Eligibility: There are no prerequisites. Anyone interested can join this workshop.