Processing large amounts of data at scale has always been a demanding task for any system, and this is where Hadoop engineers are required. Hadoop is a complete ecosystem of open-source projects that provides a framework for dealing with big data. It is designed to scale from a single server to thousands of machines, each offering local computation and storage. The main challenges it faces are the large investment and the considerable time required.

To pursue a career in big data management and learn Hadoop technology, knowledge of core Java and SQL is beneficial but not mandatory.


Module 1 - Introduction to Big Data and Hadoop

  • What is Hadoop?
  • What is Big data?
  • What comes under big data?
  • What are the challenges for processing big data?
  • Traditional Approach and its Limitation
  • What are the benefits of big data?
  • What technologies support big data?
  • Solution of big data
  • Hadoop Architecture
  • Advantages of Hadoop

Module 2 - HDFS

  • What is HDFS?
  • Where to use HDFS
  • Where not to use HDFS
  • Features of HDFS
  • HDFS Architecture
  • HDFS Concepts
  • HDFS Architecture 1.x and its daemons
  • HDFS Architecture 2.x and its daemons
  • HDFS Federation
  • HDFS High availability
  • Scheduler
  • Rack Awareness
  • HDFS commands

Module 3 - Understanding MapReduce Basics

  • What is MapReduce?
  • Why MapReduce?
  • How MapReduce works
  • Partitioners
  • Combiners
  • Hadoop Streaming
  • Failures in MapReduce
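
As a rough illustration of how the map, combiner, partitioner and reduce steps listed above fit together, here is a minimal word-count sketch in plain Python. It only simulates the data flow in a single process and does not use the Hadoop API; the function names (map_phase, combine, partition, reduce_phase, run_job) are hypothetical and chosen for this example only.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def combine(pairs):
    """Combiner: pre-sum counts on the mapper side to reduce shuffle traffic."""
    local = defaultdict(int)
    for word, count in pairs:
        local[word] += count
    return list(local.items())


def partition(key, num_reducers):
    """Partitioner: choose a reducer for a key (simple deterministic hash)."""
    return sum(key.encode("utf-8")) % num_reducers


def reduce_phase(key, values):
    """Reducer: sum all partial counts received for one key."""
    return key, sum(values)


def run_job(lines, num_reducers=2):
    """Simulate one job; each input line is treated as its own map task."""
    # One bucket per reducer, mapping key -> list of partial counts.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for line in lines:
        for word, count in combine(map_phase(line)):
            buckets[partition(word, num_reducers)][word].append(count)

    result = {}
    for bucket in buckets:
        for key in sorted(bucket):  # reducers see their keys in sorted order
            word, total = reduce_phase(key, bucket[key])
            result[word] = total
    return result


if __name__ == "__main__":
    print(run_job(["big data big cluster", "data node data"]))
```

In a real cluster the framework performs the shuffle and sort between the map and reduce steps, and map tasks run on the nodes that hold the input blocks; the loop in run_job stands in for all of that.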

Module 4 - YARN: Yet Another Resource Negotiator

  • Introduction to YARN
  • Limitation of Current Architecture
  • Apache Hadoop YARN – Concepts & Applications
  • Job Submission and Job Initialization
  • Failure Handling in YARN
  • Task Failure
  • Application Master Failure
  • Node Manager Failure
  • Resource Manager Failure

Module 5 - HIVE

  • What is Hive?
  • Features of Hive
  • Architecture of Hive
  • Working of Hive
  • What is Schema on Write?
  • What is Schema on Read?
  • Advantages of Schema on Write
  • Advantages of Schema on Read
  • Disadvantages of Schema on Write
  • Disadvantages of Schema on Read
  • Hive datatypes
  • Hive Statements

Module 6 - PIG

  • What is Pig?
  • Why Do We Need Apache Pig?
  • Features of Pig
  • Pig Architecture
  • Apache Pig Vs MapReduce
  • Apache Pig Vs SQL
  • Apache Pig Vs Hive
  • Pig run modes
  • Pig Latin concepts
  • Pig Data Types
  • Pig operators

Module 7 - SQOOP

  • Introduction
  • Working of Sqoop
  • Sqoop Commands

Module 8 - FLUME

  • Apache Flume – Introduction
  • Applications of Flume
  • Advantages of Flume
  • Features of Flume
  • Apache Flume - Data Transfer In Hadoop
  • Apache Flume – Architecture
  • Apache Flume - Data Flow
  • Apache Flume - Sequence Generator Source

Copyright TIMTS, © 2017. TIMTS declares that all images, photographs, logos, trademarks, brand names other than TIMTS, research facts and copyrighted content of any other brand are the property of their respective owners (or brands). They have been used for illustrative purposes only.