HADOOP

HADOOP: IT Training: Scope and Overview



Hadoop Online Training

About Hadoop Online Training Course

We provide Hadoop online training with live, real-time examples and in-depth explanations. This teaching approach helps students and working professionals grasp the main Hadoop concepts easily.

Course Duration

  • 40 hours, 1 hour daily
  • Course fee: regular fee Rs. 23,000/-; offer fee Rs. 15,000/- only

Hadoop Online Training Content

Introduction To Hadoop

  • What is enterprise Big Data?
  • What is Hadoop?
  • History of Hadoop
  • Hadoop Eco-System
  • Hadoop Framework
  • Hadoop vs RDBMS
  • Hadoop vs SAP Hana vs Teradata
  • How ETL tools work in Hadoop
  • Hadoop Requirements and supported versions
  • Case studies: Hadoop and Hive at Yahoo, Facebook, etc.

Hadoop Distributed File System

  • Installation of Ubuntu 13.04 *
  • Basic Unix Commands *
  • Hadoop Commands
  • HDFS and JobTracker access URLs and ports
  • HDFS design
  • Hadoop file systems
  • Master and Slave node architecture
  • Filesystem API – Java
  • Serialization in Hadoop – Reading and writing data from/to Hadoop URL

Administering Hadoop

  • Cluster specification
  • Hadoop cluster setup and installation
  • Standalone
  • Pseudo-distributed mode
  • Fully distributed mode
  • fs, fsck, distcp, archive, etc.
  • dfsadmin, balancer, jobtracker, tasktracker, namenode, etc.
  • Step-by-step multi-node installation
  • Hadoop Configuration
  • Namenode and datanode directory structure
  • User commands
  • Administration commands
  • Monitoring
  • Benchmarking a Hadoop cluster

MapReduce

  • MapReduce overview and architecture
  • Developing MapReduce jobs
  • MapReduce data types
  • Custom DataTypes/Writables
  • Input File Formats
  • Text Input File Format
  • Zip File Input Format
  • LZO Compression & LZO Input Format
  • XML Input Format
  • JSON Input Format
  • Packaging, Launching, Debugging jobs
  • Hash Partitioner
  • Custom Partitioner
  • Capacity Scheduler
  • Fair Scheduler
  • Output Formats
  • Job Configuration
  • Job Submission
  • MapReduce workflows
  • Practicing MapReduce programs
  • Combiner
  • Partitioner
  • Search
  • Sorting
  • Secondary Sorting
  • Distributed Cache
  • Chain Mapping/Reducing
  • Scheduling
  • One Example for Each Concept*
  • Practical examples executed locally, on HDFS, and using Eclipse plugins*
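
As a preview of the combiner, partitioner, and sorting topics in this module, here is a minimal single-process sketch in plain Python. It is not Hadoop code and uses no Hadoop API: the function names (`string_hash`, `partition`, `map_words`, `combine`, `run_word_count`) are hypothetical, the data is made up for illustration, and the hash is a Java `String.hashCode`-style hash (for ASCII keys) chosen only to mimic the "hash modulo number of reducers" idea behind a hash partitioner.

```python
from collections import defaultdict


def string_hash(key):
    """Java String.hashCode-style hash for ASCII keys, kept to 32 bits."""
    h = 0
    for ch in key:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h


def partition(key, num_reducers):
    """Hash-partitioner idea: clear the sign bit, then take modulo."""
    return (string_hash(key) & 0x7FFFFFFF) % num_reducers


def map_words(line):
    """Map step: emit (word, 1) for every word in the line."""
    for word in line.lower().split():
        yield word, 1


def combine(pairs):
    """Combiner: pre-aggregate counts within one input split."""
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return sorted(local.items())


def run_word_count(splits, num_reducers=2):
    """Simulate map -> combine -> partition/shuffle -> sorted reduce."""
    # One bucket per reducer; each maps a key to its list of partial counts.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:
        pairs = [pair for line in split for pair in map_words(line)]
        for key, value in combine(pairs):
            buckets[partition(key, num_reducers)][key].append(value)

    # Reduce step: each reducer sees its keys in sorted order and sums them.
    results = {}
    for bucket in buckets:
        for key in sorted(bucket):
            results[key] = sum(bucket[key])
    return results


if __name__ == "__main__":
    # Two hypothetical input splits, each a list of lines.
    print(run_word_count([["big data big"], ["data hadoop"]]))
```

In a real cluster each split and each reducer would run as a separate task on different nodes; the sketch only shows how the same key always lands on the same reducer and how a combiner shrinks the data before the shuffle.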

HIVE

  • Hive concepts
  • Hive installation
  • Hive configuration, Hive services and metastore
  • Hive datatypes – primitive and complex types
  • Hive operators
  • Hive built-in functions
  • Hive Tables
  • Creating tables
  • External Table
  • Internal Table
  • Partitions and buckets
  • Browsing tables and partitions
  • Storage formats
  • Loading data
  • Joins
  • Aggregations and sorting
  • Insert into local files
  • Altering, dropping tables
  • Importing data

PIG

  • Why pig
  • Pig and Pig Latin
  • Pig installation
  • Pig Latin commands
  • Pig Latin relational operators
  • Pig Latin diagnostic operators
  • Data types and Expressions
  • Built-in functions
  • Data processing in Pig
  • Load and store
  • Filtering the data
  • Grouping the data
  • Joining the data
  • Sorting the data

Sqoop

  • Sqoop installation
  • Sqoop commands
  • Sqoop connectors
  • Importing data from MySQL
  • Exporting the data
  • Creating Hive tables by importing data

HBase

  • HBase Introduction
  • HBase Installation
  • HBase Architecture
  • ZooKeeper
  • Keys & Column families
  • Integration with MapReduce
  • Integration with Hive

Other Miscellaneous Topics

  • Hue
  • Impala
  • Hadoop Streaming
  • Storm – Real Time Hadoop
  • Eclipse Plugins
  • Cloudera Hadoop Installation
  • Cloudera Administration
  • Hiho ecosystem
  • Flume ecosystem
  • Reporting Tools Introduction

New Updated Modules

  • Hadoop with Spark
  • Kudu
  • Impala
