Introduction
HDFS stores each file as a series of large, equal-sized blocks (for example, 64 MB or 128 MB), and the MapReduce framework accesses and processes these files in a distributed environment.
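To make the block arithmetic concrete, here is a minimal sketch in plain Python. It does not call any Hadoop API; the function name `split_into_blocks` and the 300 MB file size are illustrative assumptions, and the sketch assumes the last block only holds whatever data is left over.

```python
MB = 1024 * 1024


def split_into_blocks(file_size_bytes, block_size_bytes):
    """Return the sizes of the blocks a file of this size would occupy.

    Hypothetical helper for illustration only, not part of Hadoop.
    """
    if file_size_bytes < 0 or block_size_bytes <= 0:
        raise ValueError("file size must be >= 0 and block size must be > 0")
    full_blocks, remainder = divmod(file_size_bytes, block_size_bytes)
    sizes = [block_size_bytes] * full_blocks
    if remainder:
        # The final block holds only the leftover bytes.
        sizes.append(remainder)
    return sizes


# A 300 MB file with a 128 MB block size needs three blocks.
blocks = split_into_blocks(300 * MB, 128 * MB)
print(len(blocks), [size // MB for size in blocks])  # 3 [128, 128, 44]
```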
The MapReduce framework works on key-value pairs and has two key parts: the Mapper and the Reducer. The framework reads the input file, splits it, and passes each split to a Mapper. The Mapper turns its input into key-value pairs and hands them to the intermediate stage, where they are sorted and shuffled. The Reducer receives each key together with the list of its values, processes them, and writes the result to disk.
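The flow described above can be sketched as a small in-memory word count. This is a simulation in plain Python, not the Hadoop API: the names `mapper`, `shuffle`, `reducer`, and `run_job` are assumptions made for illustration, and a real job would run these phases on separate nodes.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Intermediate phase: group values by key and sort the keys."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reducer(key, values):
    """Reduce phase: collapse the list of values for one key into a total."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    mapped = [pair for line in lines for pair in mapper(line)]
    return [reducer(key, values) for key, values in shuffle(mapped)]


print(run_job(["hadoop stores blocks", "hadoop runs jobs"]))
# [('blocks', 1), ('hadoop', 2), ('jobs', 1), ('runs', 1), ('stores', 1)]
```

Keeping the three phases as separate functions mirrors the description: the Mapper only sees one record at a time, and the Reducer only sees one key with all of its values.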
Apache Hadoop Setup
Hadoop 2.x is based on the YARN architecture, which uses a ResourceManager and per-application ApplicationMasters. The ResourceManager manages resources across the cluster, and each ApplicationMaster manages the life cycle of its job.