HDFS stores each file as a sequence of large, fixed-size blocks (for example 64 MB or 128 MB; only the last block of a file may be smaller), and the MapReduce framework accesses and processes these files in a distributed environment.
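To make the block arithmetic concrete, here is a minimal sketch in plain Python. It does not talk to HDFS; `split_into_blocks` is a hypothetical helper, and the 128 MB default is only an assumed block size chosen for illustration.

```python
MB = 1024 * 1024


def split_into_blocks(file_size: int, block_size: int = 128 * MB) -> list[int]:
    """Return the size in bytes of each block a file would occupy.

    Hypothetical helper for illustration only; block_size is an assumption.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    # Every block is full-sized except possibly the last one.
    return [block_size] * full_blocks + ([remainder] if remainder else [])


blocks = split_into_blocks(300 * MB)
print(len(blocks), [b // MB for b in blocks])  # prints: 3 [128, 128, 44]
```

Under these assumptions a 300 MB file occupies two full 128 MB blocks plus one 44 MB block.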
The MapReduce framework works on key-value pairs and has two key parts: the Mapper and the Reducer. The framework reads the input file, divides it into splits, and passes each split to a Mapper. The Mapper reads its input as key-value pairs and emits intermediate key-value pairs, which are then shuffled and sorted by key. The Reducer receives each key together with the list of its values, processes them, and writes the result to disk.
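The map, shuffle-and-sort, and reduce flow can be sketched as a small in-memory word count. This is plain Python that imitates the data flow on a single machine and is not the Hadoop API: the function names `mapper`, `shuffle_and_sort`, `reducer`, and `run_job` are hypothetical, and the line index stands in for the input key that a real job would receive.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def mapper(key: int, line: str) -> Iterator[tuple[str, int]]:
    """Map step: emit an intermediate (word, 1) pair for every word."""
    for word in line.lower().split():
        yield word, 1


def shuffle_and_sort(pairs: Iterable[tuple[str, int]]) -> list[tuple[str, list[int]]]:
    """Intermediate step: group values by key, then sort by key."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reducer(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce step: collapse the list of values for one key into a total."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> list[tuple[str, int]]:
    """Run all three steps in memory (assumption: no cluster, no disk I/O)."""
    intermediate: list[tuple[str, int]] = []
    for index, line in enumerate(lines):
        intermediate.extend(mapper(index, line))
    return [reducer(key, values) for key, values in shuffle_and_sort(intermediate)]


if __name__ == "__main__":
    for word, count in run_job(["the quick brown fox", "the lazy dog", "the fox"]):
        print(f"{word}\t{count}")
```

In a real cluster the mappers and reducers run on different nodes and the grouped pairs travel over the network between them; the sketch keeps everything in one process so the order of the steps is easy to follow.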