Alan Kaminsky, Department of Computer Science, Rochester Institute of Technology
Distributed Systems 4005-730-01, Spring Quarter 2013

4005-730 Distributed Systems
Lecture Notes -- Module 8. Map-Reduce Systems

Prof. Alan Kaminsky
Rochester Institute of Technology -- Department of Computer Science



http://www.dilbert.com


The Map-Reduce Paradigm

               YYYY                                                                    TTTTTQ
0029029070999991901010106004+64333+023450FM-12+000599999V0202701N015919999999N0000001N9-00781+99999102001ADDGF108991999999999999999999
0029029070999991901010113004+64333+023450FM-12+000599999V0202901N008219999999N0000001N9-00721+99999102001ADDGF104991999999999999999999
0029029070999991901010120004+64333+023450FM-12+000599999V0209991C000019999999N0000001N9-00941+99999102001ADDGF108991999999999999999999
0029029070999991901010206004+64333+023450FM-12+000599999V0201801N008219999999N0000001N9-00611+99999101831ADDGF108991999999999999999999
0029029070999991901010213004+64333+023450FM-12+000599999V0201801N009819999999N0000001N9-00561+99999101761ADDGF108991999999999999999999
0029029070999991901010220004+64333+023450FM-12+000599999V0201801N009819999999N0000001N9-00281+99999101751ADDGF108991999999999999999999
0029029070999991901010306004+64333+023450FM-12+000599999V0202001N009819999999N0000001N9-00671+99999101701ADDGF106991999999999999999999
0029029070999991901010313004+64333+023450FM-12+000599999V0202301N011819999999N0000001N9-00331+99999101741ADDGF108991999999999999999999
0029029070999991901010320004+64333+023450FM-12+000599999V0202301N011819999999N0000001N9-00281+99999101741ADDGF108991999999999999999999
0029029070999991901010406004+64333+023450FM-12+000599999V0209991C000019999999N0000001N9-00331+99999102311ADDGF108991999999999999999999
Input data


Figure 2-1. MapReduce logical data flow
White, op. cit.
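
The logical data flow (map, then group by key, then reduce) can be sketched in memory for the input data shown above. This is only a minimal single-process illustration, not Hadoop code. It assumes the year occupies the columns marked YYYY (offsets 15 to 18), the signed temperature in tenths of a degree occupies the columns marked TTTTT (offsets 87 to 91), and the quality code is the column marked Q (offset 92). The missing-value sentinel 9999, the set of acceptable quality codes, and all function names are assumptions made for the example.

```python
from collections import defaultdict

MISSING = 9999               # assumed sentinel meaning "no temperature reading"
GOOD_QUALITY = set("01459")  # assumed set of acceptable quality codes


def map_record(line):
    """Map step: emit (year, temperature) for one record, or nothing."""
    year = line[15:19]       # columns marked YYYY
    temp = int(line[87:92])  # columns marked TTTTT, e.g. "-0078" -> -78
    quality = line[92]       # column marked Q
    if temp != MISSING and quality in GOOD_QUALITY:
        yield year, temp


def shuffle(pairs):
    """Shuffle step: group all values by key, with keys in sorted order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_max(year, temps):
    """Reduce step: one output pair per key, here the maximum temperature."""
    return year, max(temps)


def max_temperature(lines):
    """Run map, shuffle, and reduce over an iterable of record lines."""
    pairs = [kv for line in lines for kv in map_record(line)]
    return [reduce_max(year, temps) for year, temps in shuffle(pairs).items()]
```

Under these assumptions, calling max_temperature on the ten records above yields a single pair for the year 1901. In a real map-reduce system the map calls run in parallel on different machines and the shuffle moves data across the network, but the key-value contract between the steps is the same.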


Figure 2-2. MapReduce data flow with a single reduce task
White, op. cit.


Figure 2-3. MapReduce data flow with multiple reduce tasks
White, op. cit.
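
With multiple reduce tasks, every map output key must be routed to exactly one reduce task so that all values for that key meet in the same place. A common scheme, and the idea behind Hadoop's default hash partitioner, is to hash the key modulo the number of reduce tasks. The sketch below illustrates that idea only. The function names are hypothetical, and zlib.crc32 is a stand-in hash chosen because it is deterministic across Python runs, not the hash Hadoop uses.

```python
import zlib
from collections import defaultdict


def partition(key, num_reducers):
    """Pick the reduce task for a key: stable hash modulo the task count."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def partition_map_output(pairs, num_reducers):
    """Split (key, value) pairs into one key -> values group per reduce task."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        partitions[partition(key, num_reducers)][key].append(value)
    return partitions
```

Because the partition depends only on the key, two map tasks on different machines send the same key to the same reduce task without coordinating with each other.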


Figure 2-4. MapReduce data flow with no reduce tasks
White, op. cit.


Hadoop


Hadoop Architecture


Figure 6-1. How Hadoop runs a MapReduce Job
White, op. cit.


Hadoop Distributed File System (HDFS)


Figure 3-1. A client reading data from HDFS
White, op. cit.


Figure 3-3. A client writing data to HDFS
White, op. cit.


What Map-Reduce Is Good For

Copyright © 2013 Alan Kaminsky. All rights reserved. Last updated 26-Apr-2013. Please send comments to ark@cs.rit.edu.