There is a huge difference between MapReduce1 and MapReduce2 (YARN). MapReduce1 uses a JobTracker and TaskTrackers to manage the process, while YARN uses a ResourceManager, per-application ApplicationMasters, and so on to improve performance.
Since MapReduce1 has already played its part in Hadoop's history, we will study how MapReduce works from a historical perspective.
I downloaded an older Hadoop release, 1.2.1. You can also browse the source code online in the Apache Subversion repository.
Once you have set all of this up and navigated to the directory mapred.org.apache.hadoop.mapred, you will be amazed at how many classes it contains: about 200. From this we can see that this is the central place where all the MapReduce magic happens. Now let’s start this journey with the JobTracker, which is the “central location for submitting and tracking MR jobs in a network environment”. It has about 5000 lines of code. Since this is not a tutorial but just some random study notes by a Java layman, I will first post the basic structure of this class, like a book's table of contents, based on the author’s comments and with each entry marked by its line number.
JobTracker:
- 0 Preamble
- 1500 Real JobTracker
  - properties
  - constructors
- 2300 Lookup tables for JobInProgress and TaskInProgress
  - create
  - remove
  - mark
- 2515 Accessors
  - jobs
  - tasks
- 2909 InterTrackerProtocol
  - heartbeat
  - update..
- 3534 JobSubmissionProtocol
  - getjobid
  - submitjob
  - killjob
  - setjobpriority
  - getjobprofile
  - status
- 4354 JobTracker methods
- 4408 Methods to track TaskTrackers (TT)
  - update
  - lost
  - refresh
- 4697 Main (debug)
- 4876 Check whether the job has invalid requirements
- 4995 MXBean implementation
  - blacklist
  - graylist
- 5110 JobTracker SafeMode
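
To keep the two protocol sections of this outline straight in my head, here is a tiny toy model of the idea: clients talk to the JobTracker to submit, kill, and query jobs, while TaskTrackers report in through heartbeats and get tasks handed back. This is only a sketch under my own assumptions. The class `ToyJobTracker`, its method signatures, the `Task` type, and the expiry check are all made up for illustration and are far simpler than what the real Hadoop 1.2.1 JobTracker does.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Toy model only: names echo the outline above, NOT Hadoop's real signatures.
class ToyJobTracker {
    enum JobState { PREP, RUNNING, KILLED }

    // Hypothetical stand-in for a unit of work belonging to a job.
    static final class Task {
        final int jobId;
        final int index;
        Task(int jobId, int index) { this.jobId = jobId; this.index = index; }
        @Override public String toString() { return "job" + jobId + "_task" + index; }
    }

    private int nextJobId = 1;
    private final Map<Integer, JobState> jobs = new HashMap<>();
    private final Queue<Task> pending = new ArrayDeque<>();
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    // ---- job submission side (client -> JobTracker) ----
    int getNewJobId() { return nextJobId++; }

    void submitJob(int jobId, int numTasks) {
        jobs.put(jobId, JobState.PREP);
        for (int i = 0; i < numTasks; i++) {
            pending.add(new Task(jobId, i));
        }
    }

    void killJob(int jobId) {
        if (jobs.containsKey(jobId)) {
            jobs.put(jobId, JobState.KILLED);
            pending.removeIf(t -> t.jobId == jobId); // drop tasks not yet handed out
        }
    }

    JobState getJobStatus(int jobId) { return jobs.get(jobId); }

    // ---- inter-tracker side (TaskTracker -> JobTracker) ----
    // A heartbeat records that the tracker is alive and returns tasks to launch,
    // at most one per free slot.
    List<Task> heartbeat(String trackerName, int freeSlots, long nowMillis) {
        lastHeartbeat.put(trackerName, nowMillis);
        List<Task> toLaunch = new ArrayList<>();
        while (toLaunch.size() < freeSlots && !pending.isEmpty()) {
            Task t = pending.poll();
            jobs.put(t.jobId, JobState.RUNNING);
            toLaunch.add(t);
        }
        return toLaunch;
    }

    // Trackers whose last heartbeat is older than the expiry window count as lost.
    List<String> lostTrackers(long nowMillis, long expiryMillis) {
        List<String> lost = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastHeartbeat.entrySet()) {
            if (nowMillis - e.getValue() > expiryMillis) {
                lost.add(e.getKey());
            }
        }
        return lost;
    }
}

class ToyJobTrackerDemo {
    public static void main(String[] args) {
        ToyJobTracker jt = new ToyJobTracker();
        int id = jt.getNewJobId();
        jt.submitJob(id, 3);
        System.out.println(jt.getJobStatus(id));               // PREP
        System.out.println(jt.heartbeat("tt1", 2, 0L));        // [job1_task0, job1_task1]
        System.out.println(jt.getJobStatus(id));               // RUNNING
        System.out.println(jt.lostTrackers(20_000L, 10_000L)); // [tt1]
    }
}
```

The point of the toy is the direction of the calls: the tracker never pushes work out, it only answers heartbeats, which is roughly why the heartbeat section sits under InterTrackerProtocol in the outline.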