PrepAway - Latest Free Exam Questions & Answers

Where are Hadoop’s task log files stored?

For each job, the Hadoop framework generates task log files. Where are Hadoop’s task log files
stored?

A.
Cached on the local disk of the slave node running the task, then purged immediately upon task
completion.

B.
Cached on the local disk of the slave node running the task, then copied into HDFS.

C.
In HDFS, in the directory of the user who generates the job.

D.
On the local disk of the slave node running the task.

Explanation:
Job Statistics
These logs are created by the jobtracker. The jobtracker writes runtime statistics from jobs to these files.
Those statistics include task attempts, time spent shuffling, input splits given to task attempts, start
times of task attempts and other information.
The statistics files are named:
<hostname>_<epoch-of-jobtracker-start>_<job-id>_<job-name>
where <hostname> is the hostname of the machine creating these logs, <epoch-of-jobtracker-start> is the number of milliseconds that had elapsed since Unix Epoch when the jobtracker
daemon was started, <job-id> is the job ID, and <job-name> is the name of the job.
For example:

ec2-72-44-61-184.compute-1.amazonaws.com_1250641772616_job_200908190029_0002_hadoop_test-mini-mr
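
The naming scheme above can be taken apart programmatically. The following is a minimal sketch, not part of Hadoop: the function name `parse_stats_filename` and the `JobStatsName` type are hypothetical, and it assumes that the hostname contains no underscores and that the job ID always has the form `job_<digits>_<digits>`. Everything after the job ID is treated as the job name, as the format description states.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple


class JobStatsName(NamedTuple):
    """Fields encoded in a job statistics file name (hypothetical helper type)."""
    hostname: str
    jobtracker_start: datetime  # UTC time at which the jobtracker daemon started
    job_id: str
    job_name: str


# <hostname>_<epoch-of-jobtracker-start>_<job-id>_<job-name>
# Assumption: hostnames contain no underscores, job IDs look like job_<digits>_<digits>.
_STATS_NAME = re.compile(
    r"^(?P<hostname>[^_]+)_(?P<epoch_ms>\d+)_(?P<job_id>job_\d+_\d+)_(?P<job_name>.+)$"
)


def parse_stats_filename(name: str) -> JobStatsName:
    """Split a statistics file name into its four documented parts."""
    match = _STATS_NAME.match(name)
    if match is None:
        raise ValueError(f"not a job statistics file name: {name!r}")
    # The epoch field is in milliseconds, so divide by 1000 to get seconds.
    started = datetime.fromtimestamp(int(match["epoch_ms"]) / 1000, tz=timezone.utc)
    return JobStatsName(match["hostname"], started, match["job_id"], match["job_name"])


info = parse_stats_filename(
    "ec2-72-44-61-184.compute-1.amazonaws.com_1250641772616_"
    "job_200908190029_0002_hadoop_test-mini-mr"
)
print(info.hostname)  # ec2-72-44-61-184.compute-1.amazonaws.com
print(info.job_id)    # job_200908190029_0002
print(info.jobtracker_start.isoformat())
```
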
These logs are not rotated. You can clear these logs periodically without affecting Hadoop.
However, consider archiving the logs if they are of interest in the job development process. Make
sure you do not move or delete a file that is being written to by a running job.
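
One way to archive while avoiding files that are still in use could look like the sketch below. It is an illustration under stated assumptions, not a Hadoop tool: the function name `archive_idle_logs` and its parameters are hypothetical, and "not modified for a while" is only a heuristic for "no running job is writing to this file". Pick an idle threshold longer than your longest-running job, or verify against the list of running jobs first.

```python
import shutil
import time
from pathlib import Path


def archive_idle_logs(log_dir, archive_dir, min_idle_seconds=24 * 3600, now=None):
    """Move files from log_dir to archive_dir if they have not been modified recently.

    Assumption: a file untouched for min_idle_seconds is no longer being
    written by a running job. This is a heuristic, not a guarantee.
    """
    log_dir = Path(log_dir)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    now = time.time() if now is None else now

    moved = []
    for path in sorted(log_dir.iterdir()):
        if not path.is_file():
            continue  # skip subdirectories, including a nested archive directory
        idle_seconds = now - path.stat().st_mtime
        if idle_seconds < min_idle_seconds:
            continue  # recently modified, possibly still being written
        target = archive_dir / path.name
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved
```

Calling `archive_idle_logs(stats_dir, archive_dir)` with two directory paths of your choosing moves every statistics file that has been idle for at least a day and returns the new paths.
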
Individual statistics logs are created for each job that is submitted to the cluster. The size of each
log file varies. Jobs with more tasks produce larger files.
Reference:
Apache Hadoop Log Files: Where to find them in CDH, and what info they contain

2 Comments on “Where are Hadoop’s task log files stored?”

  1. Dev says:

    Actually, logging in Hadoop 2.x is different than what's explained in “Apache Hadoop Log Files: Where to find them in CDH, and what info they contain”: logs for tasks (containers) are in the userlog directory, and once the job is completed, if aggregation is enabled, the logs will go to HDFS.
