PrepAway - Latest Free Exam Questions & Answers

How many Mappers will run?

On a cluster running MapReduce v2 (MRv2) on YARN, a MapReduce job is given a directory of 10
plain text files as its input directory. Each file is made up of 3 HDFS blocks. How many Mappers
will run?

A.
We cannot say; the number of Mappers is determined by the ResourceManager

B.
We cannot say; the number of Mappers is determined by the developer

C.
30

D.
3

E.
10

F.
We cannot say; the number of mappers is determined by the ApplicationMaster

17 Comments on “How many Mappers will run?”

  1. Tarun says:

    The number of mappers is based on the number of input splits (which is decided by the InputFormat), so it depends on the developer. I think B is the right one.




  2. Gaurav says:

    The number of mappers depends on the input split size, which by default equals the block size, so here it will be 30 mappers. However, the developer has the option to override this parameter and set the input split size, which can change the number of mappers for the job.
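The default case described in this comment can be sketched as simple arithmetic, assuming one mapper per HDFS block (the file and block counts below come from the question, not from a real cluster):

```python
# Minimal sketch: with the default split size equal to the block size,
# each HDFS block gets its own mapper.
files = 10
blocks_per_file = 3  # each input file occupies 3 HDFS blocks

# Total mappers = total number of blocks across all input files.
mappers = files * blocks_per_file
print(mappers)  # 30
```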




  3. Ddhanaji says:

    I think the context here is that 1 file taking 3 blocks means a replication factor of 3 (the default). Hence one mapper per file, so it’s 10.




  4. frank lin says:

    Should be E. People who choose C forgot that each file actually contains only 1 block; the other 2 are replicas.




  5. thom says:

    E. Since there’s no information about the split size, it could be larger than the block size.

    The number of mappers depends on the number of splits; however, if the files are smaller than the split size, each file will correspond to one mapper. That is the reason a large number of small files is not recommended.

    The properties that determine the split size, and their default values, are as follows:

    mapred.min.split.size=1 (in bytes)
    mapred.max.split.size=Long.MAX_VALUE
    dfs.block.size=64 MB

    The split size is calculated as:

    inputSplitSize=max(minimumSize, min(maximumSize, blockSize))

    # of mappers= totalInputSize/inputSplitSize
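The split-size formula quoted in the comment above can be sketched as follows (the default values are the old mapred.* defaults listed there; they are assumed, not read from a cluster):

```python
# Sketch of Hadoop's FileInputFormat split-size math, simplified.
LONG_MAX = 2**63 - 1  # Java's Long.MAX_VALUE

def input_split_size(min_size=1, max_size=LONG_MAX, block_size=64 * 1024 * 1024):
    # inputSplitSize = max(minimumSize, min(maximumSize, blockSize))
    return max(min_size, min(max_size, block_size))

def num_mappers(total_input_size, split_size):
    # Number of mappers = totalInputSize / inputSplitSize, rounded up.
    return -(-total_input_size // split_size)

# With all defaults, the split size collapses to the block size (64 MB),
# so 10 files of 3 full blocks each yield 30 splits, i.e. 30 mappers.
block = 64 * 1024 * 1024
split = input_split_size()
print(split)                           # 67108864
print(num_mappers(10 * 3 * block, split))  # 30
```

Raising mapred.min.split.size above the block size would shrink the number of splits, which is how a developer can change the mapper count without touching the input data.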




  6. dip says:

    C.
    If you have not defined an input split size in the MapReduce program, the default HDFS block size is used as the input split. There is no mention of replicas.




