PrepAway - Latest Free Exam Questions & Answers

Category: CCA-500 (v.1)

Exam CCA-500: Cloudera Certified Administrator for Apache Hadoop (CCAH) (update February 13th, 2016)

Which is the most efficient process to gather these web server access logs into your Hadoop cluster for analysis?

You want to understand more about how users browse your public website. For example,
you want to know which pages they visit prior to placing an order. You have a server farm of
200 web servers hosting your website. Which is the most efficient process to gather these
web server access logs into your Hadoop cluster for analysis?

Which data serialization system gives the flexibility to do this?

You need to analyze 60,000,000 images stored in JPEG format, each of which is
approximately 25 KB. Because your Hadoop cluster isn’t optimized for storing and
processing many small files, you decide to take the following actions:
1. Group the individual images into a set of larger files.
2. Use the set of larger files as input for a MapReduce job that processes them
directly with Python using Hadoop Streaming.
Which data serialization system gives the flexibility to do this?
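
For a sense of scale, the numbers in the scenario above can be worked through in a few lines of Python. This is only a rough sketch: the 128 MB target size for each larger file is an assumption (it is not stated in the question), the helper names are hypothetical, and the calculation ignores any per-record overhead that a container format would add.

```python
# Back-of-the-envelope sizing for the image scenario.
NUM_IMAGES = 60_000_000      # from the question
IMAGE_SIZE_KB = 25           # approximate size per image, from the question
CONTAINER_SIZE_MB = 128      # ASSUMPTION: target size of each larger file


def total_size_kb(num_images, image_size_kb):
    """Total raw data volume in KB."""
    return num_images * image_size_kb


def images_per_container(container_size_mb, image_size_kb):
    """Whole images that fit in one larger file (format overhead ignored)."""
    return (container_size_mb * 1024) // image_size_kb


def containers_needed(num_images, per_container):
    """Number of larger files required, rounded up."""
    return -(-num_images // per_container)  # ceiling division


total_kb = total_size_kb(NUM_IMAGES, IMAGE_SIZE_KB)
per_container = images_per_container(CONTAINER_SIZE_MB, IMAGE_SIZE_KB)
containers = containers_needed(NUM_IMAGES, per_container)

print(f"Total data: {total_kb / 1024**3:.2f} TiB")
print(f"Images per {CONTAINER_SIZE_MB} MB file: {per_container}")
print(f"Larger files needed: {containers}")
```

Under these assumptions, grouping reduces 60,000,000 small files to roughly eleven thousand larger ones holding the same data (about 1.4 TiB), which is the motivation for the first action in the list.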

