E20-007 | EMC Exam Questions

How should you proceed?

seenagape, January 4, 2013

You are given 10,000,000 user profile pages from an online dating site as XML files stored in HDFS. You are assigned to divide the users into groups based on the content of their profiles, and you have been instructed to try K-means clustering on this data. How should you proceed?
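The workflow the question describes can be pictured as three steps: extract text from each XML profile, turn it into a numeric vector, then cluster the vectors. Below is a minimal single-machine sketch in Python under stated assumptions: the `<profile><about>` layout, the sample profiles, and the helper names are all hypothetical, since the question gives no schema. At the scale of 10,000,000 files in HDFS the same steps would need to run as distributed jobs, which this sketch does not attempt.

```python
import random
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical profile documents; the real XML schema is not given in the question.
PROFILES = [
    "<profile><about>hiking camping hiking mountains</about></profile>",
    "<profile><about>camping mountains trails hiking</about></profile>",
    "<profile><about>movies cinema popcorn movies</about></profile>",
    "<profile><about>cinema movies directors</about></profile>",
]

def profile_words(xml_text):
    """Extract lowercase words from the (assumed) <about> element."""
    root = ET.fromstring(xml_text)
    return root.findtext("about", default="").lower().split()

def vectorize(docs):
    """Turn each document into a term-count vector over a shared vocabulary."""
    counts = [Counter(profile_words(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    return [[float(c[w]) for w in vocab] for c in counts], vocab

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm: assign to the nearest centroid, then recompute means."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

vectors, vocab = vectorize(PROFILES)
labels = kmeans(vectors, k=2)
print(labels)
```

Raw term counts are the simplest possible vectorization; a weighting such as TF-IDF is a common alternative, but that choice is outside what the question specifies.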

In which lifecycle stage are initial hypotheses formed?

seenagape, January 4, 2013

In which lifecycle stage are initial hypotheses formed?

Which type of join would you use for this table?

seenagape, January 4, 2013

You have two tables of customers in your database. Customers in cust_table_1 were sent an e-mail promotion last year, and customers in cust_table_2 received a newsletter last year.
Each customer can appear only once per table. You want to create a table that includes all customers and any of the communications they received last year. Which type of join would you use for this table?
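To make the setup concrete, here is a small Python sketch of how the common join types treat two such tables. The customer IDs and the `join` helper are invented for illustration; only the table names come from the question. The point to observe is which customers survive each join: a full outer join is the one that keeps unmatched rows from both sides.

```python
# Hypothetical data: customer id -> communication received last year.
cust_table_1 = {101: "email promotion", 102: "email promotion", 103: "email promotion"}
cust_table_2 = {102: "newsletter", 103: "newsletter", 104: "newsletter"}

def join(left, right, how):
    """Join two id-keyed tables; a missing side becomes None (like SQL NULL)."""
    if how == "inner":
        keys = left.keys() & right.keys()
    elif how == "left":
        keys = left.keys()
    elif how == "right":
        keys = right.keys()
    elif how == "full":
        keys = left.keys() | right.keys()
    else:
        raise ValueError(f"unknown join type: {how}")
    return {k: (left.get(k), right.get(k)) for k in sorted(keys)}

for how in ("inner", "left", "right", "full"):
    print(how, sorted(join(cust_table_1, cust_table_2, how)))
```

Because each customer appears at most once per table, keying each table by customer ID is enough here; tables with repeated keys would need a row-by-row join instead.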

What is a way that you could try to increase the R² of the model without artificially inflating it?

seenagape, January 4, 2013

Refer to the exhibit.

You are asked to write a report on how specific variables impact your client's sales, using a data set provided to you by the client. The data includes 15 variables that the client views as directly related to sales, and you are restricted to these variables only.
After a preliminary analysis of the data, the following findings were made:
1. Multicollinearity is not an issue among the variables.
2. Only three variables (A, B, and C) have significant correlation with sales.
You build a linear regression model on the dependent variable of sales with the independent variables A, B, and C. The results of the regression are seen in the exhibit.
You cannot request additional data. What is a way that you could try to increase the R² of the model without artificially inflating it?

What is the sum of the probabilities that the model assigns to all the filers in your training set that have b

seenagape, January 4, 2013

You are building a logistic regression model to predict whether a tax filer will be audited within the next two years. Your training set population is 1000 filers. The audit rate in your training data is 4.2%. What is the sum of the probabilities that the model assigns to all the filers in your training set that have been audited?
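A property worth knowing for this scenario: when a logistic regression with an intercept is fitted by maximum likelihood, its fitted probabilities over the whole training set sum to the number of positive cases, which here is 1000 × 4.2% = 42. The Python sketch below checks this numerically on synthetic data. The single feature, the data generator, and the small Newton solver are all assumptions made for illustration and are not part of the question. The sketch also prints the sum over the audited filers alone, which is a different quantity.

```python
import math
import random

rng = random.Random(0)
n, n_audited = 1000, 42  # 4.2% audit rate, as in the question

# Synthetic, hypothetical feature: one numeric score per filer.
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Label the 42 filers with the highest noisy score as audited.
noisy = [xi + rng.gauss(0.0, 1.0) for xi in x]
cutoff = sorted(noisy, reverse=True)[n_audited - 1]
y = [1 if s >= cutoff else 0 for s in noisy]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, iterations=25):
    """Maximum-likelihood fit of intercept b0 and slope b1 by Newton's method."""
    b0 = math.log(sum(y) / (len(y) - sum(y)))  # start at the base-rate log-odds
    b1 = 0.0
    for _ in range(iterations):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]
        # Gradient of the log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for yi, pi, xi in zip(y, p, x))
        # Entries of the 2x2 information matrix.
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system in closed form.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

b0, b1 = fit_logistic(x, y)
probs = [sigmoid(b0 + b1 * xi) for xi in x]
total = sum(probs)
audited_only = sum(p for p, yi in zip(probs, y) if yi == 1)
print(f"sum over all filers:     {total:.6f}")
print(f"sum over audited filers: {audited_only:.6f}")
```

The equality for the full training set follows from the intercept's score equation, which forces the residuals `y - p` to sum to zero at the optimum. It does not depend on which features are in the model.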

