Which pattern of data will indicate that the trend in the time series data is quadratic in nature?
On analyzing your time series data you suspect that the data represented as
y_1, y_2, y_3, …, y_(n-1), y_n
may have a trend component that is quadratic in nature. Which pattern of data will indicate that
the trend in the time series data is quadratic in nature?
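A quadratic polynomial sampled at equally spaced time steps has first differences that change linearly and second differences that are constant. A minimal sketch of that check follows. The series and its coefficients (2, 3, 5) are made up for illustration, and real data would show only approximately constant second differences because of noise.

```python
def differences(series):
    """Return the successive differences y[t+1] - y[t] of a series."""
    return [b - a for a, b in zip(series, series[1:])]

# Hypothetical noise-free series with a quadratic trend: y_t = 2t^2 + 3t + 5
y = [2 * t * t + 3 * t + 5 for t in range(1, 9)]

first = differences(y)       # changes linearly with t
second = differences(first)  # constant for an exactly quadratic trend

print("first differences: ", first)
print("second differences:", second)
```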
Which analytical method is considered unsupervised?
Which analytical method is considered unsupervised?
What should you do?
You have used k-means clustering to classify the behavior of 100,000 customers for a retail store.
You decide to use household income, age, gender and yearly purchase amount as measures. You
have chosen to use 8 clusters and notice that 2 clusters only have 3 customers assigned. What
should you do?
What does R code nv <- v[v < 1000] do?
What does R code nv <- v[v < 1000] do?
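In R, the expression v < 1000 produces a logical vector, and indexing v with it keeps only the elements where the condition is TRUE. The result is assigned to nv and v itself is unchanged. A rough Python analogue, using a made-up list of values, might look like this:

```python
# Hypothetical input values, chosen only to illustrate the filter
v = [250, 1000, 999, 4200, 17]

# Keep the elements of v that are strictly less than 1000; v is not modified
nv = [x for x in v if x < 1000]

print(nv)
```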
For which class of problem is MapReduce most suitable?
For which class of problem is MapReduce most suitable?
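MapReduce fits work where each record can be processed independently and the partial results can then be combined per key. The sketch below imitates the three stages (map, shuffle, reduce) in a single process with a word count. The function names and sample records are illustrative only and are not part of any MapReduce framework's API.

```python
from collections import defaultdict

def map_phase(record):
    """Emit (key, value) pairs from one record, independently of all others."""
    return [(word.lower(), 1) for word in record.split()]

def shuffle(pairs):
    """Group the emitted values by key, as the framework would between stages."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values that share a key into one result."""
    return key, sum(values)

# Hypothetical input records
records = ["big data big clusters", "data moves to clusters"]

mapped = [pair for record in records for pair in map_phase(record)]
counts = dict(reduce_phase(k, vs) for k, vs in shuffle(mapped).items())

print(counts)
```

Because map_phase never looks at more than one record, the map calls could run on separate machines without coordination. That independence is what makes a problem a good fit.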
Which activity is performed in the Operationalize phase of the Data Analytics Lifecycle?
Which activity is performed in the Operationalize phase of the Data Analytics Lifecycle?
R factors are most closely related to which data classification level?
Since R factors are categorical variables, they are most closely related to which data classification
level?
In which phase of the analytic lifecycle would you expect to spend most of the project time?
In which phase of the analytic lifecycle would you expect to spend most of the project time?
What is the sum of the probabilities that the model assigns to all the filers in your training set that have been audited?
You are building a logistic regression model to predict whether a tax filer will be audited within the
next two years. Your training set population is 1000 filers. The audit rate in your training data is
4.2%. What is the sum of the probabilities that the model assigns to all the filers in your training set
that have been audited?
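One relevant property: when a logistic regression with an intercept is fit by unregularized maximum likelihood, the fitted probabilities over the training set sum to the number of observed positives, which here would be 0.042 × 1000 = 42. The sketch below checks this on synthetic data. The single feature x and the way audits are assigned are made up for illustration, and the fit uses a small hand-written Newton's method rather than any particular library.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, iterations=25):
    """Fit p = sigmoid(b0 + b1*x) by Newton's method (unregularized MLE)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        ps = [sigmoid(b0 + b1 * x) for x in xs]
        # Gradient of the log-likelihood with respect to b0 and b1
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum((y - p) * x for x, y, p in zip(xs, ys, ps))
        # Entries of the negative Hessian (a 2x2 symmetric matrix)
        ws = [p * (1.0 - p) for p in ps]
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = g
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Synthetic training set: 1000 filers, 42 of them audited (4.2%)
xs = [float(i % 7) for i in range(1000)]            # hypothetical feature
ys = [1 if i % 24 == 0 else 0 for i in range(1000)]  # hypothetical labels

b0, b1 = fit_logistic(xs, ys)
total = sum(sigmoid(b0 + b1 * x) for x in xs)

print("audited filers:", sum(ys))
print("sum of fitted probabilities:", round(total, 6))
```

The equality follows from the intercept's score equation, sum(y - p) = 0, so it holds regardless of which predictors are in the model. It would not hold exactly for a regularized fit.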
What is a way that you could try to increase the R² of the model without artificially inflating it?
Refer to the exhibit.
You are asked to write a report on how specific variables impact your client’s sales using a data
set provided to you by the client. The data includes 15 variables that the client views as directly
related to sales, and you are restricted to these variables only.
After a preliminary analysis of the data, the following findings were made:
1. Multicollinearity is not an issue among the variables
2. Only three variables—A, B, and C—have significant correlation with sales
You build a linear regression model on the dependent variable of sales with the independent
variables of A, B, and C. The results of the regression are seen in the exhibit.
You cannot request additional data. What is a way that you could try to increase the R² of the
model without artificially inflating it?