Note: This question is part of a series of questions that use the same scenario. For your convenience, the
scenario is repeated in each question. Each question presents a different goal and answer choices, but the text
of the scenario is exactly the same in each question in this series.
You plan to create a predictive analytics solution for credit risk assessment and fraud prediction in Azure
Machine Learning. The Machine Learning workspace for the solution will be shared with other users in your
organization. You will add assets to projects and conduct experiments in the workspace.
The experiments will be used for training models that will be published to provide scoring from web services.
The experiment for fraud prediction will use Machine Learning modules and APIs to train the models and will
predict probabilities in an Apache Hadoop ecosystem.
You need to alter the list of columns that will be used for predicting fraud for an input web service endpoint. The
columns from the original data source must be retained while running the Machine Learning experiment.
Which module should you add after the web service input module and before the prediction module?
A. Edit Metadata
B. Import Data
C. SMOTE
D. Select Columns in Dataset
SMOTE
・Can be used to increase the number of underrepresented cases in a dataset used for machine learning. SMOTE is a better way of increasing the number of rare cases than simply duplicating existing cases.
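To illustrate why this differs from duplicating rows, here is a minimal pure-Python sketch of the interpolation idea behind SMOTE. It is not the Azure Machine Learning module's implementation; the function name `smote_sketch`, the sample rows, and the parameters `k` and `seed` are all assumptions made up for this example.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between
    a minority-class point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class (fraud) rows: (amount, hour_of_day)
fraud_rows = [(900.0, 2.0), (950.0, 3.0), (1000.0, 1.0), (980.0, 4.0)]
new_rows = smote_sketch(fraud_rows, n_new=6)
```

Each synthetic row lies on a line segment between two real fraud rows, so the new cases are plausible variations rather than exact copies.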
Correct answer is A – Edit Metadata.
Even if you remove the column and connect the input data to the Select Columns in Dataset module, the column will still be in the web service input. To remove it, you need to use Select Columns in Dataset followed by Edit Metadata (even if you don't change anything there), and then connect the web service input to Edit Metadata.
Since the original columns need to be retained (presumably to be consumed by the model; otherwise the word "retained" makes no sense), Edit Metadata seems to be the right choice. The question says "You need to alter the list of columns...".
SMOTE is relevant for fraud detection (fraud cases are a very low percentage of the data), but it is only used in training, not in the web service/production.
Select Columns in Dataset is correct.
Use the Select Columns in Dataset module in Azure Machine Learning Studio to choose a subset of columns to use in downstream operations. The module does not physically remove the columns from the source dataset; instead, it creates a subset of columns, much like a database view or projection.
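The "view or projection" behaviour can be sketched in plain Python. This is only an analogy, not the module's actual code; `select_columns`, the column names, and the sample rows are hypothetical.

```python
def select_columns(rows, columns):
    """Return new rows that contain only the named columns.
    The input rows are left untouched, like a database projection."""
    return [{name: row[name] for name in columns} for row in rows]


# Hypothetical source data with all original columns
source = [
    {"customer_id": 1, "amount": 120.0, "country": "US", "is_fraud": 0},
    {"customer_id": 2, "amount": 9800.0, "country": "BR", "is_fraud": 1},
]

# Downstream steps see only the chosen subset of columns
scoring_input = select_columns(source, ["amount", "country"])
```

After the call, `scoring_input` holds only `amount` and `country`, while `source` still has all four columns.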
Edit Metadata Module Features:
Typical metadata changes might include:
・Treating Boolean or numeric columns as categorical values
・Indicating which column contains the class label, or the values you want to categorize or predict
・Marking columns as features
・Changing date/time values to a numeric value, or vice versa
・Renaming columns
Use Edit Metadata any time you need to modify the definition of a column, typically to meet requirements for a downstream module. For example, some modules can work only with specific data types, or require flags on the columns, such as IsFeature or IsCategorical.
After performing the required operation, you can reset the metadata to its original state.
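As a rough illustration of changing a column's definition without touching its values, the sketch below models column metadata as a small Python dataclass. It does not reflect how Edit Metadata is implemented; `ColumnMeta`, `edit_metadata`, and the column names are assumptions, with flag names loosely modelled on the IsFeature and IsCategorical flags mentioned above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ColumnMeta:
    """Hypothetical description of one column (the data itself is separate)."""
    name: str
    dtype: str
    is_feature: bool = True
    is_categorical: bool = False
    is_label: bool = False


def edit_metadata(schema, column, **changes):
    """Return a new schema with the given changes applied to one column."""
    return [replace(c, **changes) if c.name == column else c for c in schema]


original = [
    ColumnMeta("amount", "numeric"),
    ColumnMeta("country_code", "numeric"),
    ColumnMeta("is_fraud", "boolean"),
]

# Treat a numeric column as categorical and mark the label column
edited = edit_metadata(original, "country_code", is_categorical=True)
edited = edit_metadata(edited, "is_fraud", is_label=True, is_feature=False)
```

Because `original` is never modified, going back to it corresponds to resetting the metadata to its original state.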