Which module should you use?
Note: This question is part of a series of questions that use the same or similar answer choices. An answer
choice may be correct for more than one question in the series. Each question is independent of the other
questions in this series. Information and details provided in a question apply only to that question.
You have a dataset that contains a column named Column1. Column1 is empty.
You need to omit Column1 from the dataset. The solution must use a native module.
Which module should you use?
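As an illustration of the operation this question describes (not of any particular module), a minimal Python sketch that drops a column whose values are all empty might look like the following. The sample rows and the helper names is_empty and drop_empty_columns are hypothetical, and None or a blank string is assumed to mean "empty".

```python
def is_empty(value):
    """Treat None and blank strings as empty (an assumption for this sketch)."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def drop_empty_columns(rows):
    """Return copies of the rows without columns that are empty in every row."""
    if not rows:
        return []
    columns = list(rows[0].keys())
    keep = [c for c in columns if not all(is_empty(r.get(c)) for r in rows)]
    return [{c: r.get(c) for c in keep} for r in rows]


# Hypothetical sample data: Column1 is empty in every row.
rows = [
    {"Column1": None, "Column2": 10, "Column3": "a"},
    {"Column1": "", "Column2": 20, "Column3": "b"},
]
cleaned = drop_empty_columns(rows)
print(cleaned)
```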
Which module should you use?
Note: This question is part of a series of questions that use the same or similar answer choices. An answer
choice may be correct for more than one question in the series. Each question is independent of the other
questions in this series. Information and details provided in a question apply only to that question.
You have a non-tabular file that is saved in Azure Blob storage.
You need to download the file locally, access the data in the file, and then format the data as a dataset.
Which module should you use?
Which module should you use?
Note: This question is part of a series of questions that use the same or similar answer choices. An answer
choice may be correct for more than one question in the series. Each question is independent of the other
questions in this series. Information and details provided in a question apply only to that question.
You need to remove rows that have an empty value in a specific column. The solution must use a native
module.
Which module should you use?
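The row-level operation in this question can be sketched in plain Python as well. This is only an illustration of the filtering logic, under the assumption that None or a blank string counts as an empty value; the helper name drop_rows_missing and the sample rows are hypothetical.

```python
def drop_rows_missing(rows, column):
    """Keep only rows whose value in `column` is neither None nor blank."""
    def present(value):
        return value is not None and str(value).strip() != ""
    return [r for r in rows if present(r.get(column))]


# Hypothetical sample data: rows 2 and 3 have an empty "score".
rows = [
    {"id": 1, "score": 0.5},
    {"id": 2, "score": None},
    {"id": 3, "score": ""},
    {"id": 4, "score": 0.0},
]
kept = drop_rows_missing(rows, "score")
print([r["id"] for r in kept])
```

Note that a legitimate value of 0.0 is kept; only missing or blank entries cause a row to be dropped.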
Which attribute should you remove?
Note: This question is part of a series of questions that use the same scenario. For your convenience, the
scenario is repeated in each question. Each question presents a different goal and answer choices, but the text
of the scenario is exactly the same in each question in this series.
A travel agency named Margie’s Travel sells airline tickets to customers in the United States.
Margie’s Travel wants you to provide insights and predictions on flight delays. The agency is considering
implementing a system that will communicate to its customers as the flight departure nears about possible
delays due to weather conditions. The flight data contains the following attributes:
DepartureDate: The departure date aggregated at a per hour granularity
Carrier: The code assigned by the IATA and commonly used to identify a carrier
OriginAirportID: An identification number assigned by the USDOT to identify a unique airport (the flight’s
origin)
DestAirportID: An identification number assigned by the USDOT to identify a unique airport (the flight’s
destination)
DepDel: The departure delay in minutes
DepDel30: A Boolean value indicating whether the departure was delayed by 30 minutes or more (a value of
1 indicates that the departure was delayed by 30 minutes or more)
The weather data contains the following attributes: AirportID, ReadingDate (YYYY/MM/DD HH),
SkyConditionVisibility, WeatherType, WindSpeed, StationPressure, PressureChange, and HourlyPrecip.
You plan to predict flight delays that are 30 minutes or more.
You need to build a training model that accurately fits the data. The solution must minimize overfitting and
minimize data leakage.
Which attribute should you remove?
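To make the idea of data leakage concrete, the sketch below uses made-up flight rows to show what happens when an attribute directly determines the label. It relies only on the scenario's own definition of DepDel30 (a delay of 30 minutes or more); the sample values and the helper name label_from_delay are hypothetical.

```python
# Made-up rows that follow the scenario's definition of DepDel30.
flights = [
    {"DepDel": 5, "DepDel30": 0},
    {"DepDel": 45, "DepDel30": 1},
    {"DepDel": 30, "DepDel30": 1},
    {"DepDel": -3, "DepDel30": 0},
]


def label_from_delay(dep_del, threshold=30):
    """Rebuild the label from the delay in minutes alone."""
    return 1 if dep_del >= threshold else 0


# If one attribute reproduces the label on every row, a model trained on it
# learns the definition of the label rather than anything about the weather.
matches = sum(label_from_delay(f["DepDel"]) == f["DepDel30"] for f in flights)
leak_accuracy = matches / len(flights)
print(leak_accuracy)
```

An attribute like this is also unavailable before the flight departs, which is when the scenario wants to make its prediction.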
Which module should you use for each requirement?
DRAG DROP
Note: This question is part of a series of questions that use the same scenario. For your convenience, the
scenario is repeated in each question. Each question presents a different goal and answer choices, but the text
of the scenario is exactly the same in each question in this series.
A travel agency named Margie’s Travel sells airline tickets to customers in the United States.
Margie’s Travel wants you to provide insights and predictions on flight delays. The agency is considering
implementing a system that will communicate to its customers as the flight departure nears about possible
delays due to weather conditions. The flight data contains the following attributes:
DepartureDate: The departure date aggregated at a per hour granularity
Carrier: The code assigned by the IATA and commonly used to identify a carrier
OriginAirportID: An identification number assigned by the USDOT to identify a unique airport (the flight’s
origin)
DestAirportID: An identification number assigned by the USDOT to identify a unique airport (the flight’s
destination)
DepDel: The departure delay in minutes
DepDel30: A Boolean value indicating whether the departure was delayed by 30 minutes or more (a value of
1 indicates that the departure was delayed by 30 minutes or more)
The weather data contains the following attributes: AirportID, ReadingDate (YYYY/MM/DD HH),
SkyConditionVisibility, WeatherType, WindSpeed, StationPressure, PressureChange, and HourlyPrecip.
You need to remove the bias and to identify the columns in the input dataset that have the greatest predictive
power.
Which module should you use for each requirement? To answer, drag the appropriate modules to the correct
requirements. Each module may be used once, more than once, or not at all. You may need to drag the split
bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Select and Place:
Which module should you use?
Note: This question is part of a series of questions that use the same scenario. For your convenience, the
scenario is repeated in each question. Each question presents a different goal and answer choices, but the text
of the scenario is exactly the same in each question in this series.
A travel agency named Margie’s Travel sells airline tickets to customers in the United States.
Margie’s Travel wants you to provide insights and predictions on flight delays. The agency is considering
implementing a system that will communicate to its customers as the flight departure nears about possible
delays due to weather conditions. The flight data contains the following attributes:
DepartureDate: The departure date aggregated at a per hour granularity
Carrier: The code assigned by the IATA and commonly used to identify a carrier
OriginAirportID: An identification number assigned by the USDOT to identify a unique airport (the flight’s
origin)
DestAirportID: An identification number assigned by the USDOT to identify a unique airport (the flight’s
destination)
DepDel: The departure delay in minutes
DepDel30: A Boolean value indicating whether the departure was delayed by 30 minutes or more (a value of
1 indicates that the departure was delayed by 30 minutes or more)
The weather data contains the following attributes: AirportID, ReadingDate (YYYY/MM/DD HH),
SkyConditionVisibility, WeatherType, WindSpeed, StationPressure, PressureChange, and HourlyPrecip.
You have an untrained Azure Machine Learning model that you plan to train to predict flight delays.
You need to assess the variability of the dataset and the reliability of the predictions from the model.
Which module should you use?
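One common way to gauge both how much a dataset varies and how stable a model's predictions are is k-fold cross-validation: train and score on several different splits, then look at the mean and the spread of the scores. The following is a minimal sketch of that idea, not a statement about which module the question expects. The labels are made up, and majority_label is a hypothetical stand-in for a real model.

```python
import statistics


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds over n rows."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def majority_label(labels):
    """Hypothetical stand-in model: predict the most common training label."""
    return max(set(labels), key=labels.count)


labels = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]  # made-up DepDel30 values
scores = []
for train, test in k_fold_indices(len(labels), k=5):
    prediction = majority_label([labels[i] for i in train])
    correct = sum(labels[i] == prediction for i in test)
    scores.append(correct / len(test))

mean_acc = statistics.mean(scores)   # average accuracy across folds
spread = statistics.stdev(scores)    # fold-to-fold variability
print(scores, mean_acc, spread)
```

A large spread between folds suggests that the score depends heavily on which rows land in the training set, so a single train/test split would give an unreliable estimate.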
How should you complete the R code?
DRAG DROP
You have an Execute R Script module that has one input from either a Partition and Sample module or a Web
service input module.
You need to preprocess tweets by using R. The solution must meet the following requirements:
Remove digits.
Remove punctuation.
Convert to lowercase.
How should you complete the R code? To answer, drag the appropriate values to the correct targets. Each
value may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to
view content.
NOTE: Each correct selection is worth one point.
Select and Place:
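The question asks for R, and the answer area is not reproduced here. Purely to illustrate the three required transformations, the same steps can be sketched in Python; this is not the R code the question expects. The function name preprocess_tweet and the sample tweet are hypothetical.

```python
import re
import string


def preprocess_tweet(text):
    """Apply the three steps from the requirements, in order."""
    text = re.sub(r"\d+", "", text)  # remove digits
    text = text.translate(str.maketrans("", "", string.punctuation))  # remove punctuation
    return text.lower()  # convert to lowercase


sample = "Flight AA123 DELAYED 45 mins!!! #travel"
print(preprocess_tweet(sample))
```

Removing digits and punctuation can leave doubled spaces behind; collapsing whitespace would be a separate step that the requirements do not ask for.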
Which method should you use?
Note: This question is part of a series of questions that use the same scenario. For your convenience, the
scenario is repeated in each question. Each question presents a different goal and answer choices, but the text
of the scenario is exactly the same in each question in this series.
A travel agency named Margie’s Travel sells airline tickets to customers in the United States.
Margie’s Travel wants you to provide insights and predictions on flight delays. The agency is considering
implementing a system that will communicate to its customers as the flight departure nears about possible
delays due to weather conditions. The flight data contains the following attributes:
DepartureDate: The departure date aggregated at a per hour granularity
Carrier: The code assigned by the IATA and commonly used to identify a carrier
OriginAirportID: An identification number assigned by the USDOT to identify a unique airport (the flight’s
origin)
DestAirportID: An identification number assigned by the USDOT to identify a unique airport (the flight’s
destination)
DepDel: The departure delay in minutes
DepDel30: A Boolean value indicating whether the departure was delayed by 30 minutes or more (a value of
1 indicates that the departure was delayed by 30 minutes or more)
The weather data contains the following attributes: AirportID, ReadingDate (YYYY/MM/DD HH),
SkyConditionVisibility, WeatherType, WindSpeed, StationPressure, PressureChange, and HourlyPrecip.
You need to use historical data about on-time flight performance and the weather data to predict whether the
departure of a scheduled flight will be delayed by more than 30 minutes.
Which method should you use?
How many input parameters should you specify?
You have the following three training datasets for a restaurant:
User features
Item features
Ratings of items by users
You must recommend restaurants to a particular user based only on the user's features.
You need to use a Matchbox Recommender to make recommendations.
How many input parameters should you specify?
Which Matchbox recommender should you use?
You have data about the following:
Users
Movies
User ratings of the movies
You need to predict whether a user will like a particular movie.
Which Matchbox recommender should you use?