PrepAway - Latest Free Exam Questions & Answers

Category: Professional Data Engineer

Your company is migrating their 30-node Apache Hadoop cluster to the cloud. They want to re-use Hadoop jobs they have already created and minimize the management of the cluster as much as possible. They also want to be able to persist data beyond the life of the cluster. What should you do? A. Create a […]

Your startup has never implemented a formal security policy. Currently, everyone in the company has access to the datasets stored in Google BigQuery. Teams have freedom to use the service as they see fit, and they have not documented their use cases. You have been asked to secure the data warehouse. You need to discover […]

You need to store and analyze social media postings in Google BigQuery at a rate of 10,000 messages per minute in near real-time. You initially design the application to use streaming inserts for individual postings. Your application also performs data aggregations right after the streaming inserts. You discover that the queries after streaming inserts do not […]

You want to use a database of information about tissue samples to classify future tissue samples as either normal or mutated. You are evaluating an unsupervised anomaly detection method for classifying the tissue samples. Which two characteristics support this method? (Choose two.) A. There are very few occurrences of mutations relative to normal samples. B. […]

Your company handles data processing for a number of different clients. Each client prefers to use their own suite of analytics tools, with some allowing direct query access via Google BigQuery. You need to secure the data so that clients cannot see each other’s data. You want to ensure appropriate access to the data. Which […]

You use Google Cloud Dataflow to process the data and decide if a message should be sent. How should you desig

You are designing a basket abandonment system for an ecommerce company. The system will send a message to a user based on these rules:
• No interaction by the user on the site for 1 hour
• Has added more than $30 worth of products to the basket
• Has not completed a transaction
You […]

Your company is in a highly regulated industry. One of your requirements is to ensure individual users have access only to the minimum amount of information required to do their jobs. You want to enforce this requirement with Google BigQuery. Which three approaches can you take? (Choose three.) A. Disable writes to certain tables. B. […]

Which table name will make the SQL statement work correctly?

Your company is using WILDCARD tables to query data across multiple tables with similar names. The SQL statement is currently failing with the following error: # Syntax error : Expected end of statement but got "-" at [4:11] SELECT age FROM bigquery-public-data.noaa_gsod.gsod WHERE age != 99 AND _TABLE_SUFFIX = '1929' ORDER BY age DESC Which table […]

You are building a new real-time data warehouse for your company and will use Google BigQuery streaming inserts. There is no guarantee that data will be sent only once, but you do have a unique ID for each row of data and an event timestamp. You want to ensure that duplicates are not included while […]

