[Jan 02, 2024] Google Professional-Machine-Learning-Engineer Real Exam Questions and Answers FREE
Pass Google Professional-Machine-Learning-Engineer Exam Info and Free Practice Test
The Google Professional Machine Learning Engineer exam has no formal prerequisites, but Google recommends at least three years of industry experience, including experience developing and deploying machine learning models on Google Cloud or a similar platform. You should also be comfortable programming in a language such as Python, Java, or C++, and understand core machine learning concepts such as supervised and unsupervised learning, deep learning, and reinforcement learning.
The Google Professional Machine Learning Engineer certification is a respected and sought-after credential in the field of machine learning. It validates the skills of professionals who design, build, manage, and deploy machine learning models at scale using Google Cloud technologies, and the exam covers a wide range of machine learning topics.
NEW QUESTION # 86
You built and manage a production system that is responsible for predicting sales numbers. Model accuracy is crucial, because the production model is required to keep up with market changes. Since being deployed to production, the model hasn't changed; however, the accuracy of the model has steadily deteriorated. What issue is most likely causing the steady decline in model accuracy?
- A. Poor data quality
- B. Incorrect data split ratio during model training, evaluation, validation, and test
- C. Lack of model retraining
- D. Too few layers in the model for capturing information
Answer: C
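A model that stays fixed while the market moves is usually handled by monitoring live accuracy and retraining when it degrades. The following is a minimal Python sketch of such a monitor; the function name `needs_retraining`, the window size, and the accuracy floor are illustrative assumptions, not part of any Google Cloud API.

```python
from collections import deque


def needs_retraining(outcomes, window=100, min_accuracy=0.85):
    """Return the index at which rolling accuracy first drops below min_accuracy.

    outcomes: iterable of booleans, True when a prediction was correct.
    Returns None if accuracy never falls below the floor.
    """
    recent = deque(maxlen=window)
    for i, correct in enumerate(outcomes):
        recent.append(bool(correct))
        # Only judge once the window is full, to avoid noisy early readings.
        if len(recent) == window and sum(recent) / window < min_accuracy:
            return i
    return None


# Simulated stream (made-up data): 95% accurate at first, then the market
# shifts and accuracy falls to 70%.
stable = [k % 20 != 0 for k in range(200)]    # 1 miss in every 20 predictions
drifted = [k % 10 >= 3 for k in range(200)]   # 3 misses in every 10 predictions
trigger_index = needs_retraining(stable + drifted)
```

In practice the trigger would start a retraining pipeline on fresh data rather than just return an index.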
NEW QUESTION # 87
You have written unit tests for a Kubeflow Pipeline that require custom libraries. You want to automate the execution of unit tests with each new push to your development branch in Cloud Source Repositories. What should you do?
- A. Set up a Cloud Logging sink to a Pub/Sub topic that captures interactions with Cloud Source Repositories. Execute the unit tests using a Cloud Function that is triggered when messages are sent to the Pub/Sub topic.
- B. Using Cloud Build, set an automated trigger to execute the unit tests when changes are pushed to your development branch.
- C. Write a script that sequentially performs the push to your development branch and executes the unit tests on Cloud Run.
- D. Set up a Cloud Logging sink to a Pub/Sub topic that captures interactions with Cloud Source Repositories. Configure a Pub/Sub trigger for Cloud Run, and execute the unit tests on Cloud Run.
Answer: B
Explanation:
https://cloud.google.com/architecture/architecture-for-mlops-using-tfx-kubeflow-pipelines-and-cloud-build#cicd_architecture
NEW QUESTION # 88
You work on a data science team at a bank and are creating an ML model to predict loan default risk. You have collected and cleaned hundreds of millions of records worth of training data in a BigQuery table, and you now want to develop and compare multiple models on this data using TensorFlow and Vertex AI. You want to minimize any bottlenecks during the data ingestion stage while considering scalability. What should you do?
- A. Export data to CSV files in Cloud Storage, and use tf.data.TextLineDataset() to read them.
- B. Convert the data into TFRecords, and use tf.data.TFRecordDataset() to read them.
- C. Use the BigQuery client library to load data into a dataframe, and use tf.data.Dataset.from_tensor_slices() to read it.
- D. Use TensorFlow I/O's BigQuery Reader to directly read the data.
Answer: D
NEW QUESTION # 89
You work for an advertising company and want to understand the effectiveness of your company's latest advertising campaign. You have streamed 500 MB of campaign data into BigQuery. You want to query the table, and then manipulate the results of that query with a pandas dataframe in an AI Platform notebook. What should you do?
- A. From a bash cell in your AI Platform notebook, use the bq extract command to export the table as a CSV file to Cloud Storage, and then use gsutil cp to copy the data into the notebook. Use pandas.read_csv to ingest the file as a pandas dataframe.
- B. Download your table from BigQuery as a local CSV file, and upload it to your AI Platform notebook instance. Use pandas.read_csv to ingest the file as a pandas dataframe.
- C. Use AI Platform Notebooks' BigQuery cell magic to query the data, and ingest the results as a pandas dataframe.
- D. Export your table as a CSV file from BigQuery to Google Drive, and use the Google Drive API to ingest the file into your notebook instance.
Answer: C
NEW QUESTION # 90
You are building an ML model to detect anomalies in real-time sensor data. You will use Pub/Sub to handle incoming requests. You want to store the results for analytics and visualization. How should you configure the pipeline?
- A. 1 = BigQuery, 2 = AI Platform, 3 = Cloud Storage
- B. 1 = DataProc, 2 = AutoML, 3 = Cloud Bigtable
- C. 1 = BigQuery, 2 = AutoML, 3 = Cloud Functions
- D. 1 = Dataflow, 2 = AI Platform, 3 = BigQuery
Answer: D
NEW QUESTION # 91
You work for a gaming company that has millions of customers around the world. All games offer a chat feature that allows players to communicate with each other in real time. Messages can be typed in more than 20 languages and are translated in real time using the Cloud Translation API. You have been asked to build an ML system to moderate the chat in real time while assuring that the performance is uniform across the various languages and without changing the serving infrastructure.
You trained your first model using an in-house word2vec model for embedding the chat messages translated by the Cloud Translation API. However, the model has significant differences in performance across the different languages. How should you improve it?
- A. Train a classifier using the chat messages in their original language.
- B. Replace the in-house word2vec with GPT-3 or T5.
- C. Remove moderation for languages for which the false positive rate is too high.
- D. Add a regularization term such as the Min-Diff algorithm to the loss function.
Answer: D
NEW QUESTION # 92
An online reseller has a large, multi-column dataset with one column missing 30% of its data. A Machine Learning Specialist believes that certain columns in the dataset could be used to reconstruct the missing data.
Which reconstruction approach should the Specialist use to preserve the integrity of the dataset?
- A. Last observation carried forward
- B. Multiple imputation
- C. Mean substitution
- D. Listwise deletion
Answer: B
Explanation:
https://worldwidescience.org/topicpages/i/imputing+missing+values.html
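Multiple imputation fills each missing value several times, using other columns as predictors plus random noise, and then pools the analysis across the completed copies. The sketch below is a simplified, standard-library illustration with one predictor column; a full implementation would also draw the regression parameters from their posterior. All names and the synthetic data are assumptions for illustration.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    return a, b, statistics.pstdev(residuals)


def multiple_imputation(rows, m=5, seed=0):
    """rows: list of (x, y) where y may be None. Returns m completed datasets."""
    rng = random.Random(seed)
    observed = [(x, y) for x, y in rows if y is not None]
    a, b, sd = fit_line([x for x, _ in observed], [y for _, y in observed])
    completed = []
    for _ in range(m):
        # Each copy draws its own noise so imputation uncertainty is preserved.
        completed.append([
            (x, y if y is not None else a + b * x + rng.gauss(0, sd))
            for x, y in rows
        ])
    return completed


# Synthetic data: y depends on x, and roughly a third of y is missing.
rows = [(x, 2.0 * x + 1.0 + (0.5 if x % 2 else -0.5)) for x in range(20)]
rows = [(x, None if x % 3 == 0 else y) for x, y in rows]
datasets = multiple_imputation(rows)
# Pool the analysis (here, the mean of y) across the completed datasets.
pooled_mean = statistics.fmean(
    statistics.fmean(y for _, y in d) for d in datasets
)
```

Unlike mean substitution, the imputed values follow the relationship between columns, and the spread across the copies reflects how uncertain each fill-in is.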
NEW QUESTION # 93
Your team is building an application for a global bank that will be used by millions of customers. You built a forecasting model that predicts customers' account balances 3 days in the future. Your team will use the results in a new feature that will notify users when their account balance is likely to drop below $25. How should you serve your predictions?
- A. 1. Build a notification system on Firebase.
  2. Register each user with a user ID on the Firebase Cloud Messaging server, which sends a notification when your model predicts that a user's account balance will drop below the $25 threshold.
- B. 1. Create a Pub/Sub topic for each user.
  2. Deploy an application on the App Engine standard environment that sends a notification when your model predicts that a user's account balance will drop below the $25 threshold.
- C. 1. Create a Pub/Sub topic for each user.
  2. Deploy a Cloud Function that sends a notification when your model predicts that a user's account balance will drop below the $25 threshold.
- D. 1. Build a notification system on Firebase.
  2. Register each user with a user ID on the Firebase Cloud Messaging server, which sends a notification when the average of all account balance predictions drops below the $25 threshold.
Answer: A
NEW QUESTION # 94
You need to train a computer vision model that predicts the type of government ID present in a given image using a GPU-powered virtual machine on Compute Engine. You use the following parameters:
* Optimizer: SGD
* Image shape = 224x224
* Batch size = 64
* Epochs = 10
* Verbose = 2
During training you encounter the following error: ResourceExhaustedError: Out of memory (OOM) when allocating tensor. What should you do?
- A. Change the optimizer
- B. Change the learning rate
- C. Reduce the image shape
- D. Reduce the batch size
Answer: D
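Activation memory on the GPU grows linearly with batch size, which is why batch size is the usual first lever for an out-of-memory error. The back-of-the-envelope Python sketch below assumes a hypothetical first-layer feature map with 64 float32 channels at the 224x224 input resolution; real networks hold many such tensors at once.

```python
def activation_bytes(batch_size, height, width, channels, bytes_per_value=4):
    """Memory for one float32 activation tensor of shape (batch, H, W, C)."""
    return batch_size * height * width * channels * bytes_per_value


# Hypothetical first-layer feature map: 64 channels at 224x224.
for batch_size in (64, 32, 16):
    mib = activation_bytes(batch_size, 224, 224, 64) / 2**20
    print(f"batch={batch_size:>2}: {mib:.0f} MiB for this one tensor")
# batch=64: 784 MiB for this one tensor
# batch=32: 392 MiB for this one tensor
# batch=16: 196 MiB for this one tensor
```

Halving the batch size halves this footprint without changing the input resolution the model sees.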
NEW QUESTION # 95
You work on a growing team of more than 50 data scientists who all use AI Platform. You are designing a strategy to organize your jobs, models, and versions in a clean and scalable way. Which strategy should you choose?
- A. Separate each data scientist's work into a different project to ensure that the jobs, models, and versions created by each data scientist are accessible only to that user.
- B. Set up a BigQuery sink for Cloud Logging logs that is appropriately filtered to capture information about AI Platform resource usage. In BigQuery, create a SQL view that maps users to the resources they are using.
- C. Use labels to organize resources into descriptive categories. Apply a label to each created resource so that users can filter the results by label when viewing or monitoring the resources.
- D. Set up restrictive IAM permissions on the AI Platform notebooks so that only a single user or group can access a given instance.
Answer: C
NEW QUESTION # 96
A Data Scientist needs to migrate an existing on-premises ETL process to the cloud. The current process runs at regular time intervals and uses PySpark to combine and format multiple large data sources into a single consolidated output for downstream processing.
The Data Scientist has been given the following requirements for the cloud solution:
* Combine multiple data sources.
* Reuse existing PySpark logic.
* Run the solution on the existing schedule.
* Minimize the number of servers that will need to be managed.
Which architecture should the Data Scientist use to build this solution?
- A. Write the raw data to Amazon S3. Create an AWS Glue ETL job to perform the ETL processing against the input data. Write the ETL job in PySpark to leverage the existing logic. Create a new AWS Glue trigger to trigger the ETL job based on the existing schedule. Configure the output target of the ETL job to write to a "processed" location in Amazon S3 that is accessible for downstream use.
- B. Write the raw data to Amazon S3. Schedule an AWS Lambda function to run on the existing schedule and process the input data from Amazon S3. Write the Lambda logic in Python and implement the existing PySpark logic to perform the ETL process. Have the Lambda function output the results to a "processed" location in Amazon S3 that is accessible for downstream use.
- C. Write the raw data to Amazon S3. Schedule an AWS Lambda function to submit a Spark step to a persistent Amazon EMR cluster based on the existing schedule. Use the existing PySpark logic to run the ETL job on the EMR cluster. Output the results to a "processed" location in Amazon S3 that is accessible for downstream use.
- D. Use Amazon Kinesis Data Analytics to stream the input data and perform real-time SQL queries against the stream to carry out the required transformations within the stream. Deliver the output results to a "processed" location in Amazon S3 that is accessible for downstream use.
Answer: A
NEW QUESTION # 97
Your team is working on an NLP research project to predict political affiliation of authors based on articles they have written. You have a large training dataset that is structured like this:
A)
B)
C)
D)
- A. Option D
- B. Option A
- C. Option C
- D. Option B
Answer: D
Explanation:
If texts, paragraphs, or sentences are assigned at random to the training, validation, and test sets, the same author's writing appears in every set, and the model can learn qualities of each author's use of language rather than signals of political affiliation. Its test accuracy then overstates how well it generalizes. If the data is instead divided at the author level, so that a given author appears only in the training data, only in the validation data, or only in the test data, the model will find it harder to reach a high accuracy on the test set, but that accuracy is the more meaningful measure, because the model must generalize to authors it has never seen. https://developers.google.com/machine-learning/crash-course/18th-century-literature
For example, suppose you are training a model with purchase data from a number of stores. You know, however, that the model will be used primarily to make predictions for stores that are not in the training data. To ensure that the model can generalize to unseen stores, you should segregate your data sets by stores. In other words, your test set should include only stores different from the evaluation set, and the evaluation set should include only stores different from the training set. https://cloud.google.com/automl-tables/docs/prepare#ml-use
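An author-level split can be sketched in a few lines of Python. The version below hashes the author name so that every article by the same author lands in the same split; the 80/10/10 ratio, the function names, and the synthetic records are assumptions for illustration only.

```python
import hashlib


def assign_split(author, train=0.8, validation=0.1):
    """Deterministically map an author to 'train', 'validation', or 'test'."""
    digest = hashlib.sha256(author.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100 / 100  # stable value in [0, 1)
    if bucket < train:
        return "train"
    if bucket < train + validation:
        return "validation"
    return "test"


def split_by_author(records):
    """records: iterable of (author, text) pairs. Returns split name -> records."""
    splits = {"train": [], "validation": [], "test": []}
    for author, text in records:
        splits[assign_split(author)].append((author, text))
    return splits


# Synthetic corpus: 50 hypothetical authors with 10 articles each.
records = [(f"author_{i % 50}", f"article {i}") for i in range(500)]
splits = split_by_author(records)
```

Because the assignment depends only on the author name, new articles by a known author always join that author's existing split.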
NEW QUESTION # 98
You are developing models to classify customer support emails. You created models with TensorFlow Estimators using small datasets on your on-premises system, but you now need to train the models using large datasets to ensure high performance. You will port your models to Google Cloud and want to minimize code refactoring and infrastructure overhead for easier migration from on-prem to cloud. What should you do?
- A. Create a cluster on Dataproc for training
- B. Create a Managed Instance Group with autoscaling
- C. Use Kubeflow Pipelines to train on a Google Kubernetes Engine cluster.
- D. Use AI Platform for distributed training
Answer: D
Explanation:
AI Platform also supports Kubeflow Pipelines, and you don't need to set up infrastructure to use it. For C, you need to set up a Google Kubernetes Engine cluster. The question asks us to minimize infrastructure overhead.
NEW QUESTION # 99
You work for a gaming company that manages a popular online multiplayer game where teams with 6 players play against each other in 5-minute battles. There are many new players every day. You need to build a model that automatically assigns available players to teams in real time. User research indicates that the game is more enjoyable when battles have players with similar skill levels. Which business metrics should you track to measure your model's performance? (Choose One Correct Answer)
- A. Rate of return as measured by additional revenue generated minus the cost of developing a new model
- B. User engagement as measured by the number of battles played daily per user
- C. Precision and recall of assigning players to teams based on their predicted versus actual ability
- D. Average time players wait before being assigned to a team
Answer: B
NEW QUESTION # 100
A Data Science team within a large company uses Amazon SageMaker notebooks to access data stored in Amazon S3 buckets. The IT Security team is concerned that internet-enabled notebook instances create a security vulnerability where malicious code running on the instances could compromise data privacy. The company mandates that all instances stay within a secured VPC with no internet access, and data communication traffic must stay within the AWS network.
How should the Data Science team configure the notebook instance placement to meet these requirements?
- A. Associate the Amazon SageMaker notebook with a private subnet in a VPC. Ensure the VPC has a NAT gateway and an associated security group allowing only outbound connections to Amazon S3 and Amazon SageMaker.
- B. Associate the Amazon SageMaker notebook with a private subnet in a VPC. Place the Amazon SageMaker endpoint and S3 buckets within the same VPC.
- C. Associate the Amazon SageMaker notebook with a private subnet in a VPC. Ensure the VPC has S3 VPC endpoints and Amazon SageMaker VPC endpoints attached to it.
- D. Associate the Amazon SageMaker notebook with a private subnet in a VPC. Use IAM policies to grant access to Amazon S3 and Amazon SageMaker.
Answer: C
NEW QUESTION # 101
Your organization wants to make its internal shuttle service route more efficient. The shuttles currently stop at all pick-up points across the city every 30 minutes between 7 am and 10 am. The development team has already built an application on Google Kubernetes Engine that requires users to confirm their presence and shuttle station one day in advance. What approach should you take?
- A. 1. Build a tree-based regression model that predicts how many passengers will be picked up at each shuttle station.
  2. Dispatch an appropriately sized shuttle and provide the map with the required stops based on the prediction.
- B. 1. Define the optimal route as the shortest route that passes by all shuttle stations with confirmed attendance at the given time under capacity constraints.
  2. Dispatch an appropriately sized shuttle and indicate the required stops on the map.
- C. 1. Build a tree-based classification model that predicts whether the shuttle should pick up passengers at each shuttle station.
  2. Dispatch an available shuttle and provide the map with the required stops based on the prediction.
- D. 1. Build a reinforcement learning model with tree-based classification models that predict the presence of passengers at shuttle stops as agents and a reward function around a distance-based metric.
  2. Dispatch an appropriately sized shuttle and provide the map with the required stops based on the simulated outcome.
Answer: B
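Because riders confirm their station a day in advance, the stops are known and the task reduces to a routing problem rather than a prediction problem. The Python sketch below brute-forces the shortest round trip through the confirmed stations; the coordinates, rider counts, and function name are hypothetical, and a real deployment would use a vehicle-routing solver that also enforces capacity constraints.

```python
from itertools import permutations
from math import dist


def shortest_route(depot, stops):
    """Brute-force the shortest depot -> stops -> depot tour.

    Fine for a handful of stops; larger instances need a proper
    vehicle-routing solver.
    """
    best_order, best_length = None, float("inf")
    for order in permutations(stops):
        path = [depot, *order, depot]
        length = sum(dist(a, b) for a, b in zip(path, path[1:]))
        if length < best_length:
            best_order, best_length = order, length
    return list(best_order), best_length


# Hypothetical station coordinates and confirmed rider counts.
stations = {"A": (0, 3), "B": (4, 3), "C": (4, 0), "D": (9, 9)}
confirmed = {"A": 5, "B": 2, "C": 4, "D": 0}
# Only stations with at least one confirmed rider are visited.
stops = [stations[name] for name, riders in confirmed.items() if riders > 0]
route, length = shortest_route((0, 0), stops)
total_riders = sum(confirmed.values())  # used to pick the shuttle size
```

Station D has no confirmed riders, so it is skipped and the tour covers only the three remaining stops.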
NEW QUESTION # 102
A Machine Learning Specialist uploads a dataset to an Amazon S3 bucket protected with server-side encryption using AWS KMS.
How should the ML Specialist define the Amazon SageMaker notebook instance so it can read the same dataset from Amazon S3?
- A. Assign the same KMS key used to encrypt data in Amazon S3 to the Amazon SageMaker notebook instance.
- B. Configure the Amazon SageMaker notebook instance to have access to the VPC. Grant permission in the KMS key policy to the notebook's KMS role.
- C. Assign an IAM role to the Amazon SageMaker notebook with S3 read access to the dataset. Grant permission in the KMS key policy to that role.
- D. Define security group(s) to allow all HTTP inbound/outbound traffic and assign those security group(s) to the Amazon SageMaker notebook instance.
Answer: C
Explanation:
https://docs.aws.amazon.com/sagemaker/latest/dg/encryption-at-rest.html
NEW QUESTION # 103
You work for a bank and are building a random forest model for fraud detection. You have a dataset that includes transactions, of which 1% are identified as fraudulent.
Which data transformation strategy would likely improve the performance of your classifier?
- A. Oversample the fraudulent transactions 10 times.
- B. Use one-hot encoding on all categorical features.
- C. Z-normalize all the numeric features.
- D. Write your data in TFRecords.
Answer: A
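Random oversampling of the minority class can be sketched with the standard library alone. In the Python example below, each fraudulent row ends up appearing 10 times in the training data; the function name, record layout, and synthetic transactions are assumptions for illustration.

```python
import random


def oversample_minority(rows, label_of, minority_label, factor=10, seed=0):
    """Return a shuffled copy of rows in which each minority row appears `factor` times."""
    rng = random.Random(seed)
    minority = [r for r in rows if label_of(r) == minority_label]
    # The original rows already contain one copy, so add factor - 1 more.
    resampled = rows + minority * (factor - 1)
    rng.shuffle(resampled)
    return resampled


# 1,000 hypothetical transactions, 1% labelled fraudulent.
transactions = [{"id": i, "fraud": i < 10} for i in range(1000)]
train = oversample_minority(transactions, lambda r: r["fraud"], True)
fraud_share = sum(r["fraud"] for r in train) / len(train)
```

Oversampling should be applied only to the training split, after the validation and test sets have been held out, so that duplicated rows do not leak into evaluation.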
NEW QUESTION # 104
You work for the AI team of an automobile company, and you are developing a visual defect detection model using TensorFlow and Keras. To improve your model performance, you want to incorporate some image augmentation functions such as translation, cropping, and contrast tweaking. You randomly apply these functions to each training batch. You want to optimize your data processing pipeline for run time and compute resources utilization. What should you do?
- A. Embed the augmentation functions dynamically as part of Keras generators.
- B. Use Dataflow to create all possible augmentations, and store them as TFRecords.
- C. Use Dataflow to create the augmentations dynamically per training run, and stage them as TFRecords.
- D. Embed the augmentation functions dynamically in the tf.data pipeline.
Answer: D
NEW QUESTION # 105
A Machine Learning Specialist is using an Amazon SageMaker notebook instance in a private subnet of a corporate VPC. The ML Specialist has important data stored on the Amazon SageMaker notebook instance's Amazon EBS volume, and needs to take a snapshot of that EBS volume. However, the ML Specialist cannot find the Amazon SageMaker notebook instance's EBS volume or Amazon EC2 instance within the VPC.
Why can't the ML Specialist see the instance in the VPC?
- A. Amazon SageMaker notebook instances are based on EC2 instances running within AWS service accounts.
- B. Amazon SageMaker notebook instances are based on the Amazon ECS service within customer accounts.
- C. Amazon SageMaker notebook instances are based on AWS ECS instances running within AWS service accounts.
- D. Amazon SageMaker notebook instances are based on the EC2 instances within the customer account, but they run outside of VPCs.
Answer: A
Explanation:
https://docs.aws.amazon.com/sagemaker/latest/dg/gs-setup-working-env.html
NEW QUESTION # 106
You are training a ResNet model on AI Platform using TPUs to visually categorize types of defects in automobile engines. You capture the training profile using the Cloud TPU profiler plugin and observe that it is highly input-bound. You want to reduce the bottleneck and speed up your model training process. Which modifications should you make to the tf.data dataset?
Choose 2 answers
- A. Decrease the batch size argument in your transformation
- B. Use the interleave option for reading data
- C. Set the prefetch option equal to the training batch size
- D. Increase the buffer size for the shuffle option.
- E. Reduce the value of the repeat parameter
Answer: B,C

