NVIDIA NCP-AIO Exam - Questions and Answers

Question 1

Which of the following correctly identifies the key components of a Kubernetes cluster and their roles?

A. The control plane consists of the kube-apiserver, etcd, kube-scheduler, and kube-controller-manager, while worker nodes run kubelet and kube-proxy.
B. Worker nodes manage the kube-apiserver and etcd, while the control plane handles all container runtimes.
C. The control plane is responsible for running all application containers, while worker nodes manage network traffic through etcd.
D. The control plane includes the kubelet and kube-proxy, and worker nodes are responsible for running etcd and the scheduler.

Answer : A
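
Note: the split described in option A can be sketched as a small lookup table. This is only a memorization aid under the assumption that the option A wording is taken at face value; `CLUSTER_COMPONENTS` and `runs_on` are hypothetical names for this sketch and are not part of any Kubernetes API.

```python
# Illustrative lookup of where each core Kubernetes component runs,
# following the wording of option A. CLUSTER_COMPONENTS and runs_on
# are hypothetical names invented for this sketch.
CLUSTER_COMPONENTS = {
    "control plane": [
        "kube-apiserver",           # front end for the cluster API
        "etcd",                     # key-value store for cluster state
        "kube-scheduler",           # assigns pods to nodes
        "kube-controller-manager",  # runs the controller loops
    ],
    "worker node": [
        "kubelet",     # node agent that starts and monitors pods
        "kube-proxy",  # maintains network rules for Services
    ],
}


def runs_on(component: str) -> str:
    """Return the cluster role that hosts the given component."""
    for role, components in CLUSTER_COMPONENTS.items():
        if component in components:
            return role
    raise KeyError(f"unknown component: {component}")
```

For example, `runs_on("etcd")` returns "control plane", which is why options B, C, and D are wrong: each of them places etcd, the kubelet, or kube-proxy on the wrong side.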

Question 2

A DGX H100 system in a cluster is showing performance issues when running jobs.
Which command should be run to generate system logs related to the health report?

A. nvsm show logs --save
B. nvsm get logs
C. nvsm dump health
D. nvsm health --dump-log

Answer : C

Question 3

You are managing a Kubernetes cluster running AI training jobs using TensorFlow. The jobs require access to multiple GPUs across different nodes, but inter-node communication seems slow, impacting performance.
Which networking configuration would you implement to optimize inter-node communication for distributed training?

A. Increase the number of replicas for each job to reduce the load on individual nodes.
B. Use standard Ethernet networking with jumbo frames enabled to reduce packet overhead during communication.
C. Configure a dedicated storage network to handle data transfer between nodes during training.
D. Use InfiniBand networking between nodes to reduce latency and increase throughput for distributed training jobs.

Answer : D

Question 4

A system administrator needs to scale a Kubernetes Job to 4 replicas.
What command should be used?

A. kubectl stretch job --replicas=4
B. kubectl autoscale deployment job --min=1 --max=10
C. kubectl scale job --replicas=4
D. kubectl scale job -r 4

Answer : C

Question 5

An administrator wants to check if the BlueMan service can access the DPU.
How can this be done?

A. Via system logs
B. Via the DOCA Telemetry Service (DTS)
C. Via a lightweight database operating in the DPU server
D. Via Linux dump files

Answer : B

Question 6

After completing the installation of a Kubernetes cluster on your NVIDIA DGX systems using BCM, how can you verify that all worker nodes are properly registered and ready?

A. Run kubectl get nodes to verify that all worker nodes show a status of “Ready”.
B. Run kubectl get pods to check if all worker pods are running as expected.
C. Check each node manually by logging in via SSH and verifying system status with systemctl.

Answer : A
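
Note: the check in option A can be automated by parsing the STATUS column of `kubectl get nodes`. The following is a minimal sketch, assuming the default table output with NAME and STATUS as the first two columns; `not_ready_nodes`, `check_cluster`, and the sample node names are hypothetical and only for illustration.

```python
import subprocess


def not_ready_nodes(kubectl_output: str) -> list[str]:
    """Return names of nodes whose STATUS column is not exactly 'Ready'.

    A status such as 'Ready,SchedulingDisabled' is also reported, since a
    cordoned node cannot accept new pods.
    """
    lines = kubectl_output.strip().splitlines()
    bad = []
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 2 and fields[1] != "Ready":
            bad.append(fields[0])
    return bad


def check_cluster() -> list[str]:
    """Run `kubectl get nodes`; requires kubectl and access to the cluster."""
    result = subprocess.run(
        ["kubectl", "get", "nodes"],
        capture_output=True, text=True, check=True,
    )
    return not_ready_nodes(result.stdout)


# Hypothetical sample output, not taken from a real cluster.
SAMPLE = """NAME     STATUS     ROLES    AGE   VERSION
dgx-01   Ready      <none>   5d    v1.27.4
dgx-02   NotReady   <none>   5d    v1.27.4
"""
```

With the sample text above, `not_ready_nodes(SAMPLE)` reports only `dgx-02`. Option B is weaker because `kubectl get pods` lists workloads, not node registration.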

Question 7

An administrator is troubleshooting issues with NVIDIA GPUDirect Storage and must ensure optimal data transfer performance.
What step should be taken first?

A. Increase the GPU's core clock frequency.
B. Upgrade the CPU to a higher clock speed.
C. Check for compatible RDMA-capable network hardware and configurations.
D. Install additional GPU memory (VRAM).

Answer : C

Question 8

A cloud engineer is looking to deploy a digital fingerprinting pipeline using NVIDIA Morpheus and the NVIDIA AI Enterprise Virtual Machine Image (VMI).
Where would the cloud engineer find the VMI?

A. GitHub and Docker Hub
B. Azure, Google, Amazon Marketplaces
C. NVIDIA NGC
D. Developer Forums

Answer : B

Question 9

An administrator is troubleshooting a bottleneck in a deep learning runtime and needs consistent data feed rates to GPUs.
Which storage metric should be used?

A. Disk I/O operations per second (IOPS)
B. Disk free space
C. Sequential read speed
D. Disk utilization in performance manager

Answer : C

Question 10

You have successfully pulled a TensorFlow container from NGC and now need to run it on your stand-alone GPU-enabled server.
Which command should you use to ensure that the container has access to all available GPUs?

A. kubectl create pod --gpu=all nvcr.io/nvidia/tensorflow:<tag>
B. docker run nvcr.io/nvidia/tensorflow:<tag>
C. docker start nvcr.io/nvidia/tensorflow:<tag>
D. docker run --gpus all nvcr.io/nvidia/tensorflow:<tag>

Answer : D
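
Note: the difference between options B and D is only the `--gpus all` flag. As a minimal sketch, the command line can be assembled as an argument list before it is handed to a process runner; `docker_run_all_gpus` is a hypothetical helper, and the `<tag>` placeholder is kept exactly as it appears in the question.

```python
import subprocess


def docker_run_all_gpus(image: str) -> list[str]:
    """Build the argv for `docker run --gpus all <image>`.

    Without `--gpus all` (option B) the container starts without
    access to the host GPUs.
    """
    return ["docker", "run", "--gpus", "all", image]


def launch(image: str) -> None:
    """Start the container; requires Docker and the NVIDIA Container Toolkit."""
    subprocess.run(docker_run_all_gpus(image), check=True)
```

Option C fails for a different reason: `docker start` restarts an existing container and does not take an image reference.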

Question 11

You are tasked with deploying a DOCA service on an NVIDIA BlueField DPU in an air-gapped data center environment. The DPU has the required BlueField OS version (3.9.0 or higher) installed, and you have access to the necessary container image from NVIDIA's NGC catalog. However, you need to ensure that the deployment succeeds without an internet connection.
Which of the following steps should you take to deploy the DOCA service on the DPU?

A. Install Docker on the DPU, pull the container directly from NGC, and run it using ‘docker run’ with appropriate environment variables.
B. Pull the container image from NGC using Docker and modify the YAML file before deployment.
C. Manually download the container image and YAML file beforehand, transfer them to the DPU, and deploy using Kubernetes with standalone Kubelet.
D. Use the host system’s Docker engine to pull the container image and deploy it on the DPU via SSH.

Answer : C

Question 12

A system administrator wants to run these two commands in Base Command Manager: "main showprofile" and "device status apc01".
What command should the system administrator use from the management node system shell?

A. cmsh -c "main showprofile; device status apc01"
B. cmsh -p "main showprofile; device status apc01"
C. system -c "main showprofile; device status apc01"
D. cmsh-system -c "main showprofile; device status apc01"

Answer : A
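
Note: option A passes both commands to `cmsh` in a single `-c` argument, separated by a semicolon. A minimal sketch of building that invocation for use in a script follows; `cmsh_batch` is a hypothetical helper name, and only the `cmsh -c "...; ..."` form comes from the question itself.

```python
import shlex


def cmsh_batch(commands: list[str]) -> list[str]:
    """Build the argv that runs several cmsh commands in one invocation.

    The commands are joined with '; ' and passed as a single argument
    to `cmsh -c`, as in option A.
    """
    return ["cmsh", "-c", "; ".join(commands)]


argv = cmsh_batch(["main showprofile", "device status apc01"])

# shlex.join quotes the combined string so it stays a single shell argument.
shell_line = shlex.join(argv)
```

Here `shell_line` is `cmsh -c 'main showprofile; device status apc01'`, which is equivalent to the double-quoted form in option A when typed in a shell.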

Question 13

You are setting up a Kubernetes cluster on NVIDIA DGX systems using BCM, and you need to initialize the control-plane nodes.
What is the most important step to take before initializing these nodes?

A. Set up a load balancer before initializing any control-plane node.
B. Disable swap on all control-plane nodes before initializing them.
C. Ensure that Docker is installed and running on all control-plane nodes.
D. Configure each control-plane node with its own external IP address.

Answer : B
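
Note: whether swap is still active on a Linux node can be read from `/proc/swaps`, which contains a header row followed by one row per active swap device. The following is a minimal sketch, assuming that standard layout; `active_swap_devices` and `swap_is_off` are hypothetical helper names, and the sample text is invented for illustration.

```python
def active_swap_devices(proc_swaps_text: str) -> list[str]:
    """Return the swap devices listed in /proc/swaps-style text.

    The first line is the column header; any further non-empty line
    describes one active swap area.
    """
    lines = proc_swaps_text.strip().splitlines()
    return [line.split()[0] for line in lines[1:] if line.strip()]


def swap_is_off(path: str = "/proc/swaps") -> bool:
    """Return True when no swap area is active on this (Linux) node."""
    with open(path) as f:
        return not active_swap_devices(f.read())


# Hypothetical sample contents, not taken from a real node.
HEADER = "Filename\t\t\t\tType\t\tSize\t\tUsed\t\tPriority\n"
SAMPLE_SWAP_ON = HEADER + "/swapfile\tfile\t\t2097148\t\t0\t\t-2\n"
SAMPLE_SWAP_OFF = HEADER
```

If `swap_is_off()` returns False on a control-plane node, swap would typically be turned off with `swapoff -a` (and removed from `/etc/fstab`) before initialization.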

Question 14

An administrator requires full access to the NGC Base Command Platform CLI.
Which command should be used to accomplish this action?

A. ngc set API
B. ngc config set
C. ngc config BCP

Answer : B

Question 15

A cloud engineer is looking to provision a virtual machine for machine learning using the NVIDIA Virtual Machine Image (VMI) and RAPIDS.
What technology stack will be set up for the development team automatically when the VMI is deployed?

A. Ubuntu Server, Docker-CE, NVIDIA Container Toolkit, CSP CLI, NGC CLI, NVIDIA Driver
B. CentOS, Docker-CE, NVIDIA Container Toolkit, CSP CLI, NGC CLI
C. Ubuntu Server, Docker-CE, NVIDIA Container Toolkit, CSP CLI, NGC CLI, NVIDIA Driver, RAPIDS
D. Ubuntu Server, Docker-CE, NVIDIA Container Toolkit, CSP CLI, NGC CLI

Answer : C

NCP - AI Operations v1.0
