CCA-505 Product Description:
Exam Number/Code: CCA-505
Exam name: Cloudera Certified Administrator for Apache Hadoop (CCAH) CDH5 Upgrade Exam
n questions with full explanations
Certification: Cloudera Certification
Proper study for the CCA-505 Cloudera Certified Administrator for Apache Hadoop (CCAH) CDH5 Upgrade Exam begins with preparation products designed to help you pass the CCA-505 test on your first attempt. Try the free demo now.
Online Cloudera CCA-505 free dumps demo below:
NEW QUESTION 1
You observe that the number of spilled records from Map tasks far exceeds the number of map output records. Your child heap size is 1GB and your io.sort.mb value is set to 100 MB. How would you tune your io.sort.mb value to achieve maximum memory to disk I/O ratio?
- A. Decrease the io.sort.mb value to 0
- B. Increase the io.sort.mb to 1GB
- C. For 1GB child heap size an io.sort.mb of 128 MB will always maximize memory to disk I/O
- D. Tune the io.sort.mb value until you observe that the number of spilled records equals (or is as close as possible to) the number of map output records
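The comparison this question describes can be written as a ratio of two job counters. The following is a minimal sketch, assuming the values of the `Spilled Records` and `Map output records` counters have already been read from a finished job; the function name and the sample numbers are hypothetical.

```python
def spill_ratio(spilled_records: int, map_output_records: int) -> float:
    """Return the number of spilled records per map output record."""
    if map_output_records <= 0:
        raise ValueError("map_output_records must be positive")
    return spilled_records / map_output_records


# Hypothetical counter values, for illustration only.
ratio = spill_ratio(spilled_records=3_000_000, map_output_records=1_000_000)
print(f"spill ratio: {ratio:.2f}")
```

A ratio well above 1 suggests that map output records are being written to disk more than once; a ratio near 1 suggests each record is spilled about once.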
NEW QUESTION 2
Your Hadoop cluster contains nodes in three racks. You have NOT configured the dfs.hosts property in the NameNode’s configuration file. What results?
- A. No new nodes can be added to the cluster until you specify them in the dfs.hosts file
- B. Presented with a blank dfs.hosts property, the NameNode will permit DataNodes specified in mapred.hosts to join the cluster
- C. Any machine running the DataNode daemon can immediately join the cluster
- D. The NameNode will update the dfs.hosts property to include machines running the DataNode daemon on the next NameNode reboot or with the command dfsadmin -refreshNodes
NEW QUESTION 3
You want to understand more about how users browse your public website. For example, you want to know which pages they visit prior to placing an order. You have a server farm of 200 web servers hosting your website. Which is the most efficient process to gather these web server logs into your Hadoop cluster for analysis?
- A. Sample the web server logs from the web servers and copy them into HDFS using curl
- B. Ingest the server web logs into HDFS using Flume
- C. Import all users clicks from your OLTP databases into Hadoop using Sqoop
- D. Write a MapReduce job with the web servers as mappers and the Hadoop cluster nodes as reducers
- E. Channel these clickstreams into Hadoop using Hadoop Streaming
NEW QUESTION 4
A slave node in your cluster has four 2TB hard drives installed (4 x 2TB). The DataNode is configured to store HDFS blocks on the disks. You set the value of the dfs.datanode.du.reserved parameter to 100GB. How does this alter HDFS block storage?
- A. A maximum of 100 GB on each hard drive may be used to store HDFS blocks
- B. All hard drives may be used to store HDFS blocks as long as at least 100 GB in total is available on the node
- C. 100 GB on each hard drive may not be used to store HDFS blocks
- D. 25 GB on each hard drive may not be used to store HDFS blocks
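The arithmetic behind this scenario can be sketched as follows. Hadoop's documentation describes `dfs.datanode.du.reserved` as reserved space in bytes per volume, so the sketch subtracts the reservation from each drive separately. The decimal unit sizes and the helper name are assumptions made for illustration, and filesystem overhead is ignored.

```python
GB = 10**9   # decimal gigabyte (assumption; drive sizes are usually quoted this way)
TB = 10**12  # decimal terabyte


def usable_bytes_per_volume(volume_bytes: int, reserved_bytes: int) -> int:
    """Space left for HDFS blocks on one volume after the per-volume reservation."""
    return max(volume_bytes - reserved_bytes, 0)


volumes = [2 * TB] * 4   # four 2 TB drives, as in the question
reserved = 100 * GB      # value of dfs.datanode.du.reserved, expressed in bytes

per_volume = [usable_bytes_per_volume(v, reserved) for v in volumes]
total_usable = sum(per_volume)

print(per_volume[0] // GB, "GB usable per drive")
print(total_usable // GB, "GB usable on the node")
```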
NEW QUESTION 5
You are configuring a cluster running HDFS, MapReduce version 2 (MRv2) on YARN running Linux. How must you format the underlying filesystem of each DataNode?
- A. They must not be formatted; HDFS will format the filesystem automatically
- B. They may be formatted in any Linux filesystem
- C. They must be formatted as HDFS
- D. They must be formatted as either ext3 or ext4
NEW QUESTION 6
You decide to create a cluster which runs HDFS in High Availability mode with automatic failover, using Quorum-based Storage. What is the purpose of ZooKeeper in such a configuration?
- A. It manages the Edits file, which is a log of changes to the HDFS filesystem.
- B. It monitors an NFS mount point and reports if the mount point disappears
- C. It both keeps track of which NameNode is Active at any given time, and manages the Edits file, which is a log of changes to the HDFS filesystem
- D. It only keeps track of which NameNode is Active at any given time
- E. Clients connect to ZooKeeper to determine which NameNode is Active
Explanation: Reference: http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/PDF/CDH4-High-Availability-Guide.pdf (page 15)
NEW QUESTION 7
You suspect that your NameNode is incorrectly configured, and is swapping memory to disk. Which Linux commands help you to identify whether swapping is occurring? (Select 3)
- A. free
- B. df
- C. memcat
- D. top
- E. vmstat
- F. swapinfo
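As an illustration of what swap activity looks like in practice, the sketch below parses the `si` (swapped in) and `so` (swapped out) columns from `vmstat` output. The sample text is hand-written for this example and is not a real measurement; the function name is hypothetical, and the parser assumes the usual two header lines followed by data rows.

```python
# Hand-written sample in the usual vmstat layout (illustrative values only).
SAMPLE_VMSTAT = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0  20480 812344  10240 402112   12   30     5    10  100  200  1  1 98  0  0
"""


def swap_activity(vmstat_text: str) -> tuple:
    """Return (si, so) from the last data row of vmstat output."""
    lines = [ln for ln in vmstat_text.splitlines() if ln.strip()]
    header = lines[1].split()    # second line holds the column names
    values = lines[-1].split()   # last line is the most recent sample
    row = dict(zip(header, values))
    return int(row["si"]), int(row["so"])


si, so = swap_activity(SAMPLE_VMSTAT)
is_swapping = si > 0 or so > 0
print(si, so, is_swapping)
```

Nonzero `si` or `so` values over several samples indicate that pages are moving between memory and swap.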
NEW QUESTION 8
Your cluster implements HDFS High Availability (HA). Your two NameNodes are named nn01 and nn02. What occurs when you execute the command: hdfs haadmin -failover nn01 nn02
- A. nn02 becomes the standby NameNode and nn01 becomes the active NameNode
- B. nn02 is fenced, and nn01 becomes the active NameNode
- C. nn01 becomes the standby NameNode and nn02 becomes the active NameNode
- D. nn01 is fenced, and nn02 becomes the active NameNode
Explanation: failover – initiate a failover between two NameNodes
This subcommand causes a failover from the first provided NameNode to the second. If the first NameNode is in the Standby state, this command simply transitions the second to the Active state without error. If the first NameNode is in the Active state, an attempt will be made to gracefully transition it to the Standby state. If this fails, the fencing methods (as configured by dfs.ha.fencing.methods) will be attempted in order until one of the methods succeeds. Only after this process will the second NameNode be transitioned to the Active state. If no fencing method succeeds, the second NameNode will not be transitioned to the Active state, and an error will be returned.
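The decision sequence in the explanation above can be modeled as a small function. This is a simplified sketch of the described behavior, not Hadoop's implementation; the function name, the state strings, and the boolean inputs standing in for real transition and fencing attempts are all hypothetical.

```python
def failover(first_state: str, graceful_ok: bool, fencing_results: list) -> tuple:
    """Model the described failover from the first NameNode to the second.

    first_state:     "active" or "standby" (state of the first NameNode)
    graceful_ok:     whether the graceful transition to standby succeeds
    fencing_results: outcome of each configured fencing method, in order
    Returns (second_becomes_active, reason).
    """
    if first_state == "standby":
        return True, "first NameNode already standby"
    if graceful_ok:
        return True, "graceful transition to standby"
    # Fencing methods are attempted in order until one succeeds.
    for index, succeeded in enumerate(fencing_results):
        if succeeded:
            return True, f"fencing method {index} succeeded"
    return False, "no fencing method succeeded; error returned"


print(failover("active", True, []))
print(failover("active", False, [False, True]))
print(failover("active", False, [False, False]))
```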
NEW QUESTION 9
Which YARN daemon or service monitors a Container’s per-application resource usage (e.g., memory, CPU)?
- A. NodeManager
- B. ApplicationMaster
- C. ApplicationManagerService
- D. ResourceManager
Explanation: Reference: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-22.214.171.124/bk_using-apache-hadoop/content/ch_using-apache-hadoop-4.html (4th para)
NEW QUESTION 10
Which three basic configuration parameters must you set to migrate your cluster from MapReduce1 (MRv1) to MapReduce v2 (MRv2)?
- A. Configure the NodeManager hostname and enable services on YARN by setting the following property in yarn-site.xml: <name>yarn.nodemanager.hostname</name><value>your_nodeManager_hostname</value>
- B. Configure the number of map tasks per job on YARN by setting the following property in mapred-site.xml: <name>mapreduce.job.maps</name><value>2</value>
- C. Configure MapReduce as a framework running on YARN by setting the following property in mapred-site.xml: <name>mapreduce.framework.name</name><value>yarn</value>
- D. Configure the ResourceManager hostname and enable node services on YARN by setting the following property in yarn-site.xml: <name>yarn.resourcemanager.hostname</name><value>your_resourceManager_hostname</value>
- E. Configure a default scheduler to run on YARN by setting the following property in mapred-site.xml: <name>mapreduce.jobtracker.taskScheduler</name><value>org.apache.hadoop.mapred.JobQueueTaskScheduler</value>
- F. Configure the NodeManager to enable MapReduce services on YARN by adding the following property in yarn-site.xml: <name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value>
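The options above show only the `<name>` and `<value>` elements. In a Hadoop `*-site.xml` file each pair sits inside a `<property>` element under a `<configuration>` root. The sketch below renders that wrapper with Python's standard library; the helper name is hypothetical, and the single property used is just one example taken from the options.

```python
import xml.etree.ElementTree as ET


def render_configuration(properties: dict) -> str:
    """Render name/value pairs in the Hadoop site-file layout."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


xml_text = render_configuration({"mapreduce.framework.name": "yarn"})
print(xml_text)
```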
NEW QUESTION 11
You are planning a Hadoop cluster and considering implementing 10 Gigabit Ethernet as the network fabric. Which workloads benefit the most from a faster network fabric?
- A. When your workload generates a large amount of output data, significantly larger than the amount of intermediate data
- B. When your workload generates a large amount of intermediate data, on the order of the input data itself
- C. When your workload consumes a large amount of input data, relative to the entire capacity of HDFS
- D. When your workload consists of processor-intensive tasks
NEW QUESTION 12
Which process instantiates user code, and executes map and reduce tasks on a cluster running MapReduce V2 (MRv2) on YARN?
- A. NodeManager
- B. ApplicationMaster
- C. ResourceManager
- D. TaskTracker
- E. JobTracker
- F. DataNode
- G. NameNode
NEW QUESTION 13
You have converted your Hadoop cluster from a MapReduce 1 (MRv1) architecture to a MapReduce 2 (MRv2) on YARN architecture. Your developers are accustomed to specifying the number of map and reduce tasks (resource allocation) when they run jobs. A developer wants to know how to specify the number of reduce tasks when a specific job runs. Which method should you tell that developer to implement?
- A. Developers specify reduce tasks in the exact same way for both MapReduce version 1 (MRv1) and MapReduce version 2 (MRv2) on YARN. Thus, executing -D mapreduce.job.reduces=2 will specify 2 reduce tasks.
- B. In YARN, the ApplicationMaster is responsible for requesting the resources required for a specific job. Thus, executing -D yarn.applicationmaster.reduce.tasks=2 will specify that the ApplicationMaster launch two task containers on the worker nodes.
- C. In YARN, resource allocation is a function of megabytes of memory in multiples of 1024 MB. Thus, they should specify the amount of memory resource they need by executing -D mapreduce.reduce.memory.mb=2048
- D. In YARN, resource allocation is a function of virtual cores specified by the ApplicationMaster making requests to the NodeManager, where a reduce task is handled by a single container (and thus a single virtual core). Thus, the developer needs to specify the number of virtual cores to the NodeManager by executing -p yarn.nodemanager.cpu-vcores=2
- E. MapReduce version 2 (MRv2) on YARN abstracts resource allocation away from the idea of “tasks” into memory and virtual cores, thus eliminating the need for a developer to specify the number of reduce tasks, and indeed preventing the developer from specifying the number of reduce tasks.
NEW QUESTION 14
Your cluster’s mapred-site.xml includes the following parameters
And your cluster’s yarn-site.xml includes the following parameters
What is the maximum amount of virtual memory allocated for each map before YARN will kill its Container?
- A. 4 GB
- B. 17.2 GB
- C. 24.6 GB
- D. 8.2 GB
Explanation: Since the map memory is 4 GB and the ratio of virtual to physical memory is 2:1, a map task can use only about 8 GB of virtual memory; beyond that, its Container is killed.
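The calculation in the explanation can be sketched as a single multiplication: the Container's memory size (set for map tasks by `mapreduce.map.memory.mb`) times the virtual-to-physical ratio (`yarn.nodemanager.vmem-pmem-ratio`). The parameter listings for this question are not shown on this page, so the values below are assumptions taken from the explanation's own figures (4 GB and 2:1); the function name is hypothetical.

```python
def vmem_limit_mb(container_memory_mb: float, vmem_pmem_ratio: float) -> float:
    """Virtual memory a Container may use before it is killed, in MB."""
    return container_memory_mb * vmem_pmem_ratio


# Assumed values, following the explanation's figures rather than the
# (missing) parameter listings.
limit = vmem_limit_mb(container_memory_mb=4096, vmem_pmem_ratio=2.0)
print(limit / 1024, "GB")
```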
NEW QUESTION 15
Assuming a cluster running HDFS, MapReduce version 2 (MRv2) on YARN with all settings at their default, what do you need to do when adding a new slave node to a cluster?
- A. Nothing, other than ensuring that DNS (or /etc/hosts files on all machines) contains an entry for the new node.
- B. Restart the NameNode and ResourceManager daemons and resubmit any running jobs
- C. Increase the value of dfs.number.of.nodes in hdfs-site.xml
- D. Add a new entry to /etc/nodes on the NameNode host.
- E. Restart the NameNode daemon.
NEW QUESTION 16
For each YARN job, the Hadoop framework generates task log files. Where are Hadoop’s task log files stored?
- A. In HDFS, in the directory of the user who generates the job
- B. On the local disk of the slave node running the task
- C. Cached in the YARN container running the task, then copied into HDFS on job completion
- D. Cached by the NodeManager managing the job containers, then written to a log directory on the NameNode
Explanation: Reference: http://hortonworks.com/blog/simplifying-user-logs-management-and-access-in-yarn/
NEW QUESTION 17
You are upgrading a Hadoop cluster from HDFS and MapReduce version 1 (MRv1) to one running HDFS and MapReduce version 2 (MRv2) on YARN. You want to set and enforce a block size of 128MB for all new files written to the cluster after the upgrade. What should you do?
- A. Set dfs.block.size to 128M on all the worker nodes, on all client machines, and on the NameNode, and set the parameter to final.
- B. Set dfs.block.size to 134217728 on all the worker nodes, on all client machines, and on the NameNode, and set the parameter to final.
- C. Set dfs.block.size to 134217728 on all the worker nodes and client machines, and set the parameter to final. You do not need to set this value on the NameNode.
- D. Set dfs.block.size to 128M on all the worker nodes and client machines, and set the parameter to final. You do not need to set this value on the NameNode.
- E. You cannot enforce this, since client code can always override this value.
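The numeric value that appears in the options is 128 MB expressed in bytes, using binary units. A short check of that conversion (the variable names are only for illustration):

```python
MIB = 1024 * 1024             # bytes in one binary megabyte
block_size_bytes = 128 * MIB  # 128 MB block size in bytes
print(block_size_bytes)       # 134217728
```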
100% Valid and Newest Version CCA-505 Questions & Answers shared by Certleader, Get Full Dumps HERE: https://www.certleader.com/CCA-505-dumps.html (New 45 Q&As)