IBM Big Data Engineer v1.0

Page:    1 / 4   
Exam contains 53 questions

Which tool included with BigInsights can extract structured data from various databases into a "sandbox" location without writing code?

  • A. Flume
  • B. Data Click
  • C. DataStage
  • D. Big SQL Load


Answer : B

What is Flume?

  • A. A distributed filesystem
  • B. A platform for executing MapReduce jobs
  • C. A programming language that translates high-level queries into map tasks and reduce tasks
  • D. A service for moving large amounts of data around a cluster soon after the data is produced.


Answer : D

Reference: https://www.ibm.com/support/knowledgecenter/en/SSPT3X_4.1.0/com.ibm.swg.im.infosphere.biginsights.product.doc/doc/bi_flume.html

Besides PDF and MS Word, which document formats are available when exporting a redacted document using the Optim Review Tool?

  • A. TIFF and CSF
  • B. TIFF and PNG
  • C. JPEG and PNG
  • D. Plain Text and CSV


Answer : A

Reference: https://www-01.ibm.com/software/info/channel-solution-profiles/Information_Graphics_Redact-It.html

PCI compliance requirements allow the use of real customer data during testing and development only when:

  • A. The data is only processed in memory
  • B. Customer data is never allowed in testing or development
  • C. The data is never stored longer than 24 hours in the test system
  • D. The data is only stored in volatile storage that expires on power loss


Answer : B

A large bank was planning to offload existing data from a data warehouse into Hadoop and use SQL queries to access historical data. Which one of the following statements is true for using HiveQL?

  • A. It supports four logical operators in query predicates: IN, NOT IN, EXISTS, and NOT EXISTS
  • B. It does not support nested sub-queries
  • C. Hive supports all ANSI SQL 2011 syntax
  • D. All of the above


Answer : A

Reference: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_data-access/content/hive-013-feature-subqueries-in-where-clauses.html

Which of the following is TRUE about a Resilient Distributed Dataset?

  • A. It is always mutable
  • B. It is always immutable
  • C. It can be configured to be either mutable or immutable
  • D. It can be changed from mutable to immutable state during its life cycle


Answer : B

Reference: http://beyondcorner.com/learn-apache-spark/spark-rdd-resilient-distributed-datasets/

When you configure a MapReduce job, the inputs can include:

  • A. A single file
  • B. Paths to one or more directories
  • C. A file pattern (e.g., mypath/*.csv)
  • D. All of the above


Answer : D
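
The three input forms listed in this question can be illustrated with a rough local-filesystem analogy. This is a minimal sketch, not the Hadoop API: the helper name resolve_inputs and the temporary files are hypothetical, and glob expansion on a local disk only approximates how a job's input paths are resolved on a cluster.

```python
import glob
import os
import tempfile


def resolve_inputs(spec):
    """Expand one input spec (file, directory, or glob pattern) to a sorted file list.

    Hypothetical helper: it mimics on a local disk how a job's input
    paths could be expanded. It does not call any Hadoop API.
    """
    if os.path.isdir(spec):
        # A directory contributes every regular file directly inside it.
        return sorted(
            os.path.join(spec, name)
            for name in os.listdir(spec)
            if os.path.isfile(os.path.join(spec, name))
        )
    # A plain file path matches itself; a pattern expands to its matches.
    return sorted(glob.glob(spec))


def demo():
    """Create three sample files and resolve each kind of input spec."""
    with tempfile.TemporaryDirectory() as root:
        for name in ("a.csv", "b.csv", "notes.txt"):
            with open(os.path.join(root, name), "w") as handle:
                handle.write("x\n")
        single = resolve_inputs(os.path.join(root, "a.csv"))
        directory = resolve_inputs(root)
        pattern = resolve_inputs(os.path.join(root, "*.csv"))
        return len(single), len(directory), len(pattern)
```

Calling demo() resolves one file for the single path, three for the directory, and two for the *.csv pattern.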

How are insights derived from Big Match moved to an MDM system?

  • A. Extract insights from HBase and load into MDM through an API call
  • B. Extract insights from Hive and load into MDM using standard tooling
  • C. Extract insights from HDFS and load into MDM by stimulating delta load
  • D. Extract insights from HBase and load into MDM using standard MDM batch processing tool


Answer : C

Which of the following Pig Latin expressions is used to sum a set of numbers in a bag?

  • A. X = FOR EACH C -> (group,SUM(A.a1))
  • B. X = FOR EACH C GEN [group, SUM (A.a1)];
  • C. X = FOREACH C GENERATE group, SUM (A.a1);
  • D. X = FOREACH C GENERATE FLATTEN SUM (A.a1);


Answer : C

Reference: https://github.com/rjurney/Cloud-Stenography/blob/master/CloudStenography/pig-0.3.0/src/docs/src/documentation/content/xdocs/piglatin.xml
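
For readers who do not know Pig Latin, the FOREACH ... GENERATE group, SUM(A.a1) form can be mirrored in plain Python. This is a sketch under stated assumptions: relation A is taken to be a list of (key, a1) tuples and C is taken to be the result of grouping A by that key; the grouping key and the function name group_and_sum are hypothetical, since the question does not show how C was built.

```python
from collections import defaultdict


def group_and_sum(rows):
    """Sum the a1 values of each group.

    rows: iterable of (key, a1) tuples standing in for relation A.
    """
    bags = defaultdict(list)
    for key, a1 in rows:
        # Rough equivalent of: C = GROUP A BY key;
        bags[key].append(a1)
    # Rough equivalent of: X = FOREACH C GENERATE group, SUM(A.a1);
    return {key: sum(values) for key, values in bags.items()}
```

Each dictionary entry pairs a group key with the sum of the values in its bag, which is the same shape as the (group, sum) tuples the Pig statement emits.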

Which of the following statements about MapReduce is true?

  • A. MapReduce source programs must be written in Java
  • B. The output from MapReduce is one or more files stored in the DFS
  • C. MapReduce programs always have four phases: Mapper, Shuffle, Combiner, and Reducer
  • D. Intermediate files, sent from Map tasks to Reduce tasks, are replicated with the number of copies equal to the number of Reducers


Answer : B
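
The flow from map tasks through the shuffle to per-reducer output files can be sketched as a toy word count in plain Python. This is an in-memory illustration only, not Hadoop code: the function names are hypothetical, the byte-sum partitioner is an arbitrary choice made for determinism, and the dictionary keyed by part-r-NNNNN names merely stands in for the output files that reducers write to the distributed filesystem.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.split():
            yield word, 1


def shuffle(pairs, num_reducers):
    """Route each key to one reducer partition and group its values."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        # Toy deterministic partitioner (an assumption, not Hadoop's).
        index = sum(key.encode("utf-8")) % num_reducers
        partitions[index][key].append(value)
    return partitions


def reduce_phase(partitions):
    """Sum the values per key; one output 'file' per reducer."""
    outputs = {}
    for index, partition in enumerate(partitions):
        name = f"part-r-{index:05d}"
        outputs[name] = {key: sum(values) for key, values in sorted(partition.items())}
    return outputs


def word_count(lines, num_reducers=2):
    """Run the three toy phases end to end."""
    return reduce_phase(shuffle(map_phase(lines), num_reducers))
```

With two reducers the result contains two entries, one per reducer, which mirrors the point that a job produces one or more output files.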

Which BigInsights tool is able to export data from Big SQL?

  • A. Ambari
  • B. BigSheets
  • C. Flume
  • D. Knox


Answer : B

Reference: https://www.ibm.com/support/knowledgecenter/en/SSCRJT_5.0.1/com.ibm.swg.im.bigsql.tut.doc/doc/less_bsql_exptobigsh.html

Consider the following query:
curl "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=foundation&fl=id"
What is the restricted field?

  • A. id
  • B. json
  • C. indent
  • D. foundation


Answer : A

Reference: https://lucene.apache.org/solr/6_3_0/quickstart.html
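
The roles of the query parameters can be made explicit by building and then parsing the same request in Python. In Solr, q carries the search term and fl (field list) restricts which fields are returned. The sketch below only constructs and inspects the URL and does not contact a server; the function names are hypothetical, and the base URL assumes the standard /solr/ path with the gettingstarted collection from the quickstart.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed base URL: standard /solr/ path and the quickstart collection.
BASE = "http://localhost:8983/solr/gettingstarted/select"


def build_query(term, fields):
    """Build a select URL that searches for term and returns only fields."""
    params = {
        "wt": "json",             # response format
        "indent": "true",         # pretty-print the response
        "q": term,                # the search term
        "fl": ",".join(fields),   # field list: restricts returned fields
    }
    return BASE + "?" + urlencode(params)


def describe(url):
    """Split a select URL back into its search term and returned fields."""
    query = parse_qs(urlsplit(url).query)
    return {
        "search_term": query["q"][0],
        "returned_fields": query["fl"][0].split(","),
    }
```

For the query in this question, describe(build_query("foundation", ["id"])) reports foundation as the search term and id as the only returned field.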

Which tool below can be used for extracting data directly from an RDBMS and placing a copy within BigInsights as a ready-to-query table?

  • A. Flume
  • B. Sqoop
  • C. NZ Load
  • D. Distributed Copy


Answer : B

Reference: https://www.ibm.com/support/knowledgecenter/da/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.import.doc/doc/data_warehouse_sqoop.html

Which format would be best for holding semi-structured data?

  • A. Text
  • B. Avro
  • C. CSV
  • D. JSON


Answer : D
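
What makes data semi-structured can be shown with a small Python sketch: each JSON record is self-describing, so records in one collection may carry different or nested fields without a fixed schema. The sample records and the function name field_names are invented for illustration.

```python
import json

# Invented sample data: two records with different, partly nested fields.
RAW = """[
  {"id": 1, "name": "Ada", "phones": ["555-0100"]},
  {"id": 2, "name": "Lin", "address": {"city": "Oslo"}}
]"""


def field_names(raw):
    """Return the sorted top-level field names of each JSON record."""
    records = json.loads(raw)
    return [sorted(record) for record in records]
```

Here the first record has a phones list and the second has a nested address object, yet both parse without any schema declared in advance.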

What does the acronym "PCI" stand for in the phrase "PCI compliant"?

  • A. Payment Card Industry
  • B. Personal Credit & Income
  • C. Premium Credit Inspection
  • D. Proactive Controls Implementation


Answer : A

Reference: https://www.onr.com/education-documentation/what-is-pci-compliance/
