IBM Big Data Architect v7.0

Page:    1 / 8   
Exam contains 110 questions

Company A is searching for a browser-based visualization tool to perform analysis on vast amounts of data in any structure. They want to execute operations such as pivot, slice and dice, among others. Which of the following would meet these requirements?

  • A. Streams
  • B. BigSheets
  • C. Aginity Workbench
  • D. Watson Explorer


Answer : B

Explanation:
References:
http://www.dotgroup.co.uk/wp-content/uploads/2014/11/Harness-the-Power-of-Big-Data-The-IBM-Big-DataPlatform.pdf Page: 132

An upstream Oil and Gas Producer needs to optimize the performance of its assets. It needs to calculate Key Performance Indicators for flow rate sensors deployed to monitor the output flow and temperature and pressure for multiple pipelines in an oil field with hundreds of wells. Which of the following would you recommend to meet these requirements?

  • A. DataStage ETL jobs should be created
  • B. InfoSphere Streams should be used
  • C. Cognos reports would suffice here
  • D. A Hadoop-based data storage engine should be used


Answer : B
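InfoSphere Streams is suited here because KPIs over continuous sensor feeds are computed with windowed aggregations as readings arrive, not after the fact. A minimal, framework-free Python sketch of that kind of rolling-window KPI (the class name and readings are illustrative, not Streams API):

```python
from collections import deque

class SlidingAverage:
    """Rolling average over the last n readings, e.g. a flow-rate KPI."""
    def __init__(self, n):
        self.window = deque(maxlen=n)  # oldest readings fall off automatically

    def add(self, reading):
        self.window.append(reading)
        return sum(self.window) / len(self.window)

kpi = SlidingAverage(3)
averages = [kpi.add(r) for r in [10.0, 12.0, 14.0, 16.0]]
print(averages[-1])  # average of the last 3 readings: (12 + 14 + 16) / 3 = 14.0
```

A streaming engine applies exactly this pattern per sensor, per pipeline, continuously, which is why a batch ETL job or a reporting tool is a poor fit.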

The downside of cloud computing, relative to SLAs, is the difficulty in determining which of the following?

  • A. Root cause for service interruptions
  • B. Turn-Around-Time (TAT)
  • C. Mean Time To Recover (MTTR)
  • D. First Call Resolution (FCR)


Answer : A

Explanation:
References: https://en.wikipedia.org/wiki/Service-level_agreement

The inputs to the Architectural Overview document do NOT include which of the following?

  • A. Architectural Goals
  • B. Key Concepts
  • C. Architectural Overview Diagram
  • D. Component Model


Answer : D

It's helpful to look at the characteristics of big data along certain lines, for example how the data is collected, analyzed, and processed. There are many characteristics to consider.
Which one of the following is NOT a characteristic that should be considered?

  • A. Data frequency and size
  • B. Software
  • C. Data source
  • D. Processing methodology


Answer : B

Explanation:
References: http://www.ibm.com/developerworks/library/bd-archpatterns1/

By default, Parquet uses which of the following codecs?

  • A. SNAPPY
  • B. LZO
  • C. GZIP
  • D. BZIP2


Answer : A

Data Architects assess data according to several common characteristics. One of these characteristics is the type of data. Which of the following are data types that fall into this ontology? (Choose two.)

  • A. Master data
  • B. Meta data
  • C. Boolean
  • D. Composite
  • E. Integer


Answer : A,D

Which of the following requirements would NOT be effectively addressed by a NoSQL data store?

  • A. Scalability
  • B. Reporting
  • C. Sparse data
  • D. Batch processing


Answer : D
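On the sparse-data point: a small illustration of why document-style NoSQL stores handle sparse records well. A relational row must reserve every column (often holding NULL), while a document keeps only the fields actually present (the field names below are made up):

```python
# A relational row carries every column of the schema, even when empty.
relational_row = {"id": 1, "name": "sensor-a", "temp": None,
                  "pressure": None, "flow": 7.2}

# A document store keeps only the fields this record actually has.
document = {k: v for k, v in relational_row.items() if v is not None}

print(len(relational_row), len(document))  # 5 3
```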

An IBM Big Data platform is well suited to deal with which of the following kinds of data types?

  • A. Structured data in row format only
  • B. Semi-structured and unstructured data only
  • C. Text data, sensor data, and audio data only
  • D. Semi-structured, unstructured, and structured data


Answer : D

Reference:
https://www-304.ibm.com/industries/publicsector/fileserve?contentid=239170

A reputable market research firm wants to explore more business opportunities. They have strong in-house skills in Python and machine learning. Their business model is simple: they build solutions for customers using Python and machine-learning algorithms, then hand these solutions to the customers' engineering teams for implementation. Given this scenario, which of the following would you recommend?

  • A. Netezza
  • B. Spark
  • C. Cloudant
  • D. Hadoop


Answer : D

A large global enterprise customer has a Big Data environment set up on Hadoop. After a year in operation they are now looking to extend access to multiple functions that will need different views into different aspects/portions of the data. As you consider these requirements, which of the following statements is TRUE and also applies to the scenario?

  • A. Hadoop does not support multi-tenancy but can easily scale to support it by replicating data to new clusters on commodity hardware.
  • B. Hadoop can support multi-tenancy, but only if YARN is used; if YARN is not already in use, the customer will need to upgrade to a YARN-supported version.
  • C. Virtualization methods can be used to support multi-tenancy on Hadoop because one can easily replicate to an additional cluster whenever there is a need.
  • D. Hadoop can support multi-tenancy by using a cluster file system for storage, allowing all nodes to access the data.


Answer : D

Reference:
http://www-01.ibm.com/support/knowledgecenter/STXKQY/411/com.ibm.spectrum.scale.v4r11.adv.doc/bl1adv_hadoop.htm?lang=en

A large application vendor wants to port their existing distributed applications to run on Hadoop. To be competitive they need to provide monitoring and keep the size of the monitored applications consistent with the configuration, which implies the ability to deploy a replacement for any failed component. Which of the following would be a workable solution?

  • A. Nagios with YARN
  • B. Slider with YARN
  • C. Oozie with Lucene
  • D. OPTIM Performance Manager


Answer : B

A news organization wants to analyze all incoming news stories in real time and make them available to their users based on each user's interests. Given this requirement, which of the following would you recommend?

  • A. Spark
  • B. Hadoop
  • C. Netezza
  • D. Cloudant


Answer : A
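In practice a Spark streaming job would do the heavy lifting at scale; the core matching step it performs per incoming story can be sketched in plain Python (the story and interest data are invented for illustration):

```python
def match_stories(stories, interests):
    """Return the stories whose text mentions any of the user's interests."""
    wanted = {w.lower() for w in interests}
    return [s for s in stories
            if wanted & set(s["text"].lower().split())]

stories = [
    {"id": 1, "text": "Oil prices rise on supply fears"},
    {"id": 2, "text": "Local team wins championship"},
]
matched = match_stories(stories, ["oil", "energy"])
print([s["id"] for s in matched])  # [1]
```

Spark is recommended here because the same matching logic can run continuously over a micro-batch stream and feed its MLlib models for richer interest scoring.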

Company K is designing their Big Data system. They anticipate a big spike of new data, on the order of multiple terabytes, every 9 months. Company policy also dictates that data older than one year is archived, with a major clean-up every 5 years. Cost is also a big issue. Which of the following provides the best design for these requirements?

  • A. Estimate the peak volume over a 5-year period and set up a Hadoop system with commodity hardware and storage to accommodate that volume
  • B. Estimate the peak volume over a 3-year period and set up a Hadoop system with NAS to accommodate the expected volume
  • C. Use Cloud elasticity capabilities to handle the peak and valley data volume
  • D. Use SAN storage with compression to handle the peak and valley data volume


Answer : A

The AQL query language is the easiest and most flexible tool to pull structured output from which of the following?

  • A. Hive data structures
  • B. Unstructured text
  • C. Hbase schemas
  • D. JDBC connected relational data marts


Answer : B

Explanation:
AQL (Annotation Query Language) is the InfoSphere BigInsights Text Analytics language, designed to extract structured output from unstructured text.

Reference:
http://www.ibm.com/developerworks/library/bd-sqltohadoop2/

