Microsoft Certified Azure Data Fundamentals (DP-900), Exams of Programming Languages


Typology: Exams

2023/2024

Available from 03/24/2024

DP-900

What three main types of workload can be found in a typical modern data warehouse? - correct answer
- Streaming Data
- Batch Data
- Relational Data

A ____________________ is a continuous flow of information, where continuous does not necessarily mean regular or constant. - correct answer data stream

__________________________ focuses on moving and transforming data at rest. - correct answer Batch processing

This data is usually well organized and easy to understand. Data stored in relational databases is an example, where table rows and columns represent entities and their attributes. - correct answer Structured Data

This data usually does not come from relational stores, since even if it could have some sort of internal organization, it is not mandatory. Good examples are XML and JSON files. - correct answer Semi-structured Data

Data with no explicit data model falls in this category. Good examples include binary file formats (such as PDF, Word, MP3, and MP4), emails, and tweets. - correct answer Unstructured Data

What type of analysis answers the question "What happened?" - correct answer Descriptive Analysis

What type of analysis answers the question "Why did it happen?" - correct answer Diagnostic Analysis
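The structured vs. semi-structured distinction above can be seen in a few lines of Python (the sample records are hypothetical, just for illustration):

```python
import json

# Structured: fixed schema, every "row" has the same columns,
# like rows in a relational table of entities and attributes.
rows = [
    ("1", "Contoso", "Seattle"),
    ("2", "Fabrikam", "London"),
]

# Semi-structured: JSON documents carry their own optional structure;
# fields may vary from one document to the next.
docs = [
    json.loads('{"id": 1, "name": "Contoso", "tags": ["retail"]}'),
    json.loads('{"id": 2, "name": "Fabrikam"}'),  # no "tags" field at all
]

print(docs[0].get("tags"))  # ['retail']
print(docs[1].get("tags"))  # None: missing fields are normal here
```

A relational engine would reject a row with a missing column; a document store simply stores what it is given, which is exactly why XML and JSON count as semi-structured.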


What type of analysis answers the question "What will happen?" - correct answer Predictive Analysis

What type of analysis answers the question "How can we make it happen?" - correct answer Prescriptive Analysis

The two main kinds of workloads are ______________ and _________________. - correct answer extract-transform-load (ETL), extract-load-transform (ELT)

______ is a traditional approach and has established best practices. It is more commonly found in on-premises environments since it was around before cloud platforms. It is a process that involves a lot of data movement, which is something you want to avoid on the cloud if possible due to its resource-intensive nature. - correct answer ETL

________ seems similar to ETL at first glance but is better suited to big data scenarios since it leverages the scalability and flexibility of MPP engines like Azure Synapse Analytics, Azure Databricks, or Azure HDInsight. - correct answer ELT

_______________ is a cloud service that lets you implement, manage, and monitor a cluster for Hadoop, Spark, HBase, Kafka, Storm, Hive LLAP, and ML Service in an easy and effective way. - correct answer Azure HDInsight

_____________ is a cloud service from the creators of Apache Spark, combined with a great integration with the Azure platform. - correct answer Azure Databricks

____________ is the new name for Azure SQL Data Warehouse, but it extends it in many ways. It aims to be a comprehensive analytics platform, from data ingestion to presentation, bringing together one-click data exploration, robust pipelines, enterprise-grade database service, and report authoring. - correct answer Azure Synapse Analytics
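The ETL/ELT cards above boil down to the order of operations. A minimal sketch, where extract(), transform(), and load() are hypothetical stand-ins rather than any Azure API:

```python
# Hypothetical pipeline steps, only to show the ordering difference.

def extract():
    return [{"id": 1, "ssn": "123-45-6789"}, {"id": 2, "ssn": "987-65-4321"}]

def transform(rows):
    # e.g. strip sensitive data before it reaches the analytical store
    return [{"id": r["id"]} for r in rows]

def load(rows, store):
    store.extend(rows)
    return store

# ETL: data is transformed *before* it lands in the target store.
etl_store = load(transform(extract()), [])

# ELT: raw data lands first; the target engine (e.g. an MPP engine)
# transforms it in place afterwards.
elt_store = load(extract(), [])
elt_store = transform(elt_store)

print(etl_store == elt_store)  # True: same result, different ordering
```

The practical consequence, as the cards note: ETL lets you filter sensitive data before loading, while ELT pushes the heavy transformation work onto a scalable engine after a cheap raw load.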

2 - B

Extract, Transform, Load (ETL) is the correct approach when you need to filter sensitive data before loading the data into an analytical model. It is suitable for simple data models that do not require Azure Data Lake support. Extract, Load, Transform (ELT) is the correct approach when it must support Azure Data Lake as the data store and manage large volumes of data.

The technique that provides recommended actions that you should take to achieve a goal or target is called _____________ analytics. A. descriptive B. diagnostic C. predictive D. prescriptive - correct answer D. prescriptive

Match each database object to its purpose. A. Tables B. Indexes C. Views D. Keys
1. Create relationships.
2. Improve processing speed for data searches.
3. Store instances of entities as rows.
4. Display data from predefined queries. - correct answer 1 - D, 2 - B, 3 - A, 4 - C
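The index row in the matching answer above (2 - B: indexes improve processing speed for data searches) can be made concrete with a tiny dictionary-based sketch; the table data is made up for illustration:

```python
# An index is a lookup structure mapping a key value to row positions,
# so a query can jump to matching rows instead of scanning the table.
table = [
    {"id": 10, "name": "widget"},
    {"id": 20, "name": "gadget"},
    {"id": 30, "name": "widget"},
]

# Build a (non-unique) index on "name": value -> list of row positions.
index = {}
for pos, row in enumerate(table):
    index.setdefault(row["name"], []).append(pos)

# Indexed lookup: O(1) hash probe plus direct row access.
hits = [table[pos] for pos in index.get("widget", [])]
print([r["id"] for r in hits])  # [10, 30]
```

A real database index (typically a B-tree) also supports range scans, but the core trade is the same: extra storage and write cost in exchange for fast reads.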

The process of splitting an entity into more than one table to reduce data redundancy is called: _____________. A. deduplication B. denormalization C. normalization D. optimization - correct answer C. normalization

Azure SQL Database is an example of ________________-as-a-service. A. platform B. infrastructure C. software D. application - correct answer A. platform

Match each tool to its use. A. Azure Data Studio B. Azure Query editor C. SQL Server Data Tools
1. Query data while working within a Visual Studio project.
2. Query data located in a non-Microsoft platform.
3. Query data from within the Azure portal. - correct answer 1 - C, 2 - A, 3 - B

The act of increasing or decreasing the resources that are available for a service is called: _____________. A. computing
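Normalization, the answer to the first question above, is easy to see in miniature. A sketch with hypothetical order data, splitting one redundant table into two:

```python
# Denormalized rows: the customer's city is repeated on every order.
orders_denorm = [
    {"order_id": 1, "customer": "Contoso", "city": "Seattle"},
    {"order_id": 2, "customer": "Contoso", "city": "Seattle"},
    {"order_id": 3, "customer": "Fabrikam", "city": "London"},
]

# Normalization: split into a customers table and an orders table
# that references customers by key, so each city is stored once.
customers = {}
orders = []
for row in orders_denorm:
    customers.setdefault(row["customer"], row["city"])
    orders.append({"order_id": row["order_id"], "customer": row["customer"]})

print(customers)  # {'Contoso': 'Seattle', 'Fabrikam': 'London'}
print(len(orders))  # 3
```

If Contoso moves, the normalized design updates one row instead of every order, which is exactly the data-integrity benefit the flashcard describes.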

You have data that consists of JSON-based documents. You need to store the data in an Azure environment that supports efficient non-key, field-based searching. You should use _______________________ as the data store. A. Azure Table Storage B. Azure Blob Storage C. Azure File Storage D. Azure Cosmos DB - correct answer D. Azure Cosmos DB

You need to create a graph database. Which Azure data service should you use? A. Azure Table B. Azure Cosmos DB C. Azure Blob D. Azure File - correct answer B. Azure Cosmos DB
Only Azure Cosmos DB supports creating graph databases. Azure Table Storage, Azure Blob Storage, and Azure File Storage do not support graph databases.

You use Azure Table Storage as a non-relational data store. You need to optimize data retrieval. You should use ______________________________ as query criteria. A. only partition keys B. only row keys C. partition keys and row keys D. only properties - correct answer C. partition keys and row keys

You need to use JavaScript Object Notation (JSON) files to provision Azure storage. What should you use?

A. Azure portal B. Azure command-line interface (CLI) C. Azure PowerShell D. Azure Resource Manager (ARM) templates - correct answer D. Azure Resource Manager (ARM) templates

For which reason should you deploy a data warehouse? A. Record daily sales transactions. B. Perform sales trend analyses. C. Print sales orders. D. Search status of sales orders. - correct answer B. Perform sales trend analyses.

Which two Azure data services support Apache Spark clusters? Each correct answer presents a complete solution. A. Azure Synapse Analytics B. Azure Cosmos DB C. Azure Databricks D. Azure Data Factory - correct answer A. Azure Synapse Analytics, C. Azure Databricks

You design a data ingestion and transformation solution by using Azure Data Factory service. You need to get data from an Azure SQL database. Which two resources should you use? Each correct answer presents part of the solution.
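The partition-key/row-key answer above reflects how Table-Storage-style stores index entities. A sketch with made-up data shows why both keys beat a property filter:

```python
# Entities addressed by (partition_key, row_key) allow a direct lookup;
# a filter on any other property has to scan every entity.
table = {
    ("sales", "0001"): {"product": "widget", "qty": 3},
    ("sales", "0002"): {"product": "gadget", "qty": 1},
    ("returns", "0001"): {"product": "widget", "qty": 1},
}

# Point query: both keys -> direct hash lookup, no scan.
entity = table[("sales", "0002")]

# Property-only query: must examine every entity in the table.
widgets = [e for e in table.values() if e["product"] == "widget"]

print(entity["product"], len(widgets))  # gadget 2
```

The real service behaves analogously: partition key narrows the query to one partition server, row key identifies the entity within it, and anything else degrades to a scan.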

If you design the report by using Report Builder as a paginated report, all records print. Only a paginated report supports repeatable headers and footers. You cannot create paginated reports by using Power BI visuals; you must use Report Builder instead.

How is data in a relational table organized?
Rows and Columns
Header and Footer
Pages and Paragraphs - correct answer Rows and Columns

Which of the following is an example of unstructured data?
An Employee table with columns EmployeeID, EmployeeName, and EmployeeDesignation
Audio and Video files
A table within a relational database - correct answer Audio and Video files

What is a data warehouse?
A non-relational database optimized for read and write operations
A relational database optimized for read operations
A storage location for unstructured data files - correct answer A relational database optimized for read operations

A ____________________ is responsible for the design, implementation, maintenance, and operational aspects of on-premises and cloud-based database systems. They're responsible for the overall availability and consistent performance and optimizations of databases. - correct answer database administrator

A _____________________ collaborates with stakeholders to design and implement data-related workloads, including data ingestion pipelines, cleansing and transformation activities, and data stores for analytical workloads. They use a wide range of data platform technologies, including relational and non-relational databases, file stores, and data streams. - correct answer data engineer

A ________________________ enables businesses to maximize the value of their data assets. They're responsible for exploring data to identify trends and relationships, designing and building analytical models, and enabling advanced analytics capabilities through reports and visualizations. - correct answer data analyst

Match each service to its description. A. Azure SQL Database B. Azure SQL Managed Instance C. Azure SQL VM
1 - a virtual machine with an installation of SQL Server, allowing maximum configurability with full management responsibility.
2 - a hosted instance of SQL Server with automated maintenance, which allows more flexible configuration than Azure SQL DB but with more administrative responsibility for the owner.
3 - a fully managed platform-as-a-service (PaaS) database hosted in Azure. - correct answer 1 - C, 2 - B, 3 - A

A. Azure Database for MySQL B. Azure Database for MariaDB C. Azure Database for PostgreSQL

Data engineers can use ________________ to create a unified data analytics solution that combines data ingestion pipelines, data warehouse storage, and data lake storage through a single service. - correct answer Azure Synapse Analytics

Azure ________________ is an Azure-integrated version of the popular __________________ platform, which combines the Apache Spark data processing platform with SQL database semantics and an integrated management interface to enable large-scale data analytics. - correct answer Databricks

____________________ is a real-time stream processing engine that captures a stream of data from an input, applies a query to extract and manipulate data from the input stream, and writes the results to an output for analysis or further processing. - correct answer Azure Stream Analytics

Azure _________________ is a standalone service that offers the same high-performance querying of log and telemetry data as the Azure Synapse ________________ runtime in Azure Synapse Analytics. - correct answer Data Explorer

____________________ provides a solution for enterprise-wide data governance and discoverability. You can use __________________ to create a map of your data and track data lineage across multiple data sources and systems, enabling you to find trustworthy data for analysis and reporting. - correct answer Azure Purview

Microsoft __________________ is a platform for analytical data modeling and reporting that data analysts can use to create and share interactive data visualizations. - correct answer Power BI

Which one of the following tasks is the responsibility of a database administrator?

Backing up and restoring databases
Creating dashboards and reports
Creating pipelines to process data in a data lake - correct answer Backing up and restoring databases

Which role is most likely to use Azure Data Factory to define a data pipeline for an ETL process?
Database Administrator
Data Engineer
Data Analyst - correct answer Data Engineer

Which single service would you use to implement data pipelines, SQL analytics, and Spark analytics?
Azure SQL Database
Microsoft Power BI
Azure Synapse Analytics - correct answer Azure Synapse Analytics

In a relational database, you model collections of entities from the real world as ______________. - correct answer tables

_____________ is a term used by database professionals for a schema design process that minimizes data duplication and enforces data integrity. - correct answer Normalization

SQL statements are grouped into what three main logical groups? - correct answer Data Definition Language (DDL), Data Control Language (DCL), Data Manipulation Language (DML)
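The three SQL groups in the last answer can be illustrated by classifying statements on their leading keyword. A sketch using the standard groupings (the keyword sets are representative, not exhaustive; DENY is T-SQL specific):

```python
# The three logical groups of SQL statements, keyed by leading keyword.
SQL_GROUPS = {
    "DDL": {"CREATE", "ALTER", "DROP"},               # define structure
    "DCL": {"GRANT", "DENY", "REVOKE"},               # control permissions
    "DML": {"SELECT", "INSERT", "UPDATE", "DELETE"},  # work with data
}

def classify(statement):
    """Return the logical group of a SQL statement, or None if unknown."""
    keyword = statement.strip().split()[0].upper()
    for group, keywords in SQL_GROUPS.items():
        if keyword in keywords:
            return group
    return None

print(classify("INSERT INTO t VALUES (1)"))    # DML
print(classify("GRANT SELECT ON t TO alice"))  # DCL
print(classify("ALTER TABLE t ADD c INT"))     # DDL
```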

What is an index?
A structure that enables queries to locate rows in a table quickly
A virtual table based on the results of a query
A pre-defined SQL statement that modifies data - correct answer A structure that enables queries to locate rows in a table quickly

A SQL engine that is optimized for Internet-of-Things (IoT) scenarios that need to work with streaming time-series data. - correct answer Azure SQL Edge

Which deployment option offers the best compatibility when migrating an existing SQL Server on-premises solution?
Azure SQL Database (single database)
Azure SQL Database (elastic pool)
Azure SQL Managed Instance - correct answer Azure SQL Managed Instance

Which of the following statements is true about Azure SQL Database?
Most database maintenance tasks are automated
You must purchase a SQL Server license
It can only support one database - correct answer Most database maintenance tasks are automated

Which database service is the simplest option for migrating a LAMP application to Azure?
Azure SQL Managed Instance
Azure Database for MySQL
Azure Database for PostgreSQL - correct answer Azure Database for MySQL

_________________ is the online Microsoft OLAP platform, which you can use to perform data analysis and manage huge volumes of information from different points of view. - correct answer Azure Synapse Analytics

You are designing an application that will store petabytes of medical imaging data. When the data is first created, the data will be accessed frequently during the first week. After one month, the data must be accessible within 30 seconds, but files will be accessed infrequently. After one year, the data will be accessed infrequently but must be accessible within five minutes. You need to select a storage strategy for the data. The solution must minimize costs. Which storage tier should you use for each time frame? If you would like to access the data after the first month, which storage strategy will you use?
Archive
Hot
Cool - correct answer Cool
First week: Hot - optimized for storing data that is accessed frequently.
After one month: Cool - optimized for storing data that is infrequently accessed and stored for at least 30 days.
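The tiering reasoning in the scenario above can be sketched as a small decision function. The thresholds are taken from the scenario itself (frequent access in the first month, a five-minute retrieval requirement later); the hour figure for Archive rehydration is an assumption standing in for "hours, not minutes":

```python
# Sketch of the tier choice: Hot while access is frequent, Cool after;
# Archive is ruled out whenever retrieval must be faster than its
# hours-long rehydration (the five-minute requirement in the scenario).
def choose_tier(age_days, max_latency_minutes):
    if age_days < 30:
        return "Hot"       # frequent access in the first month
    if max_latency_minutes < 60:
        return "Cool"      # infrequent access, but fast retrieval needed
    return "Archive"       # cheapest, tolerates hours of rehydration

print(choose_tier(3, 1))     # Hot
print(choose_tier(45, 0.5))  # Cool
print(choose_tier(400, 5))   # Cool: five-minute access rules out Archive
```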

Azure provides several solutions for working with CSV and JSON files, depending on your needs. The primary landing place for these files is either Azure Storage or Azure Data Lake Store. Azure Data Lake Storage is storage optimized for big data analytics workloads; for hierarchical folder structures, Data Lake Storage is the choice.

You are designing an Azure Cosmos DB database that will support vertices and edges. Which Cosmos DB API should you include in the design?
SQL
Cassandra
Gremlin
Table - correct answer Gremlin
The Azure Cosmos DB Gremlin API can be used to store massive graphs with billions of vertices and edges.

You are designing a data storage solution for a database that is expected to grow to 50 TB. The usage pattern is singleton inserts, singleton updates, and reporting. Which storage solution should you use?
Azure SQL Database elastic pools
Azure SQL Data Warehouse
Azure Cosmos DB that uses the Gremlin API
Azure SQL Database Hyperscale - correct answer Azure SQL Database Hyperscale
A Hyperscale database is an Azure SQL database in the Hyperscale service tier that is backed by the Hyperscale scale-out storage technology. A Hyperscale database supports up to 100 TB of data and provides high throughput and performance, as well as rapid scaling to adapt to the workload requirements. Scaling is transparent to the application; connectivity, query processing, and so on work like any other Azure SQL database.

A company plans to use Apache Spark analytics to analyze intrusion detection data. You need to recommend a solution to monitor network and system activities for malicious activities and policy violations. Reports must be produced in an electronic format and sent to management. The solution must minimize administrative efforts. What should you recommend?
Azure Data Factory
Azure Data Lake
Azure Databricks
Azure HDInsight - correct answer Azure Databricks
Recommendation engines, churn analysis, and intrusion detection are common scenarios that many organizations are solving across multiple industries. They require machine learning, streaming analytics, and massive amounts of data processing that can be difficult to scale without the right tools. Companies like Lennox International, E.ON, and renewables.AI are just a few examples of organizations that have deployed Apache Spark™ to solve these challenges using Microsoft Azure Databricks.

Which type of analytics helps answer questions about what has happened in the past?
Descriptive analytics
Prescriptive analytics
Predictive analytics
Cognitive analytics - correct answer Descriptive analytics

Which of the following is based on column family database?