Microsoft Certified Azure Data Fundamentals (DP-900) MICROSOFT AZURE FUNDAMENTALS, Exams of Programming Languages

Typology: Exams

2023/2024

Available from 03/24/2024


DP-900: MICROSOFT AZURE FUNDAMENTALS

What is data, and why is data a very important asset? - correct answer Data is a collection of facts such as numbers, descriptions, and observations used in decision making. In this competitive market, data is a valuable asset, and when analyzed properly can turn into a wealth of useful information and inform critical business decisions.

In how many ways can you classify data? - correct answer Three: structured, semi-structured, and unstructured.

What is semi-structured data? - correct answer Semi-structured data is information that doesn't reside in a relational database but still has some structure to it. Examples include documents held in JavaScript Object Notation (JSON) format. There are other types of semi-structured data as well. Examples include key-value stores and graph databases. A key-value store is similar to a relational table, except that each row can have any number of columns. You can use a graph database to store and query information about complex relationships. A graph contains nodes (information about objects) and edges (information about the relationships between objects).

What is unstructured data? - correct answer Not all data is structured or even semi-structured. For example, audio and video files, and binary data files might not have a specific structure. They're referred to as unstructured data.
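The three classifications above can be made concrete with a small Python sketch. All of the customer records, field names, and relationship names here are hypothetical and chosen only for illustration; they do not come from any Azure service.

```python
import json

# Structured: every record has the same fixed set of fields, like a table row.
structured_rows = [
    ("C001", "Ana", "Lisbon"),
    ("C002", "Ben", "Leeds"),
]

# Semi-structured: JSON documents whose fields can vary from one document to the next.
documents = [
    json.loads('{"id": "C001", "name": "Ana", "phones": ["555-0100"]}'),
    json.loads('{"id": "C002", "name": "Ben", "address": {"city": "Leeds"}}'),
]

# Semi-structured (graph): nodes describe objects, edges describe relationships.
nodes = {"ana": {"type": "employee"}, "ben": {"type": "employee"}}
edges = [("ana", "reports_to", "ben")]

# Unstructured: raw bytes with no structure a database can interpret.
audio_clip = bytes([0x52, 0x49, 0x46, 0x46])


def field_names(docs):
    """Return the sorted field names of each document."""
    return [sorted(doc.keys()) for doc in docs]


def managers_of(person, edge_list):
    """Follow 'reports_to' edges that start at the given node."""
    return [dst for src, rel, dst in edge_list if src == person and rel == "reports_to"]


print(field_names(documents))
print(managers_of("ana", edges))
```

The two JSON documents share some fields but not all of them, which is what separates them from the fixed-width tuples in `structured_rows`.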

Azure provides different types of storage services based on the type of data. Is this true? - correct answer True. Depending on whether the data is structured, semi-structured, or unstructured, it will be stored differently. Structured data is typically stored in a relational database such as SQL Server or Azure SQL Database. If you want to store unstructured data such as video or audio files, you can use Azure Blob storage. If you want to store semi-structured data such as documents, you can use a service such as Azure Cosmos DB.

What is provisioning? - correct answer The act of setting up the database server is called provisioning.

You can define several levels of access to your data in Azure. Is this true? - correct answer True. Read-only access means the users can read data but can't modify any existing data or create new data. Read/write access gives users the ability to view and modify existing data. Owner privilege gives full access to the data, including managing security, such as adding new users and removing access for existing users. You can also define which users should be allowed to access the data in the first place.

What are the two kinds of data processing solutions? - correct answer transactional system (OLTP)

The raw data might not be in a format that is suitable for querying. The data might contain anomalies that should be filtered out, or it may require transforming in some way. For example, dates or addresses might need to be converted into a standard format. After data is ingested into a data repository, you may want to do some cleaning operations and remove any questionable or invalid data, or perform some aggregations such as calculating profit, margin, and other Key Performance Indicators (KPIs). KPIs are how businesses are measured for growth and performance.

Data Querying: After data is ingested and transformed, you can query the data to analyze it. You may be looking for trends, or attempting to determine the cause of problems in your systems. Many database management systems provide tools to enable you to perform ad-hoc queries against your data and generate regular reports.

Data Visualization: Data represented in tables as rows and columns, or as documents, isn't always intuitive. Visualizing the data can often be useful as a tool for examining data. You can generate charts such as bar charts, line charts, and pie charts, plot results on geographical maps, or illustrate how data changes over time. Microsoft offers visualization tools like Power BI to provide rich graphical representation of your data.

What is normalization? - correct answer The process of splitting data into a large number of narrow, well-defined tables (a narrow table is a table with few columns), with references from one table to another. However, querying the data often requires reassembling information from multiple tables by joining the data back together at run-time.
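As a minimal sketch of normalization, the Python snippet below uses the standard-library sqlite3 module to split customer and order data into two narrow tables and then reassemble them with a join. The table layout and sample rows are assumptions made up for this illustration, and SQLite stands in for a relational engine such as Azure SQL Database.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Two narrow tables. Orders refers to Customers through CustomerID
# instead of repeating the customer's name on every order row.
conn.executescript("""
    CREATE TABLE Customers (
        CustomerID   INTEGER PRIMARY KEY,
        CustomerName TEXT NOT NULL
    );
    CREATE TABLE Orders (
        OrderID         INTEGER PRIMARY KEY,
        CustomerID      INTEGER NOT NULL REFERENCES Customers(CustomerID),
        QuantityOrdered INTEGER NOT NULL
    );
    INSERT INTO Customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO Orders VALUES (10, 1, 3), (11, 1, 5), (12, 2, 1);
""")

# Reading the data back requires joining the tables at run-time.
rows = conn.execute("""
    SELECT Customers.CustomerName, Orders.QuantityOrdered
    FROM Customers
    JOIN Orders ON Customers.CustomerID = Orders.CustomerID
    ORDER BY Orders.OrderID
""").fetchall()

print(rows)  # [('Ana', 3), ('Ana', 5), ('Ben', 1)]
```

Each customer name is stored once, so a rename touches a single row. The cost is the join needed on every read that wants names next to orders.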

You have a lot of customer data and you have decided to store this data in a relational database. What is the first thing you should do? - correct answer normalization

What are the drawbacks of normalization? - correct answer You split the information into tables. When you read this information, you need to assemble it at run-time by using joins. These queries can sometimes be expensive.

Non-relational databases enable you to store data in a format that more closely matches the original structure. What is the disadvantage of this? - correct answer Some of the data is duplicated in the document database. This duplication not only increases the storage required, but can also make maintenance more complex (you have to modify the data everywhere it appears).

What are ACID principles? - correct answer Atomicity guarantees that each transaction is treated as a single unit, which either succeeds completely, or fails completely. If any of the statements constituting a transaction fails to complete, the entire transaction fails and the database is left unchanged. An atomic system must guarantee atomicity in each and every situation, including power failures, errors, and crashes.

Consistency ensures that a transaction can only take the data in the database from one valid state to another. A consistent database should never lose or create data in a manner that can't be accounted for. In the bank transfer example described earlier, if you add funds to an account, there must be a corresponding deduction of funds somewhere, or a record that describes where the funds have come from if they have been received externally. You can't suddenly create (or lose) money.

Isolation

data. Eventual consistency is ideal where the application doesn't require any ordering guarantees.

What is data processing, and how many kinds are there? - correct answer Data processing is simply the conversion of raw data to meaningful information through a process. Processing data as it arrives is called streaming. Buffering and processing the data in groups is called batch processing.

What are the advantages and disadvantages of batch processing? - correct answer Advantages:

  • Large volumes of data can be processed at a convenient time.
  • It can be scheduled to run at a time when computers or systems might otherwise be idle, such as overnight, or during off-peak hours.

Disadvantages:

  • The time delay between ingesting the data and getting the results.
  • All of a batch job's input data must be ready before a batch can be processed. Even minor data errors, such as typographical errors in dates, can prevent a batch job from running.

A real-estate website tracks a subset of data from consumers' mobile devices and makes real-time recommendations of properties to visit based on their geo-location. How do you process this data? - correct answer streaming

What are the other differences between streaming and batch processing of data? - correct answer Data Scope:

Batch processing can process all the data in the dataset. Stream processing typically only has access to the most recent data received, or within a rolling time window (the last 30 seconds, for example).

Data Size: Batch processing is suitable for handling large datasets efficiently. Stream processing is intended for individual records or micro batches consisting of few records.

Performance: The latency for batch processing is typically a few hours. Stream processing typically occurs immediately, with latency in the order of seconds or milliseconds. Latency is the time taken for the data to be received and processed.

Analysis: You typically use batch processing for performing complex analytics. Stream processing is used for simple response functions, aggregates, or calculations such as rolling averages.

How is data in a relational table organized? - correct answer Rows and columns

What is an example of unstructured data? - correct answer Audio and video files

What is an example of a streaming dataset? - correct answer Data from Twitter feeds

What are the roles in the world of data? - correct answer Azure Database Administrator role
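The difference in data scope between batch and stream processing, described earlier in this answer set, can be sketched in plain Python. This is a simplified illustration under stated assumptions: the readings are made-up numbers, and the rolling window is defined by a count of recent values rather than by elapsed time such as the last 30 seconds.

```python
from collections import deque


def batch_average(readings):
    """Batch: the whole dataset is available, so average all of it at once."""
    return sum(readings) / len(readings)


def rolling_averages(stream, window_size):
    """Stream: after each arrival, average only the most recent values."""
    window = deque(maxlen=window_size)  # older values fall out automatically
    for value in stream:
        window.append(value)
        yield sum(window) / len(window)


readings = [10, 20, 30, 40]  # hypothetical sensor values

print(batch_average(readings))              # 25.0
print(list(rolling_averages(readings, 2)))  # [10.0, 15.0, 25.0, 35.0]
```

The batch function cannot return anything until every reading is present, while the generator produces a result after each new value arrives.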

What are some of the common tools that a data analyst uses? - correct answer Power BI

What are some of the common tools that a data engineer uses? - correct answer sqlcmd utility, Azure Databricks, Azure HDInsight

Which of the following tasks is the role of a database administrator? - correct answer Backup and restore

What are the characteristics of relational data? - correct answer

  • All data is tabular. Entities are modeled as tables, each instance of an entity is a row in the table, and each property is defined as a column.
  • All rows in the same table have the same set of columns.
  • A table can contain any number of rows.
  • A primary key uniquely identifies each row in a table. No two rows can share the same primary key.
  • A foreign key references rows in another, related table. For each value in the foreign key column, there should be a row with the same value in the corresponding primary key column in the other table.

What are the primary key and the foreign key? - correct answer The primary key indicates the column (or combination of columns) that uniquely identifies each row. Every table should have a primary key.

The columns marked FK are Foreign Key columns. They reference, or link to, the primary key of another table, and are used to maintain the relationships between tables. A foreign key also helps to identify and prevent anomalies, such as orders for customers that don't exist in the Customers table.

How do you query relational data? - correct answer Most relational databases support Structured Query Language (SQL). You use SQL to create tables, insert, update, and delete rows in tables, and to query data.

Give an example of SQL. - correct answer
SELECT CustomerID, CustomerName, CustomerAddress
FROM Customers

Why do you use joins in SQL queries? - correct answer You can combine the data from multiple tables in a query using a join operation. A join operation spans the relationships between tables, enabling you to retrieve the data from more than one table at a time. The following query retrieves the name of every customer, together with the product name and quantity for every order they've placed. Notice that each column is qualified with the table it belongs to:
SELECT Customers.CustomerName, Orders.QuantityOrdered, Products.ProductName
FROM Customers
JOIN Orders ON Customers.CustomerID = Orders.CustomerID
JOIN Products ON Orders.ProductID = Products.ProductID

What are the most common use cases of relational databases? - correct answer Examples of OLTP applications that use relational databases are banking solutions, online retail applications, flight reservation systems, and many online purchasing applications.

What is an index? - correct answer When you create an index in a database, you specify a column from the table, and the index contains a copy of this data in a sorted order, with pointers to the corresponding rows in the table. When the user

small configuration changes (changes in network addresses, for example) to take account of the change in environment.

What is PaaS and when should you use it? - correct answer PaaS stands for Platform-as-a-Service. Rather than creating a virtual infrastructure, and installing and managing the database software yourself, a PaaS solution does this for you. You specify the resources that you require (based on how large you think your databases will be, the number of users, and the performance you require), and Azure automatically creates the necessary virtual machines, networks, and other devices for you.

What is the benefit of using a PaaS service, instead of an on-premises system, to run your database management systems? - correct answer Increased scalability. PaaS solutions enable you to scale up and out without having to procure your own hardware.

What are the key characteristics of non-relational data? - correct answer A key aspect of non-relational databases is that they enable you to store data in a very flexible manner. Non-relational databases don't impose a schema on data. Instead, they focus on the data itself rather than how to structure it. This approach means that you can store information in a natural format that mirrors the way in which you would consume, query, and use it.

Non-relational systems such as Azure Cosmos DB (a non-relational database management system available in Azure) support indexing even when the structure of the indexed data can vary from record to record. Is this true? - correct answer True

What are the use cases of non-relational databases? - correct answer IoT and telematics:

These systems typically ingest large amounts of data in frequent bursts of activity. Non-relational databases can store this information very quickly. The data can then be used by analytics services such as Azure Machine Learning, Azure HDInsight, and Microsoft Power BI. Additionally, you can process the data in real-time using Azure Functions that are triggered as data arrives in the database.

Retail and marketing: Microsoft uses Cosmos DB for its own ecommerce platforms that run as part of Windows Store and Xbox Live. It's also used in the retail industry for storing catalog data and for event sourcing in order processing pipelines.

Gaming: The database tier is a crucial component of gaming applications. Modern games perform graphical processing on mobile/console clients, but rely on the cloud to deliver customized and personalized content like in-game stats, social media integration, and high-score leaderboards. Games often require single-millisecond latencies for reads and writes to provide an engaging in-game experience. A game database needs to be fast and be able to handle massive spikes in request rates during new game launches and feature updates.

Web and mobile applications: A non-relational database such as Azure Cosmos DB is commonly used within web and mobile applications, and is well suited for modeling social interactions, integrating with third-party services, and for building rich personalized experiences. The Cosmos DB SDKs (software development kits) can be used to build rich iOS and Android applications using the popular Xamarin framework.

What are the formats of semi-structured data? - correct answer JSON

What are the NoSQL databases? - correct answer NoSQL (non-relational) databases generally fall into four categories: key-value stores, document databases, column family databases, and graph databases.

Key-value store: A key-value store is the simplest (and often quickest) type of NoSQL database for inserting and querying data. Each data item in a key-value store has two elements, a key and a value. The key uniquely identifies the item, and the value holds the data for the item. The value is opaque to the database management system. Items are stored in key order.

Document database: A document database represents the opposite end of the NoSQL spectrum from a key-value store. In a document database, each document has a unique ID, but the fields in the documents are transparent to the database management system. Document databases typically store data in JSON format, but documents could also be encoded using other formats such as XML, YAML, or BSON.

Column family database: A column family database organizes data into rows and columns. Examples of this structure include ORC and Parquet files. In its simplest form, a column family database can appear very similar to a relational database, at least conceptually. The real power of a column family database lies in its denormalized approach to structuring sparse data.

Graph database:

Graph databases enable you to store entities, but the main focus is on the relationships that these entities have with each other. A graph database stores two types of information: nodes that you can think of as instances of entities, and edges, which specify the relationships between nodes. Nodes and edges can both have properties that provide information about that node or edge (like columns in a table). Additionally, edges can have a direction indicating the nature of the relationship.
https://docs.microsoft.com/en-us/learn/modules/explore

What are the characteristics of the key-value store? - correct answer

  • A query specifies the keys to identify the items to be retrieved.

  • You can't search on values. An application that retrieves data from a key-value store is responsible for parsing the contents of the values returned.
  • The value is opaque to the database management system.
  • Write operations are restricted to inserts and deletes.
  • If you need to update an item, you must retrieve the item, modify it in memory (in the application), and then write it back to the database, overwriting the original (effectively a delete and an insert).

What is the use case for the key-value store? - correct answer The focus of a key-value store is the ability to read and write data very quickly. Search capabilities are secondary. A key-value store is an excellent choice for data ingestion, when a large volume of data arrives as a continual stream and must be stored immediately.

You are building a system that monitors the temperature throughout a set of office blocks and sets the air conditioning in each room in each block to maintain a pleasant ambient temperature. Your system has to manage the air conditioning in several thousand buildings spread across the country or region, and each building typically contains at least 100 air-conditioned rooms. What type of NoSQL datastore is most appropriate for capturing the temperature data to enable it to be processed quickly? - correct answer A key-value store
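The key-value characteristics listed above can be sketched as a toy in-memory store in Python. The `KeyValueStore` class, the `update_temperature` helper, and the key naming scheme are all hypothetical and written only for this illustration; they are not the API of any Azure service.

```python
import json


class KeyValueStore:
    """Toy store: values are opaque bytes, and writes are inserts or deletes only."""

    def __init__(self):
        self._items = {}

    def insert(self, key, value: bytes):
        self._items[key] = value

    def delete(self, key):
        self._items.pop(key, None)

    def get(self, key) -> bytes:
        # Lookups are by key only; the store never inspects the value.
        return self._items[key]


def update_temperature(store, key, new_temp):
    """Update = read, modify in the application, then delete and re-insert."""
    reading = json.loads(store.get(key))  # the application parses the value
    reading["temp_c"] = new_temp
    store.delete(key)
    store.insert(key, json.dumps(reading).encode("utf-8"))


store = KeyValueStore()
store.insert("building-7/room-12", json.dumps({"temp_c": 21.5}).encode("utf-8"))
update_temperature(store, "building-7/room-12", 22.0)

print(json.loads(store.get("building-7/room-12")))  # {'temp_c': 22.0}
```

Because the store treats each value as opaque bytes, finding every room above a given temperature would mean reading and parsing every item in the application, which is why search is secondary for this kind of store.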

transform, and load steps can be performed as a continuous pipeline of operations. It is suitable for systems that only require simple models, with little dependency between items.

ELT is an abbreviation of Extract, Load, and Transform. The process differs from ETL in that the data is stored before being transformed. The data processing engine can take an iterative approach, retrieving and processing the data from storage, before writing the transformed data and models back to storage. ELT is more suitable for constructing complex models that depend on multiple items in the database, often using periodic batch processing.

What is Reporting? - correct answer Reporting is the process of organizing data into informational summaries to monitor how different areas of an organization are performing. Reporting helps companies monitor their online business, and know when data falls outside of expected ranges. Good reporting should raise questions about the business from its end users. Reporting shows you what has happened, while analysis focuses on explaining why it happened and what you can do about it.

What is Business Intelligence? - correct answer The term Business Intelligence (BI) refers to technologies, applications, and practices for the collection, integration, analysis, and presentation of business information. The purpose of business intelligence is to support better decision making.

What is Data Visualization? - correct answer Data visualization is the graphical representation of information and data. By using visual elements like charts, graphs, and maps, data visualization tools provide an accessible way to spot and understand trends, outliers, and patterns in data.

What are the most common forms of visualizations? - correct answer Bar and column charts:

Bar and column charts enable you to see how a set of variables changes across different categories.

Line charts: Line charts emphasize the overall shape of an entire series of values, usually over time.

Matrix: A matrix visual is a tabular structure that summarizes data. Often, report designers include matrixes in reports and dashboards to allow users to select one or more elements (rows, columns, cells) in the matrix to cross-highlight other visuals on a report page.

Key influencers: A key influencer chart displays the major contributors to a selected result or value. Key influencers are a great choice to help you understand the factors that influence a key metric.

Treemap: Treemaps are charts of colored rectangles, with size representing the relative value of each item. They can be hierarchical, with rectangles nested within the main rectangles.

Scatter: A scatter chart shows the relationship between two numerical values. A bubble chart is a scatter chart that replaces data points with bubbles, with the bubble size representing an additional third data dimension.