



Discusses different data visualization tools and security techniques used with such data, specifically in healthcare.
Typology: Assignments
Assignment #7 (100 points)

Objectives of this assignment:
Demonstrate an understanding of the fundamentals of data visualization and practice communicating with data.
Demonstrate the ability to perform analyses using specialized big data analytics tools, e.g. Tableau.
Demonstrate techniques for maintaining data security and privacy for massive storage.

Instructions to submit this assignment:
1) Download/install the free version of Tableau, “Tableau Public”, from this link: https://public.tableau.com/en-us/s/download
Answer all the following questions (using your own words) and upload the document in PDF format to the corresponding D2L Dropbox.

Questions to be answered:

(10 points) Q1: Discuss two popular big data visualization tools.

Data visualization involves creating a visual representation of a dataset to make it easier to understand. This is important because it can reveal patterns that might otherwise go unnoticed. Data can be depicted in a number of ways, including pie charts, infographics, maps, and bar charts. Data visualization tools provide a platform where a user can input data and view it visually using the options the tool provides.

One of the more popular tools is Tableau. It is easy-to-use software into which data from numerous sources can be imported. It provides features such as clustering and calculations that can be applied to the data. It has an interactive interface and can convert data into “interactive graphics”. In addition to the built-in features, users can create their own calculations and apply them to the dataset, which allows customization based on the user's needs.

Another popular data visualization tool is Plotly. This tool focuses on graphs. Data is imported and can then be rendered as different types of interactive charts and maps. A unique feature of this tool is that each graph is assigned its own URL. This allows for easy sharing and gives others interactive information about how the graph was created. This is useful because other users can better understand the information being shared without having to sift through code (Costa 2022).

(40 points) Q2: After reading the attached article, answer the following two questions:

2.1 Summarize two technologies that have been discussed in the article for protecting the security and privacy of healthcare data.

One technology/method discussed in the article for protecting the security of healthcare data is access control. This method limits what a user can do in a network after they have been authenticated. It assigns privileges to each user based on permissions granted either by a patient or by a “trusted third party”, and these privileges allow the user to access protected health information. This controls what users can see, giving them access only to the data they need and nothing more. It also helps to limit the number of users moving through a network (Abouelmehdi et al., 2017).

Another method mentioned in the article is encryption. This is a very popular type of data protection in the healthcare industry, in which private information is encoded so that it is unreadable to anyone without the decryption key. This limits the number of people who have access to the data and can help to reduce breaches. It is especially important because modern healthcare commonly involves transmitting data, either between professionals or to the patients themselves, so the data should be encrypted during transfer to prevent hackers from gaining access. For encryption to be effective, the technology should be efficient, easy to use, and easy to add new information to. In addition, the number of people who hold the decryption key should be limited (Abouelmehdi et al., 2017).

2.2 Summarize two methods that have been discussed in the article for privacy preserving in big data.

One method mentioned in the article is de-identification. This process involves removing any information that can directly identify a patient. This can be done by removing specific identifiers that can be traced back to a patient, or by the patient verifying that enough identifiers have been deleted. These are not necessarily the best methods for protecting privacy, which has resulted in concepts like k-anonymity.
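
The identifier-removal form of de-identification described above can be sketched in a few lines of Python. This is only an illustration, not the procedure from the article: the field names, the sample record, and the helper `de_identify` are all hypothetical, and a real list of identifiers would be considerably longer.

```python
# Hypothetical set of fields treated as direct identifiers (an assumption
# made for this example; a real list would be longer).
DIRECT_IDENTIFIERS = {"name", "ssn", "phone", "email"}


def de_identify(record, identifiers=DIRECT_IDENTIFIERS):
    """Return a copy of the record without the direct-identifier fields."""
    return {field: value for field, value in record.items()
            if field not in identifiers}


# Made-up patient record used only for illustration.
patient = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "age": 47,
    "zip": "30301",
    "diagnosis": "asthma",
}

cleaned = de_identify(patient)
print(cleaned)
```

The fields left in `cleaned` (age, ZIP code) are not direct identifiers, but in combination they can still narrow a record down to one person. That remaining risk is what k-anonymity is meant to address.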
k-anonymity is defined in terms of a value k: the higher the value, the less likely re-identification becomes. The drawback is that k-anonymization can lead to information loss. This has led to an extension of k-anonymity, l-diversity, which protects datasets by “diminishing the granularity of data representation”. There are still some issues with this method, as it “depends upon the range of sensitive attributes”. If a sensitive attribute does not take enough distinct values, fictitious data has to be inserted, which can cause problems during analysis and lead to skewness. This led to t-closeness, an extension of l-diversity. It treats attribute values distinctly and can prevent attribute disclosure. The main issue with it, however, is that the risk of re-identification increases with the size and variety of the data (Abouelmehdi et al., 2017).

Another method mentioned is HybrEx, the hybrid execution model. It is designed specifically for protecting data in cloud computing. This model places data in particular types of cloud based on how private it is. For example, information that is not
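
To make the k-anonymity and l-diversity ideas summarized above more concrete, the following Python sketch measures both on a small made-up table. It assumes the simplest ("distinct") reading of l-diversity, in which l is the smallest number of different sensitive values found in any group of records sharing the same quasi-identifiers. The column names, the sample rows, and the helper functions are hypothetical and are not taken from the article.

```python
from collections import defaultdict


def group_by_quasi_identifiers(rows, quasi_identifiers):
    """Group rows that share the same values for the quasi-identifier columns."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[column] for column in quasi_identifiers)
        groups[key].append(row)
    return groups


def k_anonymity(rows, quasi_identifiers):
    """Size of the smallest group: every record hides among at least k records."""
    groups = group_by_quasi_identifiers(rows, quasi_identifiers)
    return min(len(group) for group in groups.values())


def l_diversity(rows, quasi_identifiers, sensitive):
    """Smallest number of distinct sensitive values found in any group."""
    groups = group_by_quasi_identifiers(rows, quasi_identifiers)
    return min(len({row[sensitive] for row in group})
               for group in groups.values())


# Made-up, already generalized records (age ranges, truncated ZIP codes).
rows = [
    {"age": "40-49", "zip": "303**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "303**", "diagnosis": "diabetes"},
    {"age": "40-49", "zip": "303**", "diagnosis": "asthma"},
    {"age": "50-59", "zip": "303**", "diagnosis": "flu"},
    {"age": "50-59", "zip": "303**", "diagnosis": "flu"},
]

k_value = k_anonymity(rows, ["age", "zip"])
l_value = l_diversity(rows, ["age", "zip"], "diagnosis")
print(k_value, l_value)  # prints: 2 1
```

In this toy table every record shares its age range and ZIP prefix with at least one other record, so k is 2. Both records in the 50-59 group carry the same diagnosis, however, so l is only 1: knowing that someone is in that group reveals their diagnosis even though the individual record cannot be singled out. This is the kind of weakness that motivates moving from k-anonymity to l-diversity.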
The next image shows the clustering. This image uses circles and squares to show which were clustered correctly and which were not.