

Lab 3 detailed instructions and tips
Assigned: 6/5/25
Due: Prelab: NO PRELAB FOR THIS LAB. Jupyter Lab 3 notebook must be uploaded by 11:59PM on 6/11/25.
Potential Points: 100
Description:

Lab Part 1: Clustering on Synthetic Datasets

You will use k-means clustering to analyze a number of synthetic datasets. You will learn how to create a clustering model, how to use that model to predict which cluster new observations belong to, how to plot clusters with centroids and decision boundaries, and how to optimize your clusters to improve results. Copy the skeleton Jupyter notebook from Canvas into your Google Drive and open that notebook in Colab. Follow the procedure from Lab 2 for getting the notebook onto Drive and opening it with Colab. The notebook is: ECEN 250 _Lab3.ipynb

For this part, follow the detailed instructions in the notebook: modify code cells, add code cells, and enter information in text cells as directed.

Lab Part 2: Generating statistics for your clean blower data

In Part 2 of the Lab 3 notebook, you will load the CSV that you created at the end of Lab 2, which contains your cleaned blower data. If you did not complete Lab 2, you must complete it before starting Part 2 of Lab 3. In Part 2, you will generate statistics for features in your blower dataset, visualize the features, use scatter plots to examine multiple features together, and examine subsets of your blower dataset.

For this part, follow the detailed instructions in the notebook: modify code cells, add code cells, and enter information in text cells as directed.

NOTE: If you failed to properly clean your dataset in Lab 2, portions of Part 2 may not run. If that is the case, fix those issues, either by redoing portions of Lab 2 and regenerating the clean data CSV, or by taking code from Lab 2 and adding it at the start of Lab 3 Part 2 to fix problems with your dataset.
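As a rough illustration of the kind of work Part 2 involves, the sketch below checks a small blower table for cleaning problems, generates summary statistics, and computes a statistic on subsets. It is not the notebook's code: the column names (model, price, batteries, cfm) and the six rows are made up for illustration, and the data is built in memory so the example runs without your CSV. In the lab you would load BlowerDataClean.csv and follow the notebook's own instructions.

```python
import io

import pandas as pd

# Hypothetical stand-in for BlowerDataClean.csv. The column names and values
# are assumptions for illustration only; your dataset will differ.
csv_text = """model,price,batteries,cfm
A,99.0,0,350
B,149.0,1,410
C,159.0,1,450
D,229.0,2,600
E,249.0,2,650
F,129.0,1,400
"""
# In the lab you would read your own file instead, e.g.
# blowers = pd.read_csv("BlowerDataClean.csv")
blowers = pd.read_csv(io.StringIO(csv_text))

# Cleaning check: count missing values per column and list the columns that
# pandas parsed as numeric. A numeric feature missing from this list usually
# means stray text survived the Lab 2 cleaning.
missing_per_column = blowers.isna().sum()
numeric_cols = blowers.select_dtypes(include="number").columns.tolist()

# Summary statistics (count, mean, std, min, quartiles, max) per numeric feature.
summary = blowers[numeric_cols].describe()

# How many blowers fall into each battery-count subset.
battery_counts = blowers["batteries"].value_counts().sort_index()

# Mean price within each battery-count subset.
mean_price_by_batteries = blowers.groupby("batteries")["price"].mean()

print(missing_per_column)
print(summary)
print(battery_counts)
print(mean_price_by_batteries)
```

Looking at the subset counts before the subset means is a deliberate choice here: a mean computed over one or two rows will run without error but says very little about that group.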
NOTE: Your data likely differs significantly from the data used to create the skeleton notebook. Watch for cases where plot settings need to be adjusted because your data has different ranges than the dataset used to create the skeleton.

NOTE: If your blower data does not have a variety of values for every feature, then some of the statistics will not be very meaningful. For instance, if your blower data only includes blowers with 1 battery included in the price (i.e., you have no entries with zero batteries, 2 batteries, and so on), then a statistic such as the mean cost of the subset of blowers with 2 batteries will not be meaningful. Computing and presenting meaningless statistics will allow you to complete this lab, but it will make Lab 4 exceptionally difficult, and your results and grade will suffer. It is better to examine your dataset during Lab 3 and, if necessary, add additional items to your list of blowers. Recall our lecture discussions on sampling plans: samples that do not reflect the variety of values for the features we are recording will cause model issues. You may end up adding 5 to 10 additional blowers to improve the quality of your dataset. Doing that in Lab 3 is better than waiting until Lab 4!

To add additional blowers to your dataset, you can either go back to your source CSV for Lab 2, add the new items to the 50 you had, and rerun the cells in the Lab 2 notebook to create a new BlowerDataClean.csv file. [You may need to rename your old clean CSV first