











DSO 459: Business Analytics with Python is offered at USC Marshall in Spring 2022. The course aims to develop essential skills for working in a business analytics division within a company, including programming in Python, data analysis, building predictive models, and presenting results to business stakeholders. The course includes 11 deliverables: homework assignments, case studies, a midterm, a group presentation, and a take-home case study. The midterm exam is open notes and open book and has both written and computer portions.
USC Marshall, Data Science and Operations
Spring 2022 - 4.0 Units
Instructor: Austin Pollok
Time: T/Th 4-5:50pm
E-mail: pollok@usc.edu
Room: ACC 236
Course Description:
In today’s business environment, data is ubiquitous. Many companies thrive or fail on the quality of their data analysts, software developers, and technology or analytics managers. Although it is a tall order, people in these positions must be able to interface with one another to derive insights, build products, and implement data-driven business decisions. Companies such as Amazon, Walmart, Google, American Express, and Disney perform analytics in operations, marketing, product development and management, and strategic planning. This course takes a hands-on approach to developing the essential skills for working in a business analytics division within a company, regardless of the precise position. Such skills include programming, often in Python, answering business questions through data analysis, building predictive models, and presenting results to business stakeholders. The course is intended for students with little to no prior programming experience and prepares them to go through an end-to-end analytics life cycle in the context of various business fields. Such a life cycle starts with formulating business objectives and hypotheses, then moves on to exploring data quantitatively and visually, building models, interpreting the results, and finally communicating actionable recommendations that add value to business stakeholders. This will be accomplished through student participation in end-to-end business analytics and machine learning case studies and projects in Python. Because of the wide variety of skills required to succeed within an analytics division, the course can be challenging.
Course Communication Policy:
Please communicate! I would like to unblock students as early as possible, meaning I want to clear up any questions, confusion, or concerns. We can meet in office hours, over Zoom, on Slack, or in person, or grab a coffee for a quick chat. I am very interested in having discussions with all students, whether about course material, extensions of course material, programming, or general career questions in the field of analytics. I would also appreciate any feedback on the course, as it is in a state of continual evolution. Please email me to set up a time for discussion. I will typically respond to all emails within two business days, and often sooner. Any content-specific questions should also be posted on the course Slack channel, where I will respond so that other students having similar trouble can see the answer.
Course Grading Policy:
The homework weight is set on a sliding scale. The reason is that the early homework is essential for completing the later assignments; the material is also assumed to be brand new to students, so mistakes are expected and are an essential part of learning. The midterm is a checkpoint to make sure students are familiar with basic Python and data wrangling in Pandas. The presentation is an opportunity for students to explore a field of interest for a future career in business analytics. The final is meant to simulate a take-home interview screen commonly given to applicants in business analytics or data science.
Assignment                                  Weight
Basic Python HW 1 & 2                       2.5% each
Data Wrangling HW 3 & 4                     2.5% each
Midterm                                     15%
Analytics Case Studies HW 5, 6, 7, & 8      7.5% each
Group Presentation                          10%
Final (Take-Home Case Study)                35%
Student Deliverables:
Students are expected to produce 11 deliverables by the end of the course.
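As a quick sanity check on the grading table, the short sketch below tallies the listed weights and counts the deliverables. The dictionary keys and the `course_grade` helper are illustrative names, not part of the official course materials; the weights themselves are taken from the table above.

```python
# Grading weights from the table above, in percent of the final grade.
weights = {
    "HW 1": 2.5, "HW 2": 2.5,                              # Basic Python
    "HW 3": 2.5, "HW 4": 2.5,                              # Data Wrangling
    "Midterm": 15.0,
    "HW 5": 7.5, "HW 6": 7.5, "HW 7": 7.5, "HW 8": 7.5,    # Analytics Case Studies
    "Group Presentation": 10.0,
    "Final (Take-Home Case Study)": 35.0,
}


def course_grade(scores, weights):
    """Weighted average of per-deliverable scores, each given in percent."""
    return sum(weights[name] * scores[name] for name in weights) / 100.0


total_weight = sum(weights.values())   # should cover the whole grade
n_deliverables = len(weights)          # should match the 11 deliverables

# Hypothetical student who scores 90% on every deliverable.
example_scores = {name: 90.0 for name in weights}
example_grade = course_grade(example_scores, weights)

print(total_weight, n_deliverables, example_grade)
```

The homework assignments together account for 40% of the grade, with the four case-study assignments carrying three times the weight of each early assignment.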
Assignment Descriptions and Criteria:
post anything related to the assignments online. Failure to abide by the above guidelines may constitute a case of suspected plagiarism or cheating, which will be reported and investigated by USC. Please see the “Academic Integrity and Conduct” section below for further details.
Semester Outline:
Tentative Course Outline: WORK IN PROGRESS
This outline is tentative and subject to change, including the case studies. It will be updated as the course progresses.
Interesting links: There are a number of great websites and podcasts that provide additional interesting information. See, for instance, MIT Machine Learning News, MIT Big Data News, Hitchhiker’s Guide to Python, Towards Data Science, FiveThirtyEight, and Real Python.
First Module: Intro to Python
Week 1: Intro to Python - Twitter JSON data I:
This week will introduce the course structure and give an introduction to data science, analytics, and Python.
Tuesday Lecture 1/ ∗ Syllabus.ipynb ∗ Intro to Data Science.ipynb Thursday Lab 1/ ∗ Intro to Python.ipynb ∗ Twitter JSON example
Week 2: Intro to Python - Twitter JSON data II:
We continue learning the basics of Python in the context of parsing Twitter data in JSON format.
Tuesday Lecture 1/ ∗ Thursday Lab 1/ ∗
CheatSheet Python Cheat Sheet
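To give a flavor of what parsing JSON data in basic Python can look like, here is a minimal sketch using only the standard library. The records and field names (`user`, `screen_name`, `hashtags`) are made up for illustration and are not the actual course dataset or the exact Twitter API schema.

```python
import json

# A made-up list of tweet-like records; field names are illustrative assumptions.
raw = """
[
  {"id": 1, "user": {"screen_name": "alice"}, "text": "Learning #python", "hashtags": ["python"]},
  {"id": 2, "user": {"screen_name": "bob"}, "text": "Hello world", "hashtags": []},
  {"id": 3, "user": {"screen_name": "alice"}, "text": "#pandas and #python", "hashtags": ["pandas", "python"]}
]
"""

# json.loads turns the JSON text into nested Python lists and dictionaries.
tweets = json.loads(raw)

# Count how often each hashtag appears, using a plain dictionary.
hashtag_counts = {}
for tweet in tweets:
    for tag in tweet["hashtags"]:
        hashtag_counts[tag] = hashtag_counts.get(tag, 0) + 1

# Collect the distinct authors by reaching into the nested "user" dictionary.
authors = sorted({tweet["user"]["screen_name"] for tweet in tweets})

print(hashtag_counts)
print(authors)
```

The example exercises the core building blocks of the first module: strings, lists, dictionaries, loops, and nested indexing.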
Second Module: Intro to Pandas
Week 3: Pandas - Uniswap Case Study I:
This week we begin learning how to wrangle data with Pandas, Python’s data analysis package. We will apply our knowledge in the context of a blockchain-based decentralized exchange for pools of cryptocurrencies.
Tuesday Lecture 1/ ∗ Thursday Lab 1/ ∗ CheatSheet Pandas Cheat Sheet I
Week 4: Pandas and EDA - Uniswap Case Study II
We’ll continue learning different aspects of Pandas, aggregation and grouping, and practice conducting exploratory data analysis to gain insights into the Uniswap pool facility data.
Tuesday Lecture 2/ ∗ Thursday Lab 2/ ∗ CheatSheet Pandas Cheat Sheet II
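The grouping and aggregation ideas for this week can be sketched as follows. The pool names, dates, and volumes are invented for illustration and do not come from the course's Uniswap data; only the Pandas calls (`groupby`, `agg`, `sum`) are the point.

```python
import pandas as pd

# Invented swap records; column names and values are illustrative assumptions.
swaps = pd.DataFrame({
    "pool": ["ETH/USDC", "ETH/USDC", "WBTC/ETH", "WBTC/ETH", "ETH/USDC"],
    "day": ["2022-02-01", "2022-02-02", "2022-02-01", "2022-02-02", "2022-02-02"],
    "volume_usd": [100.0, 250.0, 80.0, 120.0, 50.0],
})

# Split the rows by pool, then summarize each group with several statistics.
per_pool = swaps.groupby("pool")["volume_usd"].agg(["count", "sum", "mean"])

# Group by two keys to get total daily volume per pool as a flat table.
daily = swaps.groupby(["pool", "day"], as_index=False)["volume_usd"].sum()

print(per_pool)
print(daily)
```

Summaries like `per_pool` are a typical first step in exploratory data analysis, since they show at a glance which groups dominate the data.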
Week 5: Pandas and EDA - Uniswap Case Study III
We will continue with Pandas, merging and concatenating data frames, and collect our insights from our exploratory data analysis with Uniswap data.
Tuesday Lecture 2/ ∗ Thursday Lab 2/
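A minimal sketch of concatenating and merging data frames is shown below. The tables, the `pool_id` key, and all values are hypothetical and chosen only to show how `pd.concat` stacks rows and how `merge` joins on a shared column.

```python
import pandas as pd

# Two hypothetical monthly extracts with the same columns.
jan = pd.DataFrame({"pool_id": [1, 2], "volume_usd": [100.0, 80.0]})
feb = pd.DataFrame({"pool_id": [1, 3], "volume_usd": [250.0, 40.0]})

# A hypothetical lookup table describing each pool.
pools = pd.DataFrame({"pool_id": [1, 2], "pair": ["ETH/USDC", "WBTC/ETH"]})

# Concatenate: stack the monthly tables on top of each other.
volumes = pd.concat([jan, feb], ignore_index=True)

# Merge: attach the pair name to every row; a left join keeps unmatched rows.
labelled = volumes.merge(pools, on="pool_id", how="left")

# Rows whose pool_id has no entry in the lookup table end up with a missing pair.
unmatched = labelled[labelled["pair"].isna()]

print(labelled)
print(unmatched)
```

Using `how="left"` rather than the default inner join makes missing lookup entries visible instead of silently dropping rows, which is often what you want during exploratory analysis.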
Third Module: Midterm on Intro to Python and Pandas
Week 6: Review and Midterm on Python and Pandas
We will reserve one session for basic Python and data wrangling with Pandas review. There will be an in-class midterm as well.
Tuesday Lecture 2/ ∗ Review.ipynb Thursday Lab 2/ ∗ Midterm.ipynb
CheatSheet Midterm Review
Fourth Module: Predictive Analytics Case Studies
Week 7: Intro to Scikit-Learn’s Estimator API - Zillow California Housing Data
Introduction to the Scikit-Learn Estimator API and the general scaffolding of a machine learning problem in the context of a Zillow housing price forecasting application.
Tuesday Lecture 2/ ∗ Thursday Lab 2/ ∗ CheatSheet
Scikit-Learn Cheat Sheet I
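The general scaffolding of a Scikit-Learn problem (split the data, construct an estimator, `fit`, `predict`, `score`) can be sketched as below. The data here is synthetic, generated with NumPy to loosely resemble price versus square footage; it is not the Zillow dataset, and linear regression is just one possible choice of estimator.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration): price grows with square footage.
rng = np.random.default_rng(0)
X = rng.uniform(500, 3500, size=(200, 1))                        # one feature: square feet
y = 50_000 + 200 * X[:, 0] + rng.normal(0, 10_000, size=200)     # noisy price

# Hold out a test set so the model is evaluated on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Every Scikit-Learn estimator follows the same pattern: construct, fit, predict.
model = LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)

# For regressors, score() returns the R^2 on the given data.
r2 = model.score(X_test, y_test)
print(model.coef_, model.intercept_, r2)
```

Because all estimators share this interface, swapping `LinearRegression` for another regressor usually requires changing only the line that constructs the model.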
Week 8: Previous Week Continued
We will continue investigating Zillow housing price data and translating a business analytics application into running Python code.
Tuesday Lecture 3/ ∗ Thursday Lab 3/ ∗
CheatSheet Scikit-Learn Cheat Sheet II Scikit-Learn Cheat Sheet III
Week 9: Walmart Store Sales Forecasting
We will cover the analytics tools necessary to answer this week’s business application.
Tuesday Lecture 3/ ∗ Thursday Lab 3/
Fifth Module: Presentations, Review, Take-Home Final Case Study
Week 15: Presentations
We will have each group present their findings to the class, as well as share the mock interview on the course drive.
Tuesday Lecture 4/ ∗ Presentations I Thursday Lab 4/ ∗ Presentations II
Week 16: Course Review and Intro to GitHub
To wrap up the course, we will spend time reflecting on our case studies and the techniques we have studied to answer analytics questions, which are posed to add value for a company. Additionally, we will give a brief introduction to GitHub and Git to demonstrate how you might showcase your skills to potential employers.
Tuesday Lecture 4/ ∗ Thursday Lab 4/ ∗
Week 17: Take-Home Case Study (Final Exam)
Students will complete a case study at home and submit by the required deadline. This is an opportunity to apply the techniques learned throughout the course, as well as techniques you have learned on your own, and take ownership of a project you could potentially put on your GitHub and show off to employers.
University Scheduled Final Exam Day 5/5 - 4:30 to 6:30pm
∗ We will use this period to discuss the final take-home exam and how it might be used in future interviews.
Academic Conduct:
Plagiarism – presenting someone else’s ideas as your own, either verbatim or recast in your own words – is a serious academic offense with serious consequences. Please familiarize yourself with the discussion of plagiarism in SCampus in Part B, Section 11, “Behavior Violating University Standards”. Other forms of academic dishonesty are equally unacceptable. See additional information in SCampus and university policies on Research and Scholarship Misconduct.
Students and Disability Accommodations:
USC welcomes students with disabilities into all of the University’s educational programs. The Office of Student Accessibility Services (OSAS) is responsible for the determination of appropriate accommodations for students who encounter disability-related barriers. Once a student has completed the OSAS process (registration, initial appointment, and submitted documentation) and accommodations are determined to be reasonable and appropriate, a Letter of Accommodation (LOA) will be available to generate for each course. The LOA must be given to each course instructor by the student and followed up with a discussion. This should be done as early in the semester as possible as accommodations are not retroactive. More information can be found at osas.usc.edu. You may contact OSAS at (213) 740-0776 or via email at osasfrontdesk@usc.edu.
Support Systems:
Counseling and Mental Health - (213) 740-9355 – 24/7 on call
studenthealth.usc.edu/counseling
Free and confidential mental health treatment for students, including short-term psychotherapy, group counseling, stress fitness workshops, and crisis intervention.
National Suicide Prevention Lifeline - 1 (800) 273-8255 – 24/7 on call
suicidepreventionlifeline.org
Free and confidential emotional support to people in suicidal crisis or emotional distress 24 hours a day, 7 days a week.
Relationship and Sexual Violence Prevention Services (RSVP) - (213) 740-9355(WELL), press “0” after hours – 24/7 on call
studenthealth.usc.edu/sexual-assault
Free and confidential therapy services, workshops, and training for situations related to gender-based harm.
Office for Equity, Equal Opportunity, and Title IX (EEO-TIX) - (213) 740-
eeotix.usc.edu
Information about how to get help or help someone affected by harassment or discrimination, rights of protected classes, reporting options, and additional resources for students, faculty, staff, visitors, and applicants.