Student Research Paper
Dr. Peilong Li
The term ‘Broadway’ refers to live theater performances, either plays or musicals, staged in the 41 professional theaters with 500 or more seats located in the Theater District and Lincoln Center in New York City, NY. The primary data originates from Playbill.com, which supplies detailed Broadway grosses broken down by week, theater, and individual show dating back to 1985. To supplement this, two other datasets were collected. The first consists of Tony Award wins broken down by show (close to 800 plays and musicals in total) for each Broadway season (year) dating back to 1997. The second includes opinions and opening-night/opening-week reviews of shows scraped from the New York Times website for sentiment analysis. Using all of these datasets together, it is possible to identify the key factors that produce high sales and long runs for shows on Broadway.
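As a rough illustration of how the grosses and Tony Award datasets might be combined per show, the following is a minimal sketch and not the thesis's actual pipeline. The function name `summarize_shows`, the field names (`show`, `week`, `gross`), and the sample rows are all hypothetical and made up for illustration; the real Playbill.com and Tony Award data may be structured differently.

```python
from collections import defaultdict


def summarize_shows(weekly_grosses, tony_wins):
    """Aggregate weekly gross rows per show and attach Tony win counts.

    Both input layouts are assumptions made for this sketch:
    weekly_grosses is a list of dicts with "show" and "gross" keys,
    and tony_wins maps a show title to its number of wins.
    """
    totals = defaultdict(lambda: {"total_gross": 0.0, "weeks_run": 0})
    for row in weekly_grosses:
        entry = totals[row["show"]]
        entry["total_gross"] += row["gross"]
        entry["weeks_run"] += 1  # one row per show per week
    # Shows absent from the awards data are treated as having zero wins.
    return {
        show: {**stats, "tony_wins": tony_wins.get(show, 0)}
        for show, stats in totals.items()
    }


# Hypothetical sample rows, not real box office figures.
weekly_grosses = [
    {"show": "Show A", "week": "2019-01-06", "gross": 1000000.0},
    {"show": "Show A", "week": "2019-01-13", "gross": 1200000.0},
    {"show": "Show B", "week": "2019-01-06", "gross": 400000.0},
]
tony_wins = {"Show A": 3}

summary = summarize_shows(weekly_grosses, tony_wins)
```

A per-show summary of this kind (total gross, length of run, award count) is one plausible starting point for asking which factors accompany high sales and long runs; review sentiment scores could be attached to the same records in the same way.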
Nace, Allyson K., "haMLton: Gross Box Office and Sentiment Analysis for Broadway Shows" (2021). Computer Science: Student Scholarship & Creative Works. 3.
Honors Senior Thesis; Honors in the Discipline; DS 495 Data Science Capstone