Project Overview: README.md
Data Gathering: consolidate_data.ipynb
Analysis: EDA.working_data.ipynb
Tableau Dashboard:
Letterboxd Data Analysis Project Documentation
This is my Letterboxd Data Analysis Project! This project explores my detailed movie-watching habits using data extracted from my Letterboxd diary and enriched with additional metadata from the TMDb API. The analysis delves into various aspects such as genre popularity, director and actor insights, and correlations between movie ratings and their characteristics.
Project Overview
Objectives
Tools Used
Images consist of screenshots of Python output or ad-hoc matplotlib graphs, and the graphs I made for my Tableau dashboard.
Exploratory Data Analysis
To start my data exploration, let’s get a zoomed out overview of the data we’re working with by getting a general summary of its scope. (If you’re interested in seeing the code, view it here!)
Within the 1094 days between my first and last entry, I watched 326 movies. That seems high! I was watching movies on 2.38% of those days. On 61 of those days I watched more than two movies, which might have helped push the count up, considering I otherwise watch movies only sparingly.
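Here is a minimal sketch of how a summary like this could be computed. It is plain Python rather than the notebook's actual code, and it assumes the diary has been reduced to one watch date per logged movie. The function name `diary_summary`, the dictionary keys, and the sample dates are all made up for illustration.

```python
from collections import Counter
from datetime import date


def diary_summary(watch_dates):
    """Summarize a list of watch dates (one date per logged movie)."""
    per_day = Counter(watch_dates)            # movies logged on each day
    first, last = min(per_day), max(per_day)
    span_days = (last - first).days           # days between first and last entry
    watch_days = len(per_day)                 # distinct days with at least one movie
    return {
        "span_days": span_days,
        "movies": len(watch_dates),
        "watch_days": watch_days,
        # Share of calendar days (first and last day included) with a movie.
        "pct_days_watched": round(100 * watch_days / (span_days + 1), 2),
        # Number of days with more than two movies logged.
        "days_over_two": sum(1 for n in per_day.values() if n > 2),
    }


# Hypothetical sample, not the real diary.
sample = [date(2024, 1, 1)] * 3 + [date(2024, 1, 3), date(2024, 1, 10)]
summary = diary_summary(sample)
print(summary)
```

The share of days depends on the denominator: this sketch divides distinct watch days by calendar days, which is only one reasonable choice.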
As expected, I've watched more movies that were released in the last 20 years. Let's now figure out what years within the timeframe I've watched the most movies. I remember 2023 being a movie-heavy year -- I made a conscious effort to watch more movies that year and hit 100 movies. Let's see if that's true.
Because of the goal I set for myself in 2023, there was a considerable spike in movies watched that year. 2024 is also shaping up to be movie-heavy, even though the data ends in May 2024. The first quarter is skewed upward, since 2024 contributes entries only from January to May. Let's now see my ratings distribution. I'm curious to see if I'm a harsh critic or not.
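To make the partial-year caveat concrete, here is a small sketch that counts watches per (year, quarter). It is plain Python with a hypothetical helper name, `watches_per_quarter`, and made-up dates. A year that stops in May only fills its first quarter and part of its second.

```python
from collections import Counter
from datetime import date


def watches_per_quarter(watch_dates):
    """Count logged movies per (year, quarter), sorted chronologically."""
    counts = Counter((d.year, (d.month - 1) // 3 + 1) for d in watch_dates)
    return dict(sorted(counts.items()))


# Hypothetical dates, not the real diary.
dates = [date(2023, 1, 5), date(2023, 2, 1), date(2023, 12, 31), date(2024, 5, 1)]
per_quarter = watches_per_quarter(dates)
print(per_quarter)
```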
I feel like I need to watch more "bad" movies. I'm not sure if I'm a harsh critic or if I just have good taste. Newer movies do not necessarily receive higher ratings than older ones.
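One way to check the "newer is not necessarily better" observation is a Pearson correlation between release year and rating. The sketch below is a generic textbook implementation in plain Python, not the code behind the charts. The function name `pearson` and the sample values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical release years and ratings, not the real data.
years = [2000, 2010, 2020]
rising = pearson(years, [3.0, 3.5, 4.0])   # ratings climb with release year
flat = pearson(years, [3.0, 4.0, 3.0])     # no linear trend
print(rising, flat)
```

A value near 0 would support the observation. A value near 1 would mean newer releases tend to get higher ratings.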
Pretty high rewatch count! Let's see if I rate movies differently on rewatch.
I tend to rate movies higher on rewatch. I think this is because I know what to expect and can appreciate the movie more. I also tend to rewatch movies I like, so that could be a factor as well. Except for the movie Imaginary (which is an error in the data), I've typically only rewatched movies I've rated more than 3 stars. I think I should rewatch some of the movies I've rated lower to see if my opinion changes, like Downsizing, which went from 2.0 to 2.5. Train to Busan lost a star on rewatch, and I wonder why. I remember liking it a lot the first time I watched it.
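A rewatch comparison like this can be sketched by grouping diary entries by title and comparing the first rating with the latest one. This is a plain-Python illustration under assumptions: entries are `(title, watched_date, rating)` tuples, the helper name `rewatch_deltas` is hypothetical, and the dates and "Film X" are invented. Only the Downsizing ratings (2.0, then 2.5) come from the text above.

```python
from datetime import date


def rewatch_deltas(entries):
    """Map each title logged more than once to (first, latest, change)."""
    ratings_by_title = {}
    # Sort by watch date so each title's ratings are in viewing order.
    for title, _watched, rating in sorted(entries, key=lambda e: e[1]):
        ratings_by_title.setdefault(title, []).append(rating)
    return {
        title: (ratings[0], ratings[-1], ratings[-1] - ratings[0])
        for title, ratings in ratings_by_title.items()
        if len(ratings) > 1   # keep only titles that were rewatched
    }


# Dates are hypothetical; "Film X" is a placeholder title.
entries = [
    ("Downsizing", date(2023, 1, 1), 2.5),
    ("Downsizing", date(2022, 1, 1), 2.0),
    ("Film X", date(2022, 5, 1), 4.0),
]
deltas = rewatch_deltas(entries)
print(deltas)
```

A positive change means the rating went up on rewatch. Titles logged only once are dropped.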
Let's now see what days I watched movies the most. I'm curious to see if I have a movie-watching pattern.
Seems like I had an animation-heavy day on 2021-07-03! I also remember the second day being a particularly lazy one, when I wanted to rewatch movies rather than delve into new ones. As I recall, I had set these days aside as movie marathon days.
Now that we've prepared our director data, let's see who my favorite directors are.
My top directors tend to be style-heavy: people with a distinct style that I particularly enjoy. I'm surprised Edgar Wright isn't near the top. I guess I haven't watched a lot of his movies in this timeframe; I enjoy his direction and typically think of him when I think of comedic editing. Let's see the data for actors now.
Heavy-hitters for sure. Mostly A-listers, which is expected. I finally put a name to the face for some of the actors I've seen in multiple movies (David Dastmalchian!). I'm not surprised to see a lot of Marvel actors in the list, since I've watched a lot of Marvel movies in this timeframe. Willem Dafoe is a surprise; I didn't realize I'd watched so many movies with him in them. Let's see the data for genres now.
I've watched Dramas and Comedies the most! As expected, Horror is pretty far down since I'm not the biggest fan. I really need to watch more Documentaries and Westerns.
As expected, most movies I watched originate from the US. I should watch more PH movies to support local cinema.
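Country shares like these can be computed by counting how many movies list each production country. The sketch below assumes each movie carries a list of country codes (a film can have more than one) and measures each share against the number of movies. The helper name `country_share` and the sample lists are hypothetical.

```python
from collections import Counter


def country_share(countries_per_movie):
    """Percentage of movies that list each production country."""
    total_movies = len(countries_per_movie)
    counts = Counter(
        country
        for countries in countries_per_movie
        for country in set(countries)   # count a country once per movie
    )
    return {c: round(100 * n / total_movies, 2) for c, n in counts.most_common()}


# Hypothetical sample, not the real watch history.
shares = country_share([["US"], ["US", "GB"], ["PH"], ["US"]])
print(shares)
```

Because co-productions count toward every listed country, the shares can add up to more than 100%.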
Personal High Ratings
Let's explore the movies I've rated 4.5 and above. I'm curious to see if there are any patterns or if I can find any interesting insights.
Despite its low count, the Music genre has a large share of highly rated movies relative to the number of Music movies I've watched in total. The same goes for War, History, and Mystery. Maybe these are genres I should watch more of. Let's see PHR data for directors.
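Comparing high-rated counts with total counts per genre is a simple ratio. The sketch below shows one way to compute it, assuming each movie is a `(rating, genres)` pair and using the 4.5 threshold from this section. The helper name `high_rating_share` and the sample movies are hypothetical.

```python
from collections import Counter


def high_rating_share(movies, threshold=4.5):
    """Per genre: (high-rated count, total count, high-rated share)."""
    total = Counter()
    high = Counter()
    for rating, genres in movies:
        for genre in set(genres):       # a movie counts once per genre
            total[genre] += 1
            if rating >= threshold:
                high[genre] += 1
    return {g: (high[g], total[g], high[g] / total[g]) for g in total}


# Hypothetical sample, not the real ratings.
movies = [
    (5.0, ["Music", "Drama"]),
    (3.0, ["Drama"]),
    (4.5, ["Music"]),
    (2.0, ["Drama"]),
]
genre_share = high_rating_share(movies)
print(genre_share)
```

With small totals the share is noisy, so a genre with two movies and two high ratings should be read more cautiously than one with dozens of entries.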
Richard Linklater feels correct. I loved the Before Trilogy. Quentin Tarantino was at the top for director_counts, but only one of his movies is rated 4.5 or above. I guess I don't rate his movies super high. Let's see the same data for actors.
Despite Samuel L. Jackson being in a lot of movies, he isn't rated very highly: only 3 of the 13 movies I've seen him in are highly rated. I think this is because he's in a lot of movies, so some of them are bound to be bad. I wasn't familiar with Linda Cardellini, Kyle Bornheimer, and Haruka Abe, but I rated the three movies I saw them in highly. I should seek out more of their movies.
Concluding Insights:
1. Targeted Genre Exploration:
- Discovery: While Drama, Comedy, Action, Science Fiction, Adventure, Romance, Thriller, and Crime dominate my viewing habits, I've developed a higher appreciation for less-watched genres like Music, War, History, and Mystery.
- Action: Intentionally seek out and watch more films from these underrepresented genres to diversify my cinematic experience and potentially discover new favorites.
2. Rewatch Strategy:
- Discovery: My ratings generally increase on rewatch, particularly for films initially rated above 3 stars, although some, like “Train to Busan”, were rated lower the second time. This insight is complemented by the finding that movies with initially high ratings are more likely to be rewatched and often maintain or improve their ratings.
- Action: Implement a strategy to rewatch films I rated lower than 3 stars to see if my perceptions change over time, providing deeper insights into my rating patterns and preferences.
3. Director and Actor Emphasis:
- Discovery: Quentin Tarantino and Denis Villeneuve are the most watched directors. Samuel L. Jackson, Brad Pitt, J.K. Simmons, Chris Pratt, Scarlett Johansson, and Willem Dafoe are the most watched actors.
- Action: Explore more films by directors with a distinct style, especially those I enjoy but haven’t watched much. Also, consider watching more films featuring under-appreciated actors who have positively surprised me.
4. Support for Local Cinema:
- Discovery: US-origin films predominate in my watch history, making up 85.58% of the entries, while PH-origin movies account for only about 3.68%.
- Action: Actively seek out films produced in the Philippines to balance my viewing habits and support local industry growth.
5. High-Rating Patterns:
- Discovery: High ratings are often given to genres like Music, War, History, Mystery, Family, and Drama; directors like Christopher Nolan, Richard Linklater, Richard Curtis, Luca Guadagnino, and Denis Villeneuve have at least two watched movies that were highly rated; actors like Michael Stuhlbarg, Ethan Hawke, Jake Gyllenhaal, Dave Bautista, Linda Cardellini, Kyle Bornheimer, Haruka Abe, Tom Stourton, Julie Delpy, Angela Bassett, and Domhnall Gleeson are highly rated.
- Action: Dive into movies with these specific attributes to see if my interest aligns with these high ratings. If not, I could normalize my data to provide a deeper understanding of my preferences.