Project Overview: README.md
Data Gathering: consolidate_data.ipynb
Analysis: EDA.working_data.ipynb
Tableau Dashboard:
Letterboxd Data Analysis Project Documentation
This is my Letterboxd Data Analysis Project! This project explores my detailed movie-watching habits using data extracted from my Letterboxd diary and enriched with additional metadata from the TMDb API. The analysis delves into various aspects such as genre popularity, director and actor insights, and correlations between movie ratings and their characteristics.
Project Overview
Objectives
Tools Used
Images consist of screenshots of Python output or ad-hoc matplotlib graphs, and the graphs I made for my Tableau dashboard.
Exploratory Data Analysis
To start my data exploration, let’s get a zoomed-out overview of the data we’re working with by summarizing its overall scope. (If you’re interested in seeing the code, view it here!)
Within the 1094 days between my first and last entry, I watched 326 movies. That seems high! I watched movies on 2.38% of those days. On 61 of those days I watched more than two movies, which might have pushed the count up, considering I usually watch movies only sparingly.
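Here's a minimal sketch of how scope figures like the ones above could be computed with pandas. The sample rows and the column names `Name` and `Watched Date` are assumptions for illustration, not necessarily what the notebook uses.

```python
import pandas as pd

# Hypothetical sample standing in for the diary data; the column names
# "Name" and "Watched Date" are assumptions, not the notebook's actual schema.
diary = pd.DataFrame(
    {
        "Name": ["Movie A", "Movie B", "Movie C", "Movie D", "Movie E"],
        "Watched Date": [
            "2021-07-03",
            "2021-07-03",
            "2021-07-03",
            "2021-07-10",
            "2021-07-12",
        ],
    }
)
diary["Watched Date"] = pd.to_datetime(diary["Watched Date"])


def summarize_diary(df: pd.DataFrame, date_col: str = "Watched Date") -> dict:
    """Summarize the scope of a diary: time span, movie count, active days."""
    per_day = df.groupby(date_col).size()  # movies logged on each active day
    span_days = (df[date_col].max() - df[date_col].min()).days
    return {
        "span_days": span_days,
        "movies": len(df),
        "active_days": len(per_day),
        "pct_active_days": round(100 * len(per_day) / span_days, 2),
        "days_over_two_movies": int((per_day > 2).sum()),
    }


summary = summarize_diary(diary)
print(summary)
```

Counting entries per day first makes the later figures (active days, multi-movie days) simple filters on the same series.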
As expected, I've watched more movies that were released in the last 20 years. Let's now figure out in which years within the timeframe I watched the most movies. I remember 2023 being a movie-heavy year: I made a conscious effort to watch more movies that year and hit 100 movies. Let's see if that's true.
Because of the goal I set for myself in 2023, there was a considerable spike in movies watched that year. 2024 is also shaping up to be movie-heavy, considering the data ends in May 2024. The first quarter is skewed because the 2024 data only covers entries from January to May. Let's now see my ratings distribution. I'm curious to see if I'm a harsh critic or not.
I feel like I need to watch more "bad" movies. I'm not sure if I'm a harsh critic or if I just have good taste. Newer movies do not necessarily receive higher ratings than older ones.
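As a rough sketch of how a ratings distribution and the "newer vs. older" comparison could be checked, assuming a table with a release-year column `Year` and a `Rating` column (both names and all values here are hypothetical):

```python
import pandas as pd

# Hypothetical sample; "Year" (release year) and "Rating" are assumed column names.
ratings = pd.DataFrame(
    {
        "Year": [2000, 2005, 2010, 2015, 2020],
        "Rating": [3.0, 4.0, 3.5, 4.0, 3.0],
    }
)

# How many movies received each star rating, ordered from lowest to highest.
rating_counts = ratings["Rating"].value_counts().sort_index()

# Pearson correlation between release year and rating; a value near 0 suggests
# no linear relationship between how new a movie is and how it was rated.
year_rating_corr = ratings["Year"].corr(ratings["Rating"])

print(rating_counts)
print(f"year vs. rating correlation: {year_rating_corr:.2f}")
```

A correlation coefficient only captures linear trends, so it complements a scatter plot or histogram and doesn't replace one.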
Pretty high rewatch count! Let's see if I rate movies differently on rewatch.
I tend to rate movies higher on rewatch. I think this is because I know what to expect and can appreciate the movie more. I also tend to rewatch movies I like, so that could be a factor as well. Except for the movie Imaginary (which is an error in the data), I've typically only rewatched movies I've rated more than 3 stars. I think I should rewatch some of the movies I've rated lower to see if my opinion changes, like Downsizing, which went from 2.0 to 2.5. Train to Busan lost a star on rewatch, and I wonder why. I remember liking it a lot the first time I watched it.
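One way to compare first-watch and rewatch ratings is to line up each movie's earliest and latest logged rating. This is only a sketch: the titles, ratings, and column names (`Name`, `Watched Date`, `Rating`) below are made up for illustration.

```python
import pandas as pd

# Hypothetical diary entries; titles, dates, and ratings are illustrative only.
entries = pd.DataFrame(
    {
        "Name": ["Film X", "Film X", "Film Y", "Film Y", "Film Z"],
        "Watched Date": pd.to_datetime(
            ["2021-01-01", "2022-01-01", "2021-03-01", "2023-03-01", "2021-05-01"]
        ),
        "Rating": [2.0, 2.5, 4.5, 3.5, 4.0],
    }
)


def rating_change_on_rewatch(df: pd.DataFrame) -> pd.DataFrame:
    """Compare the first and latest rating of every movie logged more than once."""
    grouped = df.sort_values("Watched Date").groupby("Name")["Rating"]
    changes = pd.DataFrame(
        {
            "first_rating": grouped.first(),
            "latest_rating": grouped.last(),
            "times_logged": grouped.size(),
        }
    )
    # Keep only movies with at least one rewatch.
    changes = changes[changes["times_logged"] > 1].copy()
    changes["change"] = changes["latest_rating"] - changes["first_rating"]
    return changes.sort_values("change", ascending=False)


changes = rating_change_on_rewatch(entries)
print(changes)
```

A positive `change` means the movie was rated higher on rewatch. A negative one flags cases worth a second look, as do data errors such as duplicate entries.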
Let's now see on which days I watched the most movies. I'm curious to see if I have a movie-watching pattern.
Seems like I had an animation-heavy day on 2021-07-03! I also remember the second day being a particularly lazy one, when I wanted to rewatch movies rather than delve into new ones. As I recall, I had set these days aside as movie marathon days.
Now that we've prepared our director data, let's see who my favorite directors are.
My top directors tend to be style-heavy directors: people with a distinct style that I particularly enjoy. I'm surprised Edgar Wright isn't near the top. I guess I haven't watched a lot of his movies in this timeframe; I enjoy his direction and typically think of him when I think of comedic editing. Let's see the data for actors now.
Heavy-hitters for sure. Mostly A-listers, which is expected. I finally put a name to the face for some of the actors I've seen in multiple movies (David Dastmalchian!). I'm not surprised to see a lot of Marvel actors in the list, since I've watched a lot of Marvel movies in this timeframe. Willem Dafoe is a surprise; I didn't realize I'd watched so many movies with him in them. Let's see the data for genres now.
Dramas and Comedies are what I've watched the most! As expected, Horror is pretty far down since I'm not the biggest fan. I really need to watch more Documentaries and Westerns.
As expected, most movies I watched originate from the US. I should watch more movies from the Philippines (PH) to support local cinema.
Personal High Ratings
Let's explore the movies I've rated 4.5 and above. I'm curious to see if there are any patterns or if I can find any interesting insights.
Despite its low count, the Music genre makes up a larger share of my high ratings than its total genre count would suggest. The same goes for War, History, and Mystery. Maybe these are genres I should watch more of. Let's see the Personal High Ratings (PHR) data for directors.
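To make the "high ratings relative to total count" comparison concrete, here's a small sketch that computes, per genre, the share of movies rated 4.5 and above. It assumes each movie carries a list of genres, as TMDb-style metadata often does; the data and the column names `title`, `rating`, and `genres` are hypothetical.

```python
import pandas as pd

# Hypothetical movies; each row has a list of genres.
movies = pd.DataFrame(
    {
        "title": ["A", "B", "C", "D", "E"],
        "rating": [5.0, 4.5, 3.0, 4.0, 2.5],
        "genres": [
            ["Drama", "Music"],
            ["Drama"],
            ["Comedy"],
            ["Drama", "Comedy"],
            ["Comedy"],
        ],
    }
)


def high_rating_share(df: pd.DataFrame, threshold: float = 4.5) -> pd.DataFrame:
    """Per genre: total movies, movies rated >= threshold, and their ratio."""
    exploded = df.explode("genres")  # one row per (movie, genre) pair
    exploded["is_high"] = (exploded["rating"] >= threshold).astype(int)
    stats = exploded.groupby("genres")["is_high"].agg(total="count", high="sum")
    stats["high_share"] = stats["high"] / stats["total"]
    return stats.sort_values("high_share", ascending=False)


genre_stats = high_rating_share(movies)
print(genre_stats)
```

Genres with very few movies can reach a high share from a single good film, so it helps to read `high_share` alongside `total`.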
Richard Linklater feels correct. I loved the Before Trilogy. Quentin Tarantino was at the top of director_counts, but he only has 1 movie in the 4.5+ group. I guess I don't rate his movies super high. Let's see the same data for actors.
Despite Samuel L. Jackson being in a lot of the movies I watched, he appears in only 3 highly rated movies out of 13 total. I think this is because he's in so many movies that some of them are bound to be bad. I wasn't familiar with Linda Cardellini, Kyle Bornheimer, or Haruka Abe, but I rated the three movies I saw them in highly. I should seek out more of their movies.