Topic: Re-Weighting the IMDB 250

Hey y'all. This came up in the chat, and I decided to fuck around with it and see what I could do:

So, the IMDB Top 250 seems to lean disproportionately towards recent films, so I decided to see if I could account for this somewhat. Disclaimer: It's a Saturday and I have been drinking, so I've not been super-rigorous about this, but hey, fuck it, what do you want from me?

So, here's the top 25, as of today. Here are the post-2000 entries:
Dark Knight (2008) at number 4.
Lord of The Rings (2003, 2001) at 9 and 11, respectively.
Inception (2010) at 14.
Lord of The Rings (2002) at 16.
Interstellar (2014! wtf?!?) at 20.
City of God (2002) at 22. (I do really like this one, though.)

http://i.imgur.com/JYOsIsJ.png

Okay. If you take the 250 and do a histogram by year, you can see that there are a disproportionate number of post-90s films. Obviously, there are a lot of reasons for this, and one of them is that Gen-Y and Millennials are more internet-savvy, so they're going to be voting for more recent movies. Since a movie's score is weighted by the number of votes it gets, the list will tend to favour films with more votes (i.e. recent ones).

http://i.imgur.com/hjLyX7X.png
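
To make the vote-weighting point concrete, here's a minimal sketch of the Bayesian-style weighted rating that IMDb has described for the Top 250 in the past. The defaults for `m` (minimum votes) and `C` (site-wide mean rating) are assumptions for illustration, not values taken from this analysis, and the two films are hypothetical.

```python
def weighted_rating(R, v, m=25000, C=7.0):
    """Blend a film's own mean rating R with the global mean C.

    v is the film's vote count. m and C are ASSUMED values for illustration.
    The fewer votes a film has, the harder it is pulled toward C.
    """
    return (v / (v + m)) * R + (m / (v + m)) * C

# Two hypothetical films with the same raw mean but very different vote counts.
few_votes = weighted_rating(R=9.0, v=30_000)
many_votes = weighted_rating(R=9.0, v=1_000_000)
print(round(few_votes, 2), round(many_votes, 2))
```

With the same raw average, the film with more votes ends up with the higher weighted score, which is the mechanism that would favour heavily-voted recent films.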

Disclaimer: So, there actually isn't a significant relationship (linear regression) between year and score, which is unexpected. However, this is complicated by the fact that all the scores are bounded between 8.2 and 9.2, and that there are 94 years' worth of movies (among other issues). Let's just continue as if this bias were detectable.
There IS a significant relationship between year and the frequency of appearances on the list, with a slope of about 5%.

http://i.imgur.com/ol40366.png
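
For anyone who wants to reproduce the year-vs-frequency fit, here's a rough sketch of one way to do it: count films per release year, then fit an ordinary least-squares line through the counts. The function names are my own, the `toy_years` list is made up purely to show the shape of the calculation (it is not the real Top 250), and this only gives the slope, not a significance test.

```python
from collections import Counter

def films_per_year(years):
    """Return (xs, ys): every year in the range and how many films it has."""
    counts = Counter(years)
    xs = list(range(min(years), max(years) + 1))
    ys = [counts.get(x, 0) for x in xs]  # years with no films count as 0
    return xs, ys

def ols_slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Made-up release years, NOT the real list.
toy_years = [1950, 1975, 1994, 1994, 1999, 2001, 2003, 2008, 2008, 2010]
xs, ys = films_per_year(toy_years)
slope = ols_slope(xs, ys)
print(slope)  # positive means more list entries in later years
```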

So, year has a 5% contribution, to some extent. My first attempt is to adjust each film's score by weighting it by year: the oldest film (1921) keeps its whole score, the newest films (2014) get only 95% of theirs, and everything in between is scaled linearly.

http://i.imgur.com/LH8qmBE.png
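
The linear weighting described above can be sketched roughly like this. The 1921/2014 endpoints and the 5% maximum penalty come from the post; the function names and the two example films (with illustrative scores, not values from the .csv) are assumptions.

```python
OLDEST, NEWEST = 1921, 2014   # year range stated in the post
MAX_PENALTY = 0.05            # newest films keep 95% of their score

def linear_weight(year):
    """1.0 for the oldest year, 0.95 for the newest, a straight line between."""
    t = (year - OLDEST) / (NEWEST - OLDEST)
    return 1.0 - MAX_PENALTY * t

def reweight(films):
    """films: list of (title, year, score). Returns them re-ranked by adjusted score."""
    adjusted = [(title, year, score * linear_weight(year))
                for title, year, score in films]
    return sorted(adjusted, key=lambda film: film[2], reverse=True)

# Hypothetical entries: the scores are illustrative only.
ranked = reweight([("newer film", 2008, 9.0), ("older film", 1957, 8.9)])
for title, year, score in ranked:
    print(title, year, round(score, 3))
```

In this toy case the older film overtakes the newer one even though its raw score is lower, which is the kind of reshuffling seen in the re-ranked list.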

Slight improvement. Only 3 post-2000 films in the top 25, this time:
Return of The King dropped from 9 to 10.
Dark Knight from 4 to 13.
Fellowship from 11 to 19.

http://i.imgur.com/4fJuBz8.png

But I think a linear weighting works badly, since it also penalises films made in the 60s/70s (the effect is small, but it still changes the ranking). So, we need a fairer weighting.
Here's the .csv of these top 250 if you wanna take a look:
http://www.filedropper.com/linearimdbtop250

Exponential weighting. Look, I'll be honest with y'all... dealing with exponential distributions when I'm sober and focused is bad enough as it is. It's trial-and-error at the best of times, and I don't really understand this "vector of quantiles" malarkey. I have been drinking. So I just fucked around with the numbers until the curve and the axes looked reasonable. It's not perfect, and I would've liked the penalty on post-2000s to be higher, and the slope on the pre-2000s to be shallower, but, whatever, fuck it.

http://i.imgur.com/1mndecV.png
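
The post doesn't give the actual curve, so this is only a guess at the general shape: a penalty that stays near zero for older films and ramps up towards 5% for the newest ones. The steepness `K` is a made-up knob (not the value used for the plot above), and the function name is an assumption too.

```python
import math

OLDEST, NEWEST = 1921, 2014   # year range stated in the post
MAX_PENALTY = 0.05            # newest films keep 95% of their score
K = 5.0                       # HYPOTHETICAL steepness; larger = later, sharper drop

def exponential_weight(year, k=K):
    """1.0 for the oldest year, 0.95 for the newest, with the drop back-loaded."""
    t = (year - OLDEST) / (NEWEST - OLDEST)
    # expm1(k*t) / expm1(k) runs from 0 to 1 but stays small until t nears 1.
    return 1.0 - MAX_PENALTY * math.expm1(k * t) / math.expm1(k)

for year in (1921, 1970, 2000, 2014):
    print(year, round(exponential_weight(year), 4))
```

Raising `K` makes the pre-2000 slope shallower and pushes more of the penalty onto recent films; lowering it towards zero brings the curve back towards the linear version.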

Hmm. Your mileage may vary. I think it could do better with a different exponential weighting, but I can't be arsed. The Dark Knight and LOTR are still too goddamn high, but, whatever.
The important thing is that 12 Angry Men is closer to number 1.

http://i.imgur.com/e6TWO1C.png

Anyway, here's the .csv of the re-weighted top 250:
http://www.filedropper.com/exponentialimdbtop250
It's an interesting re-ordering, and arguably would cause fewer arguments than the current 250. Maybe one day I'll try with a better exponential curve.


Disclaimer: I'm a professional, but not at - oh shit, wait, I am a professional at this. When I'm sober. I'm not a professional right now though.

Disclaimer: if you dislike the tone of a post I make, re-read it in a North/East London accent until it sounds sufficiently playful :)

Re: Re-Weighting the IMDB 250

Interesting.

This would somewhat balance out the fact that people who are more inclined to like or love a film tend to watch it closer to release. Depending on the factors one considers, I could see this being the wrong approach as well. As pointed out, the type of weighting probably just depends on what the individual thinks looks good once it's re-weighted, factoring in their biases and previous reference points.


Re: Re-Weighting the IMDB 250

Yeah, and all critics struggle with the perspective problem. It's easy to forget how great The Godfather or The Sting is if you haven't seen it in 20 years. As much as I love Hitchcock, he probably gets overrated because his movies are highly rewatchable. (But if you consider rewatchability part of what makes a film great, then he's probably correctly rated.)

For that and similar reasons, I don't put much stock in any film list that isn't sufficiently narrow to provide context and perspective. Of course, I'm not a list maker, myself.

Last edited by Zarban (2015-02-08 22:52:34)

Warning: I'm probably rewriting this post as you read it.

Zarban's House of Commentaries