Thanks everyone for using Quizbowl DB so much.
We now have almost 37,000 buzzes on questions and over 200 users!
I can safely say that I have accumulated a lot of statistics.
Now the time has come for a ratings/leaderboard formula, so I would like to ask everyone to help with this effort.
Information that I keep track of:
1. user that buzzed in
2. question (and all relevant question information) that they buzzed in on
3. where in the question they buzzed in (the score; always >0)
4. if they got it right or wrong (1 or 0 respectively)
Using this information, I have created another table that contains data about tournaments. It contains:
1. tournament name (e.g. 2009 DAFT)
2. average accuracy of tournament (ranges 0-1)
3. average score (earliness of buzz) of tournament (ranges >0)
This tournament difficulty ranking is actually quite accurate. It always puts the Chicago Opens as the most difficult and the CMST as the easiest. Note that these tournament rankings are live and change with every buzz. They also give quite a bit of insight into how standardized a tournament's difficulty is.
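For anyone who wants to play with the numbers, here is a rough Python sketch of how a tournament table like this could be built from the raw buzzes. The tuple layout and the name `tournament_stats` are made up for illustration, and it assumes both averages are taken over every buzz, right or wrong; the site's actual code may well do it differently.

```python
from collections import defaultdict


def tournament_stats(buzzes):
    """Aggregate raw buzzes into per-tournament averages.

    buzzes: iterable of (user, tournament, score, correct) tuples,
    where correct is 1 or 0 (hypothetical layout, not the real schema).
    """
    # Per tournament: [number of buzzes, sum of scores, number correct]
    totals = defaultdict(lambda: [0, 0.0, 0])
    for _user, tournament, score, correct in buzzes:
        entry = totals[tournament]
        entry[0] += 1
        entry[1] += score
        entry[2] += correct
    return {
        name: {"accuracy": right / count, "avg_score": score_sum / count}
        for name, (count, score_sum, right) in totals.items()
    }


# Made-up sample data, only to show the shape of the result.
sample = [
    ("alice", "2009 DAFT", 20, 1),
    ("bob", "2009 DAFT", 40, 0),
]
print(tournament_stats(sample))
```

Recomputing from scratch like this is the simplest option; a live table that changes with every buzz would more likely keep the running counts and update them one buzz at a time.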
I decided I should rank each buzz in relation to the average for the tournament it came from. For example, if you buzzed in after 25 words on the Collaborative MS tournament (which has an avg score of about 25), you would get significantly fewer points for that question than if you buzzed in after 25 words on the 2010 Chicago Open (avg score of 115). Similarly, if you get a question wrong on Chicago Open (accuracy of 34%), you should be penalized less than if you negged a question in CMST (accuracy of 70%).
My current formula for a player's "rating" is somewhat convoluted:
(Note: correct is 0 if wrong and 1 if correct)
Score for a single question: (correct-tournament_accuracy)*(tournament_avgscore/score)
Player rating = Sum(question_scores)*Log(# of questions buzzed in on,50)
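In case it helps to test alternatives, the two lines above can be sketched in Python roughly like this. The function names are made up, and the sketch assumes that Log(n,50) means the logarithm of n in base 50 (some systems take the arguments in the opposite order, so treat that as a guess).

```python
import math


def question_score(correct, score, tournament_accuracy, tournament_avgscore):
    """Score for one buzz: (correct - accuracy) * (avgscore / score)."""
    return (correct - tournament_accuracy) * (tournament_avgscore / score)


def player_rating(question_scores, base=50):
    """Sum of question scores times log_base(number of buzzes).

    Assumption: Log(n, 50) is read as log base 50 of n.
    """
    n = len(question_scores)
    if n == 0:
        return 0.0  # no buzzes, no rating (choice made for this sketch)
    return sum(question_scores) * math.log(n, base)


# The two tournaments mentioned above, both with a correct buzz after 25 words.
cmst = question_score(1, 25, 0.70, 25)           # roughly 0.3
chicago_open = question_score(1, 25, 0.34, 115)  # roughly 3.04
print(cmst, chicago_open)
print(player_rating([cmst, chicago_open]))
```

Under that reading of the log, the multiplier is 0 for a single buzz, stays below 1 until a player reaches 50 buzzes, and grows slowly after that.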
This seems really weird, but basically, I'm trying to reward people who have buzzed in a lot, rather than people who have just buzzed in on 2 questions and gotten them right.
A trend I noticed with this formula is that it rewards accuracy much more than buzzing in early, because a neg actually lowers your rating.
Another thing is that about half the people in the database have negative ratings and the other half have positive ones.
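To put rough numbers on the accuracy trend, here is a small Python sketch using the CMST figures quoted earlier (accuracy 70%, avg score about 25). The function name and the 5-word buzz are invented for the example.

```python
def question_score(correct, score, tournament_accuracy, tournament_avgscore):
    """Score for one buzz: (correct - accuracy) * (avgscore / score)."""
    return (correct - tournament_accuracy) * (tournament_avgscore / score)


ACCURACY, AVG_SCORE = 0.70, 25  # CMST-like numbers from the post

right_at_avg = question_score(1, 25, ACCURACY, AVG_SCORE)  # about +0.3
neg_at_avg = question_score(0, 25, ACCURACY, AVG_SCORE)    # about -0.7
early_neg = question_score(0, 5, ACCURACY, AVG_SCORE)      # about -3.5

print(right_at_avg, neg_at_avg, early_neg)
```

So on an easy tournament, one neg at the average point costs more than two correct buzzes at the same point earn, and because the earliness factor multiplies the penalty too, an early neg is punished even harder. As for the split in ratings, (correct - tournament_accuracy) averages out to zero over all the buzzes in a tournament before the earliness factor is applied, so ratings spread around zero are probably what this formula should be expected to produce.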
Any comments/suggestions for a better formula? I really suck at math and hope that someone better can fix errors and come up with a better one.
I feel that a lot of this data will be highly useful to the quizbowl community.
EDIT: updated formula