Announcing BEeS: A Better Electronic Stats program

theMoMA
Forums Staff: Administrator
Posts: 5993
Joined: Mon Oct 23, 2006 2:00 am

Post by theMoMA »

I'm thrilled to announce that sometime in the coming semester, Sean Skaar and I will debut BEeS, a new statistics and tournament organizing program that will revolutionize the way we schedule and organize tournaments, and how we record and view statistics.

BEeS consists of three programs: the Tournament Director (BTD), Quizbowl Reader (BQR) and Statistics Viewer (BSV). BTD parses the questions from their raw packet form into a markup that can be read by the BQR in a painless, user-friendly process that includes easy error correction. BTD is also a powerful tournament directing program with an integrated automatic scheduler. The tournament file generated from BTD that is loaded into the BQR on each moderator's computer includes all questions and scheduling information; not only is it impossible for moderators to read the wrong round, the moderators simply choose their room from the list that the tournament director enters at the beginning of the scheduling process and all of the already-scheduled matchups automatically pop up on the screen. In other words, the program will tell the moderator which teams are supposed to be in the room. If you have rosters ahead of time, the names of players will be included. If not, no worries. You can use our easy-to-use correction options if one player is entered in as "Eric M." and "Eric Mukherjee" in different rounds.

Each tournament computer must have BQR installed; the single screen of that program incorporates an intuitive scorekeeping system with the questions themselves. Each moderator computer will create an input file that can be loaded into BTD at any time (after a round, after a prelim bracket concludes, etc) for to-the-minute updates. We are also working to support automatic updates and other functionality via wireless network.

The BSV is non-essential to running tournaments, but it is a free-to-download program that allows any interested parties to load the raw data file from a completed tournament for powerful data manipulation. The BTD also produces a sortable html web report with any stats the user desires.

The truly revolutionary thing about BEeS is the information that our interface lets us painlessly collect. In order to give a player credit for answering a tossup, the moderator must click on the word on which the player buzzed. In other words, we've found a way to get advanced data on where players buzz on each question. The moderator must also award bonus points on a part-by-part basis. No longer will bonus answerability be a chore to calculate; BEeS does it automatically, and allows you to sort questions by advanced answerability metrics. Want to see which bonus parts were the least converted? Which tossups were the most negged? Which tossups went dead the most? It's all there.

BEeS also supports (actually, encourages) the use of markup languages for questions. NAQT's QBML or HSAPQ's markup language are both valid inputs to the parser. In addition to making it much easier to parse questions, using markup language allows BEeS to calculate team and individual stats by category, subcategory, time period, and other exciting ways to sort things. Want to know who is getting the most points on American History? World Literature? Questions from the 1800s? If you use a markup language, you can.

That pretty much sums up what BEeS can do. Please check out http://www.beesqb.com for more info (my first foray into CSS, so please tell me if there are any problems with the site). We're currently in the testing phase, and hope to have a beta version for a trial run at MUT (March 7). If that goes well, expect BEeS to debut on the circuit soon thereafter. Some of the links are currently not working. I actually have set up our ordering system through PayPal, but thought it best not to let people pay us until we had something available, so those links aren't available right now. Links to specific screenshots, tutorial videos, and sample tournaments will get you 404'd right now, but I'll add them as soon as I can. Everything else should work, so let me know if it doesn't (or if it looks crappy).

Stats contest

Since the data that BEeS can provide has never before been available, we're looking for new stats that use that data. I've already invented one, Hap (Hart adjusted points; insert vengeful god jokes here), calculated like this:

10 * [a / n] = Hap

Where "a" reflects where the player's buzz ranks relative to other players' buzzes on that question at the tournament (in the example here, the number of rooms minus the number of faster buzzes), and n is the total number of rooms. For example, if there are 10 rooms, and your buzz was the 3rd fastest, you get 8 Hap for that buzz, because across all rooms, you had an 80% chance of getting 10 points. Sum up all Hap and divide by games played to get HapPG.
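A minimal sketch of the Hap arithmetic, assuming "a" is the number of rooms minus the number of faster buzzes (the reading that matches the 10-room example). The function names `hap` and `hap_per_game` are hypothetical, not part of BEeS.

```python
def hap(rank: int, rooms: int) -> float:
    """Hap for one correct buzz: 10 * a / n.

    Assumption: a = rooms - (rank - 1), where rank 1 is the fastest buzz
    on that question across the whole tournament.
    """
    if not 1 <= rank <= rooms:
        raise ValueError("rank must be between 1 and the number of rooms")
    a = rooms - (rank - 1)
    return 10 * a / rooms


def hap_per_game(ranks: list[int], rooms: int, games_played: int) -> float:
    """HapPG: sum of Hap over all of a player's buzzes, divided by games played."""
    return sum(hap(rank, rooms) for rank in ranks) / games_played


# The example from the post: 10 rooms, 3rd-fastest buzz.
print(hap(3, 10))
```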

The contest will go something like this: invent a stat that we think is cool, and we will automatically calculate it in the program and name it after you. You will also get a $5 coupon for a one-time BEeS license, or a $15 coupon for a personal yearly license.

Please send all submissions to [email protected].
Andrew Hart
Minnesota alum
cdcarter
Yuna
Posts: 945
Joined: Thu Nov 15, 2007 12:06 am
Location: Minneapolis, MN

Re: Announcing BEeS: A Better Electronic Stats program

Post by cdcarter »

Since I have started on stuff like this several times but never finished, a few suggestions:
* Integration with a swiss pair system. After a round, next room shows up for the moderator to announce. Would allow a better way of keeping track of a card system
* Stats exportable to SQBS, HTML, and .xls, full rounds able to be exported to XML would be awesome too. Unless (and I am guessing you aren't for this) you just open source the whole deal so we can play
* Timer? NSC format? Bouncebacks? IL Bonuses?

And of course, a stat. For each buzz let percent_left = 1 - (buzz_point/tossup length). Averaged over tournament and subject areas. It's a simple measure of depth
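A rough sketch of that stat, assuming buzz points and tossup lengths are both counted in words. The names `percent_left` and `average_percent_left` are made up for illustration.

```python
def percent_left(buzz_point: int, tossup_length: int) -> float:
    """Fraction of the tossup still unread at the moment of the buzz."""
    if tossup_length <= 0 or not 0 <= buzz_point <= tossup_length:
        raise ValueError("buzz_point must lie within the tossup")
    return 1 - buzz_point / tossup_length


def average_percent_left(buzzes: list[tuple[int, int]]) -> float:
    """Average over (buzz_point, tossup_length) pairs, e.g. one subject area."""
    return sum(percent_left(b, n) for b, n in buzzes) / len(buzzes)


# Two buzzes on 100-word tossups, at word 25 and word 50.
print(average_percent_left([(25, 100), (50, 100)]))
```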
Christian Carter
Minneapolis South High School '09 | Emerson College '13
PACE Member (retired)
theMoMA
Forums Staff: Administrator
Posts: 5993
Joined: Mon Oct 23, 2006 2:00 am

Re: Announcing BEeS: A Better Electronic Stats program

Post by theMoMA »

Thanks for the suggestions, Chris. I should have said before that we are actively soliciting requests for features and options to make BEeS as useful as possible, so if anyone else has suggestions, please post them here.
Andrew Hart
Minnesota alum
dtaylor4
Auron
Posts: 3733
Joined: Tue Nov 16, 2004 11:43 am

Re: Announcing BEeS: A Better Electronic Stats program

Post by dtaylor4 »

When the moderator clicks on a word, will the program automatically know whether the tossup was powered, and if so, how will the moderator know?

Also, the HapPG statistic you mention should account for powers. Of course, the Hap per tossup should depend on how many points the person got, not the maximum.
Important Bird Area
Forums Staff: Administrator
Posts: 6113
Joined: Thu Aug 28, 2003 3:33 pm
Location: San Francisco Bay Area

Re: Announcing BEeS: A Better Electronic Stats program

Post by Important Bird Area »

dtaylor4 wrote:When the moderator clicks on a word, will the program automatically know whether the tossup was powered, and if so, how will the moderator know?
The program certainly will, if it takes QBML as an input. I presume there is some kind of reminder to the moderator? (Up to and including comedy FIFTEEEEEEEN audio option?)
Jeff Hoppes
President, Northern California Quiz Bowl Alliance
former HSQB Chief Admin (2012-13)
VP for Communication and history subject editor, NAQT
Editor emeritus, ACF

"I wish to make some kind of joke about Jeff's love of birds, but I always fear he'll turn them on me Hitchcock-style." -Fred
BGSO
Tidus
Posts: 685
Joined: Sat Aug 11, 2007 12:36 pm
Location: Champaign-Urbana and Arlington heights IL

Re: Announcing BEeS: A Better Electronic Stats program

Post by BGSO »

A vulch stat? Say if the question is answered incorrectly by one team, and the person on the other team doesn't wait for the question to be over, it is considered a vulch?

Something that kind of spawns off of that is a stat that maybe tracks how many TU's you answer off of rebounds vs. you answer first, I don't have a good name for it but it's just an idea.
David Garb-
Buffalo Grove High School '09
UIUC-'13

Former member of the most dysfunctional scholastic bowl team in Illinois.
(11:23:30 PM) garb: Wait, are you talking about the porn or the reeses?
Mike Bentley
Sin
Posts: 6461
Joined: Fri Mar 31, 2006 11:03 pm
Location: Bellevue, WA

Re: Announcing BEeS: A Better Electronic Stats program

Post by Mike Bentley »

Sounds cool. I'd be willing to offer my skills as a "professional Software Development Engineer in Test" to beta testing this.
Mike Bentley
Treasurer, Partnership for Academic Competition Excellence
Adviser, Quizbowl Team at University of Washington
University of Maryland, Class of 2008
Sen. Estes Kefauver (D-TN)
Chairman of Anti-Music Mafia Committee
Posts: 5647
Joined: Wed Jul 26, 2006 11:46 pm

Re: Announcing BEeS: A Better Electronic Stats program

Post by Sen. Estes Kefauver (D-TN) »

Points-per-volume of facial hair.
Charlie Dees, North Kansas City HS '08
"I won't say more because I know some of you parse everything I say." - Jeremy Gibbs

"At one TJ tournament the neg prize was the Hampshire College ultimate frisbee team (nude) calender featuring one Evan Silberman. In retrospect that could have been a disaster." - Harry White
Whiter Hydra
Auron
Posts: 1418
Joined: Tue Dec 04, 2007 8:46 pm
Location: Fairfax, VA

Re: Announcing BEeS: A Better Electronic Stats program

Post by Whiter Hydra »

Buzzer Race Spot -- a word on which more than 35% of rooms buzz in on a question?
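One possible way to flag such spots, assuming the program records one buzz word index per room for each question. The function name `buzzer_race_spots` and the input layout are hypothetical.

```python
from collections import Counter


def buzzer_race_spots(buzz_words: list[int], num_rooms: int,
                      threshold: float = 0.35) -> list[int]:
    """Word indices on which more than `threshold` of all rooms buzzed."""
    counts = Counter(buzz_words)
    return sorted(word for word, n in counts.items() if n / num_rooms > threshold)


# Four of ten rooms buzz on word 20, so word 20 is flagged.
print(buzzer_race_spots([20, 20, 20, 20, 35, 50], num_rooms=10))
```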

Also, great idea. Especially with the integration of the packets and the stats. I've been thinking about trying something similar for the MOHIT, if I weren't so busy editing questions.
Harry White
TJHSST '09, Virginia Tech '13

Owner of Tournament Database Search and Quizbowl Schedule Generator
Will run stats for food
BuzzerZen
Auron
Posts: 1517
Joined: Thu Nov 18, 2004 11:01 pm
Location: Arlington, VA/Hampshire College

Re: Announcing BEeS: A Better Electronic Stats program

Post by BuzzerZen »

So, like, are you going to be writing a native client for each OS, or is this going to be employing everybody's least-favorite cross-platform programming environment, Java?

Edit: Also, I call shenanigans.
Evan Silberman
Hampshire College 07F

How are you actually reading one of my posts?
Sima Guang Hater
Auron
Posts: 1958
Joined: Mon Feb 05, 2007 1:43 pm
Location: Nashville, TN

Re: Announcing BEeS: A Better Electronic Stats program

Post by Sima Guang Hater »

You could implement some measure of generalist ability, i.e. what percentage of questions outside of one's best category someone gets. Or maybe sort the category percentages from low to high on a scatter plot, and measure the y-intercept of the best-fit line. The higher it is, the better generalist the person is. You might have to implement a correction for having a specialist teammate, though.
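A sketch of the best-fit-line idea, under some assumptions: each category gets a conversion rate between 0 and 1, the sorted rates are plotted at x = 0, 1, 2, ..., and the fit is ordinary least squares. The name `generalist_intercept` is hypothetical, and no teammate correction is attempted.

```python
def generalist_intercept(category_rates: list[float]) -> float:
    """Y-intercept of the least-squares line through the sorted category rates."""
    ys = sorted(category_rates)
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two categories")
    xs = range(n)  # assumption: categories are evenly spaced, starting at x = 0
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x


# A flat profile has a higher intercept than a specialist's steep one.
print(generalist_intercept([0.4, 0.4, 0.4]), generalist_intercept([0.0, 0.1, 0.9]))
```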
Eric Mukherjee, MD PhD
Brown 2009, Penn Med 2018
Instructor/Attending Physician/Postdoctoral Fellow, Vanderbilt University Medical Center
Coach, University School of Nashville

“The next generation will always surpass the previous one. It’s one of the never-ending cycles in life.”
Support the Stevens-Johnson Syndrome Foundation
Mechanical Beasts
Banned Cheater
Posts: 5673
Joined: Thu Jun 08, 2006 10:50 pm

Re: Announcing BEeS: A Better Electronic Stats program

Post by Mechanical Beasts »

The Quest for the Historical Mukherjesus wrote:You could implement some measure of generalist ability, i.e. what percentage of questions outside of one's best category someone gets. Or maybe sort the category percentages from low to high on a scatter plot, and measure the y-intercept of the best-fit line. The higher it is, the better generalist the person is. You might have to implement a correction for having a specialist teammate, though.
I like the latter idea better, but I'm not sure precisely what you mean--are "category percentages" just the proportion each subject makes up of your total points? I'd rather compute points earned per available for each category--that's more likely what you meant.

The specialist teammate seems hard to factor in, because it's hard to say whether you would have known the answer at all. (Ted nails a tossup on Kobo Abe, I might have gotten it before our opponent, too; Ted nails a tossup on Unamuno, I might have gotten it on a guess at the giveaway.) Perhaps if you have a teammate who [gets more than x% of his points from subject n | converts more than x% of points available in subject n], then that subject is discarded from your scatter plot entirely since it's a likely outlier? So you're not held responsible for your apparently poor knowledge of soccer, for example.
Andrew Watkins
dschafer
Rikku
Posts: 291
Joined: Fri Apr 29, 2005 8:03 pm
Location: Carnegie Mellon University

Re: Announcing BEeS: A Better Electronic Stats program

Post by dschafer »

I just want to add my support for:
cdcarter wrote: * Stats exportable to SQBS, HTML, and .xls, full rounds able to be exported to XML would be awesome too. Unless (and I am guessing you aren't for this) you just open source the whole deal so we can play
The former would be nice; the latter would be amazingly awesome.
Dan Schafer
Carnegie Mellon '10
Thomas Jefferson '06
theMoMA
Forums Staff: Administrator
Posts: 5993
Joined: Mon Oct 23, 2006 2:00 am

Re: Announcing BEeS: A Better Electronic Stats program

Post by theMoMA »

BuzzerZen wrote:So, like, are you going to be writing a native client for each OS, or is this going to be employing everybody's least-favorite cross-platform programming environment, Java?
Well, I'm not going to be writing native clients for anything. I couldn't program my way out of a Jacquard loom (it took me several hours of intense guess and check to figure out how to make css not kill me). I can ask Sean, though.
Andrew Hart
Minnesota alum
Matt Weiner
Sin
Posts: 8145
Joined: Fri Apr 11, 2003 8:34 pm
Location: Richmond, VA

Re: Announcing BEeS: A Better Electronic Stats program

Post by Matt Weiner »

Neg leverage:

*All else being equal, a neg on a question that the opponent does not convert is more damaging than a neg on a question that the opponent does convert. This is counterintuitive at first, but it basically means that you only cost yourself 5 points on a question that the other team was going to get anyway if you waited, whereas, if it's a question the other team doesn't know, you cost yourself the chance to not lose 5, pick up 10, and hear the bonus because you could have let it go all the way to the end.

*Negs are more damaging based on how well your teammates know the category, which the program can also calculate from the performance-by-category statistics.

Negs can thus be taken into account based on how many points they actually cost your team. For example, if an average European history tossup has a 75% chance of being answered correctly by your team, and your team averages 18 points per bonus, then a neg on European history costs your team 26 points (5 for the neg, 21 for the 75% chance of getting 10 on a tossup and 18 on a bonus). There should also be some correction for whether the opponent answered the tossup (basically, scaling the neg down based on how likely it was for anyone to get the question at all, tournament-wide) but I'm not sure how to implement that.
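The arithmetic in that example can be sketched as follows. The function name `neg_cost` and its parameters are hypothetical, and the opponent-conversion correction mentioned above is left out.

```python
def neg_cost(conversion_rate: float, points_per_bonus: float,
             neg_penalty: float = 5, tossup_value: float = 10) -> float:
    """Expected points a neg costs: the penalty plus the forgone tossup and bonus."""
    return neg_penalty + conversion_rate * (tossup_value + points_per_bonus)


# European history example: 75% conversion, 18 points per bonus.
print(neg_cost(0.75, 18))
```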

Subject leverage:

Similarly, players/teams should get more credit in subject performance based on how strong the opponents are in each subject. If your 20 PPB opponent normally gets 80% of science questions, but you go 4/0 on the science, then you've prevented 96 likely points; if your 15 PPB opponent normally gets 20% of science questions, and you go 4/0, you've only prevented 20 points. By doing this for each player on each subject in each game you can build a total player effectiveness rating based on "expected points prevented," that inherently takes into account quality of opponents; like the Litvak line but more advanced.
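A small sketch of the "expected points prevented" arithmetic from the two science examples. The name `expected_points_prevented` is hypothetical, and the sketch assumes each tossup is worth 10 plus the opponent's average bonus.

```python
def expected_points_prevented(tossups_won: int, opp_conversion_rate: float,
                              opp_ppb: float, tossup_value: float = 10) -> float:
    """Points the opponent would likely have scored on tossups you took instead."""
    return tossups_won * opp_conversion_rate * (tossup_value + opp_ppb)


# Going 4/0 against a strong science team versus a weak one.
print(round(expected_points_prevented(4, 0.8, 20), 6))
print(round(expected_points_prevented(4, 0.2, 15), 6))
```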
Matt Weiner
Advisor to Quizbowl at Virginia Commonwealth University / Founder of hsquizbowl.org
Maxwell Sniffingwell
Auron
Posts: 2164
Joined: Sun Feb 12, 2006 3:22 pm
Location: Des Moines, IA

Re: Announcing BEeS: A Better Electronic Stats program

Post by Maxwell Sniffingwell »

A list of the "10 best buzzes" of the tournament - the fastest buzzes of the tournament and who got them on what answer.
Greg Peterson

Northwestern University '18
Lawrence University '11
Maine South HS '07

"a decent player" - Mike Cheyne
Alejandro
Wakka
Posts: 226
Joined: Mon Jul 10, 2006 8:39 pm
Location: Seattle, WA

Re: Announcing BEeS: A Better Electronic Stats program

Post by Alejandro »

This sounds really cool.
As for adding statistics, I think it'd be great if you could define your own custom statistics using something like an equation editor; i.e. something that would allow you to do

 $TU_PLAYER * 10 - $NEGS_PLAYER * 5
to find out the PPG for each player.
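As a rough illustration of how such an equation editor could evaluate a formula, here is a sketch that parses the expression with Python's `ast` module and allows only numbers, named variables, and basic arithmetic. The function `evaluate_stat` and the `values` dictionary are hypothetical; nothing here reflects how BEeS is actually built.

```python
import ast
import operator
import re

# Only these arithmetic operators are allowed in a custom formula.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def evaluate_stat(formula: str, values: dict[str, float]) -> float:
    """Evaluate a formula such as '$TU_PLAYER * 10 - $NEGS_PLAYER * 5'."""
    # Turn $NAME into a plain identifier so the parser accepts it.
    expr = re.sub(r"\$(\w+)", r"\1", formula).strip()
    tree = ast.parse(expr, mode="eval")

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return values[node.id]  # KeyError if the variable is unknown
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")

    return walk(tree)


# 12 tossups and 3 negs for one player.
print(evaluate_stat("$TU_PLAYER * 10 - $NEGS_PLAYER * 5",
                    {"TU_PLAYER": 12, "NEGS_PLAYER": 3}))
```

Walking the syntax tree instead of calling `eval` keeps user-entered formulas from running arbitrary code.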
Alejandro
Naperville Central '07
Harvey Mudd '11
University of Washington '17
Auroni
Auron
Posts: 3145
Joined: Thu Nov 15, 2007 6:23 pm

Re: Announcing BEeS: A Better Electronic Stats program

Post by Auroni »

Some measure of bonus gradation ability -- you pick the bonus that shows the closest conversion to the conversion desired (this can be set, say 90-40-10) and then there's a differential for each bonus showing how close it came to this ideal.
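One way the differential could be computed, assuming each bonus has three part-conversion rates and that the parts are compared easiest-first against the ideal. The name `gradation_gap` and the choice of summed absolute differences are assumptions.

```python
def gradation_gap(part_conversions: list[float],
                  ideal: tuple[float, ...] = (0.9, 0.4, 0.1)) -> float:
    """Sum of absolute differences between actual and ideal part conversion."""
    if len(part_conversions) != len(ideal):
        raise ValueError("bonus must have as many parts as the ideal profile")
    actual = sorted(part_conversions, reverse=True)  # easiest part first
    return sum(abs(a - i) for a, i in zip(actual, ideal))


# A bonus converted at 80/50/20 is 0.3 away from the 90-40-10 ideal.
print(round(gradation_gap([0.8, 0.5, 0.2]), 6))
```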
Auroni Gupta (she/her)
First Chairman
Auron
Posts: 3651
Joined: Sat Apr 19, 2003 8:21 pm
Location: Fairfax VA

Re: Announcing BEeS: A Better Electronic Stats program

Post by First Chairman »

Excuse me if it has already been covered. If it has, just ignore this post.

Can we also get a listing of tossups (or bonus parts if we're going to be that sophisticated) that went completely dead, or were on the upper percentile tier of difficulty (only one or two people got)? I know it takes Charlie Steinhice an incredible amount of effort to do this for his events.

Of course I'm also someone who wants to see tossups that got answered after an interrupt by the opponent (the "rebound"/vulching effect).
Emil Thomas Chuck, Ph.D.
Founder, PACE
Facebook junkie and unofficial advisor to aspiring health professionals in quiz bowl
---
Pimping Green Tea Ginger Ale (Canada Dry)
sam.peterson
Lulu
Posts: 84
Joined: Mon Oct 15, 2007 7:05 pm
Location: Chaska, MN

Re: Announcing BEeS: A Better Electronic Stats program

Post by sam.peterson »

A stock/middle clue knowledge stat that records a player's number of buzzes on the first (and maybe second) word out of power.
Sam Peterson
Harvard College '13
Chaska High School '09
Stained Diviner
Auron
Posts: 5085
Joined: Sun Jun 13, 2004 6:08 am
Location: Chicagoland

Re: Announcing BEeS: A Better Electronic Stats program

Post by Stained Diviner »

First Buzzes--On each question, keep track of who got the earliest correct answer across the tournament and count up the total for each person. If there is a tie on a question, you can give each person 1/2 (or 1/3, etc).
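A sketch of the tally, assuming each question maps to a list of (player, word index) pairs for every correct buzz across the tournament. The name `first_buzzes` and the input layout are hypothetical.

```python
from collections import defaultdict


def first_buzzes(correct_buzzes: dict[str, list[tuple[str, int]]]) -> dict[str, float]:
    """Credit the earliest correct buzz on each question, splitting ties evenly."""
    totals: dict[str, float] = defaultdict(float)
    for buzzes in correct_buzzes.values():
        if not buzzes:
            continue  # tossup went dead everywhere
        earliest = min(word for _, word in buzzes)
        winners = [player for player, word in buzzes if word == earliest]
        for player in winners:
            totals[player] += 1 / len(winners)
    return dict(totals)


print(first_buzzes({
    "q1": [("A", 10), ("B", 10), ("C", 30)],  # A and B tie, 1/2 each
    "q2": [("A", 5), ("C", 8)],               # A alone
}))
```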
David Reinstein
Head Writer and Editor for Scobol Solo, Masonics, and IESA; TD for Scobol Solo and Reinstein Varsity; IHSSBCA Board Member; IHSSBCA Chair (2004-2014); PACE President (2016-2018)
dschafer
Rikku
Posts: 291
Joined: Fri Apr 29, 2005 8:03 pm
Location: Carnegie Mellon University

Re: Announcing BEeS: A Better Electronic Stats program

Post by dschafer »

Stats suggestions:
Head-to-Head: play every team against every other team in every round, Goldfish-style, compute W-L accordingly.
The graphs I computed here for tossups
The "buzz percentage" for every word in a question, computed as follows: [number of teams buzzed on that word] / [number of teams still listening to that tossup]. These could then be ordered to show difficulty cliffs.
Category-based difficulties: Show the average PPB for each category, to see if packets are of even difficulty across categories.
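The "buzz percentage" item above could be computed along these lines. This sketch simplifies by counting rooms rather than teams and by ignoring negs, so a room stops "listening" once its tossup is answered; the name `buzz_percentages` is hypothetical.

```python
from collections import Counter


def buzz_percentages(buzz_words: list[int], num_rooms: int) -> dict[int, float]:
    """Map each word index to (buzzes on that word) / (rooms still listening).

    buzz_words holds the word index of the answering buzz in each room that
    converted the tossup; rooms where it went dead contribute no entry.
    """
    counts = Counter(buzz_words)
    listening = num_rooms
    result = {}
    for word in sorted(counts):
        result[word] = counts[word] / listening
        listening -= counts[word]
    return result


# Five rooms: two buzz on word 12, one on word 30, one on word 45, one dead.
print(buzz_percentages([12, 12, 30, 45], num_rooms=5))
```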
Dan Schafer
Carnegie Mellon '10
Thomas Jefferson '06
Oscar Girls Gone Wilde
Lulu
Posts: 9
Joined: Wed Oct 10, 2007 3:22 pm

Re: Announcing BEeS: A Better Electronic Stats program

Post by Oscar Girls Gone Wilde »

My intent is to compile all of the stats into a single file that contains all of the accumulated data from a tournament as well as all of the questions from the tournament, and using the stats display program one can see a graphical overlay of pertinent statistics on top of the actual question set. I'm thinking some highlighting for the upper and lower quartiles of where the questions were answered, ability to see who answered each question first, and where. Any statistic that can be easily derived from the normal quiz bowl stats and the word in the question that an event took place on, be it a neg, power, or correct answer, should be easy to calculate, and interesting statistics that require more information than that may still be considered if I can come up with an elegant implementation for them.

I am currently programming it in Java since it is cross-platform and has an excellent mechanism for reading Word documents in a platform-independent way using only Java code, thanks to the Apache Software Foundation. I am using open source libraries, so I will open source the project. I'll probably have a Subversion repository with my framework and the source code available at some point in the near future if anyone is interested in taking a look or contributing, and I'll probably release stable builds periodically for testing as well if anyone is interested in being my guinea pig.
Sean Skaar
University of Minnesota
Skepticism and Animal Feed
Auron
Posts: 3238
Joined: Sat Oct 30, 2004 11:47 pm
Location: Arlington, VA

Re: Announcing BEeS: A Better Electronic Stats program

Post by Skepticism and Animal Feed »

NCRedhawks wrote: As for adding statistics, I think it'd be great if you could define your own custom statistics
You win this thread.
Bruce
Harvard '10 / UChicago '07 / Roycemore School '04
ACF Member emeritus
My guide to using Wikipedia as a question source
Stained Diviner
Auron
Posts: 5085
Joined: Sun Jun 13, 2004 6:08 am
Location: Chicagoland

Re: Announcing BEeS: A Better Electronic Stats program

Post by Stained Diviner »

What Harry describes as a Buzzer Race Spot is actually a Difficulty Cliff--the terms are synonymous. What Dan describes as a Difficulty Cliff is actually just an easy clue, which is not a problem if it comes after moderately easy clues. (If there is one room where the question is still active, and somebody in that room buzzes in with the answer, then that clue has 100% conversion and nothing bad has occurred.) As a question writer/editor, the Buzzer Race Spot/Difficulty Cliff would be a useful statistic for feedback, though it might work best, like most statistics, with a ranking rather than a threshold.
David Reinstein
Head Writer and Editor for Scobol Solo, Masonics, and IESA; TD for Scobol Solo and Reinstein Varsity; IHSSBCA Board Member; IHSSBCA Chair (2004-2014); PACE President (2016-2018)
Oscar Girls Gone Wilde
Lulu
Posts: 9
Joined: Wed Oct 10, 2007 3:22 pm

Re: Announcing BEeS: A Better Electronic Stats program

Post by Oscar Girls Gone Wilde »

Many of these ideas are feasible with the data I'm planning on collecting. Since I'm collecting the location of a buzz, sorting by standard deviation of that statistic would show the questions with the greatest difficulty cliff. Hoses could be exposed by performing a similar operation on neg location. Some kind of equation editor for custom statistics isn't completely out of the question, although it would inevitably be less robust than a stat I preprogram in myself.
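A minimal sketch of the sort-by-standard-deviation idea, assuming buzz locations are stored as word indices per question. It lists the smallest spread first, on the reading that tightly clustered buzzes point to a cliff; that ordering, the name `rank_by_buzz_spread`, and the use of population standard deviation are all assumptions.

```python
from statistics import pstdev


def rank_by_buzz_spread(buzz_locations: dict[str, list[int]]) -> list[tuple[str, float]]:
    """Questions ordered by the standard deviation of their buzz locations."""
    spreads = {q: pstdev(locs) for q, locs in buzz_locations.items() if len(locs) >= 2}
    return sorted(spreads.items(), key=lambda item: item[1])


# q1's buzzes cluster around word 40; q2's are spread across the question.
print(rank_by_buzz_spread({"q1": [40, 41, 42, 40], "q2": [10, 50, 90], "q3": [5]}))
```

The same function could be fed neg locations instead of correct-buzz locations.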
Last edited by Oscar Girls Gone Wilde on Mon Jan 05, 2009 6:33 pm, edited 1 time in total.
Sean Skaar
University of Minnesota
grapesmoker
Sin
Posts: 6345
Joined: Sat Oct 25, 2003 5:23 pm
Location: NYC

Re: Announcing BEeS: A Better Electronic Stats program

Post by grapesmoker »

This is wonderful news. May I request an XML export option for easy importation into QBDB? It doesn't much matter what the markup is as long as it's intuitive and consistent.

edit: also, can you guys point me to the Java API you use to read Word files?
Jerry Vinokurov
ex-LJHS, ex-Berkeley, ex-Brown, sorta-ex-CMU
presently: John Jay College Economics
code ape, loud voice, general nuissance
Stained Diviner
Auron
Posts: 5085
Joined: Sun Jun 13, 2004 6:08 am
Location: Chicagoland

Re: Announcing BEeS: A Better Electronic Stats program

Post by Stained Diviner »

For hoses, there could be the hosiest clues that inspired the most negs based on when the buzzes occurred and the hosiest questions that would not depend on when the buzzes occurred.
David Reinstein
Head Writer and Editor for Scobol Solo, Masonics, and IESA; TD for Scobol Solo and Reinstein Varsity; IHSSBCA Board Member; IHSSBCA Chair (2004-2014); PACE President (2016-2018)
AlphaQuizBowler
Tidus
Posts: 695
Joined: Mon Dec 03, 2007 6:31 pm
Location: Alpharetta, GA

Re: Announcing BEeS: A Better Electronic Stats program

Post by AlphaQuizBowler »

Shcool wrote:First Buzzes--On each question, keep track of who got the earliest correct answer across the tournament and count up the total for each person. If there is a tie on a question, you can give each person 1/2 (or 1/3, etc).
You could take this First Buzz number, multiply by 10, and divide by games played to get Points Per Solo Game: The number of points someone would get in a hypothetical solo game played against all other tournament players.
William
Alpharetta High School '11
Harvard '15
cdcarter
Yuna
Posts: 945
Joined: Thu Nov 15, 2007 12:06 am
Location: Minneapolis, MN

Re: Announcing BEeS: A Better Electronic Stats program

Post by cdcarter »

Oscar Girls Gone Wilde wrote: I am currently programming it in Java since it is cross platform and has an excellent mechanism to read from word documents in a platform independent way using only Java code thanks to the Apache foundation. I am using open source libraries, so I will open source the project. I'll probably have a subversion of my framework and the source code available at some point in the near future if anyone is interested in taking a look or contributing, and I'll probably release stable builds periodically for testing as well if anyone is interested in being my guinea pig.
I would love to help test/debug/whatever, and I will look forward to seeing anything you have. More quizbowl stuff should be F/OSS
Christian Carter
Minneapolis South High School '09 | Emerson College '13
PACE Member (retired)
Captain Sinico
Auron
Posts: 2675
Joined: Sun Sep 21, 2003 1:46 pm
Location: Champaign, Illinois

Re: Announcing BEeS: A Better Electronic Stats program

Post by Captain Sinico »

Shcool wrote:What Harry describes as a Buzzer Race Spot is actually a Difficulty Cliff--the terms are synonymous. What Dan describes as a Difficulty Cliff is actually just an easy clue, which is not a problem if it comes after moderately easy clues.
I think you're having an issue with terms here. A difficulty cliff is an easy clue that comes without moderately easy clues before it, by definition. The ease of a clue is measured by how many people know it and nothing else.

MaS
Mike Sorice
Former Coach, Centennial High School of Champaign, IL (2014-2020) & Team Illinois (2016-2018)
Alumnus, Illinois ABT (2000-2002; 2003-2009) & Fenwick Scholastic Bowl (1999-2000)
Member, ACF (Emeritus), IHSSBCA, & PACE
Whiter Hydra
Auron
Posts: 1418
Joined: Tue Dec 04, 2007 8:46 pm
Location: Fairfax, VA

Re: Announcing BEeS: A Better Electronic Stats program

Post by Whiter Hydra »

By the way, how will it handle protests?
Harry White
TJHSST '09, Virginia Tech '13

Owner of Tournament Database Search and Quizbowl Schedule Generator
Will run stats for food
Alejandro
Wakka
Posts: 226
Joined: Mon Jul 10, 2006 8:39 pm
Location: Seattle, WA

Re: Announcing BEeS: A Better Electronic Stats program

Post by Alejandro »

grapesmoker wrote:also, can you guys point me to the Java API you use to read Word files?
My guess is that he's using this.
Also, I too would like to test this out when it's ready.
Alejandro
Naperville Central '07
Harvey Mudd '11
University of Washington '17
cvdwightw
Auron
Posts: 3291
Joined: Tue May 13, 2003 12:46 am
Location: Southern CA

Re: Announcing BEeS: A Better Electronic Stats program

Post by cvdwightw »

What kind of statistics thread could it be without Dwight throwing more harebrained nonsense around? Why, no kind of statistics thread at all!

My kind of "dream" statistic is an Individualized Subject Rating; this measures how good a player is at each individual category (through breadth and depth). Furthermore, this kind of statistic can be carried over from tournament to tournament.

I think that this program will have the capability to quickly compute all the necessary data needed to derive this statistic. However, this derivation may be quite time-intensive (have to compute for each player in each of 8 main subjects: lit/hist/sci/RMP/arts/geo/SS/other), so it may not be completely worthwhile to have it included, at least not in the first run of the program. As an initial trial, it may work better for an "overall" rating, with just dropping the "subject" part.

To elaborate on what I mean, I have included the four-step method below.

Step 1: Compute the "tossup points per tossup heard (TPTH)" in each subject
Add up all tossup points in each subject, for each individual and team, as follows:
-(10+X) points for a tossup, where X is computed as 10*(1 - % of other rooms that have buzzed correctly at the point where the tossup is gotten). Ties count for neither player, so if two out of five rooms buzz on the fifth word, those players each get 17.5 points instead of 20.
-(10+Y) points for a rebound, regardless of where it is answered, where Y is computed as 10*(1 - % of rooms that have buzzed correctly at the end of the question).
- (-5) points for a neg

Under this scheme, any player who answers a tossup with a 90% conversion rate earns (at least) 11 points instead of 10, while any player who is the first to buzz on a question earns 20 points. Most tossups will be worth more than 10 points.

Divide this total by the number of questions heard in each subject to get TPTH.
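Step 1 can be sketched as below. The helper names are hypothetical, and the tie handling follows the 17.5-point example: a tied buzz in another room counts as having already buzzed.

```python
NEG_POINTS = -5


def tossup_points(other_rooms_at_or_before: int, other_rooms: int) -> float:
    """10 + X, where X = 10 * (1 - share of other rooms that buzzed no later)."""
    return 10 + 10 * (1 - other_rooms_at_or_before / other_rooms)


def rebound_points(rooms_converted: int, rooms: int) -> float:
    """10 + Y, where Y = 10 * (1 - share of rooms that converted by the end)."""
    return 10 + 10 * (1 - rooms_converted / rooms)


def tpth(points: list[float], tossups_heard: int) -> float:
    """Tossup points per tossup heard in one subject."""
    return sum(points) / tossups_heard


# Five rooms, two of them tie on the fifth word: one of four other rooms tied.
print(tossup_points(1, 4))
# A tied first buzz, an outright first buzz, and a neg over 10 tossups heard.
print(tpth([tossup_points(1, 4), tossup_points(0, 4), NEG_POINTS], 10))
```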

Step 2: Convert your TPTH to your Performance Rating
-A rating of 100 corresponds to a TPTH of 20. This implies that the player not only answered every question in the subject, but answered each question before every other player in the tournament.
-A chance level corresponds to a TPTH of 1.25. Why is this chance level? There are 8 players (typically) in a room. You have a 1 in 8 chance of getting a tossup if everyone has just enough knowledge to get every question in a giveaway buzzer race. Therefore, you are expected to score 1.25 points on each buzzer race. I've also arbitrarily assigned this chance level to a performance rating around 40.
-A rating of 0 corresponds to a TPTH at or below 0.

Currently the formula I have looks like this: RATING = 100 - (1/30)*(e^(8-.4*TPTH)-1), where 0 points actually gives a slight positive rating.

-A player with a TPTH of "just" 10 still has a rating of 98. Really, anything over 3 yields a rating above 70. This does not seem too farfetched to me (a player with a rating of 70 would be answering roughly one out of every 3-4 questions in the subject, or somewhere between 3-10 ppg on that subject depending on the subject).
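The rating formula is easy to check numerically. In this sketch the function name `performance_rating` is hypothetical, and clamping negative results to 0 is an assumption based on the statement that a rating of 0 corresponds to a TPTH at or below 0.

```python
import math


def performance_rating(tpth: float) -> float:
    """RATING = 100 - (1/30) * (e^(8 - 0.4*TPTH) - 1), floored at 0."""
    rating = 100 - (math.exp(8 - 0.4 * tpth) - 1) / 30
    return max(0.0, rating)  # assumption: negative values are reported as 0


for value in (0, 1.25, 3, 10, 20):
    print(value, round(performance_rating(value), 1))
```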

Step 3: Compute your Expected Rating
Take the (initial) subject rating of each player on your team and convert it to "expected TPTH" by inverting the formula in step 2. New players will come in with a rating of 10 in each subject (roughly 0.25 expected TPTH in each subject), although if a good player does not already have a rating, it may be easier to subjectively assign that player an initial rating for the sake of not artificially diluting the field. Add the expected TPTH of your teammates. Also, average the "expected TPTH" of your opponents in that subject (if you play an opponent twice, count it twice).

You are "expected to score" a share X/(X+T+O) of the TPTH in your subject, where X is your expected TPTH, T is the summed expected TPTH of your teammates, and O is the average expected TPTH of your opponents. In other words, X+T+O TPTH are expected to be scored in games involving your team, and you are expected to score X of those TPTH.

The total TPTH can be found by averaging the TPTH scored by your team and the TPTH scored by your opponents (it seems easier and better, especially if you're comparing across mirror sites, to average things over the whole tournament). You should then have scored (Total TPTH)*(X/(X+T+O)) TPTH over the course of the tournament. Convert this to your Expected Rating using the formula in step 2.
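
A rough Python sketch of Step 3, under a few assumptions that are not spelled out in the post: the rating-to-TPTH conversion is taken to be the algebraic inverse of the Step 2 formula, O is taken to be the per-game average of each opposing team's summed expected TPTH, and all names are hypothetical.

```python
import math


def rating_from_tpth(tpth):
    """Step 2 formula: RATING = 100 - (1/30) * (e^(8 - 0.4*TPTH) - 1)."""
    return 100 - (math.exp(8 - 0.4 * tpth) - 1) / 30


def tpth_from_rating(rating):
    """Inverse of the Step 2 formula; valid for ratings up to 100."""
    return (8 - math.log(30 * (100 - rating) + 1)) / 0.4


def expected_rating(own, teammates, opponents_by_game, total_tpth):
    """Expected Rating for one subject.

    own: the player's pre-tournament rating
    teammates: pre-tournament ratings of the player's teammates
    opponents_by_game: one list of opposing players' ratings per game played
        (an opponent faced twice appears twice)
    total_tpth: total TPTH actually scored in the player's games
    """
    x = tpth_from_rating(own)
    t = sum(tpth_from_rating(r) for r in teammates)
    per_game = [sum(tpth_from_rating(r) for r in team) for team in opponents_by_game]
    o = sum(per_game) / len(per_game)
    share = x / (x + t + o)  # fraction of the TPTH this player is expected to score
    return rating_from_tpth(total_tpth * share)


print(round(tpth_from_rating(10), 2))  # 0.25, the new-player figure quoted above
```

With eight players all rated 10, each player's share is 1/8, so the expected rating depends only on the total TPTH scored in those games.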

Step 4: Compute your new rating
If you have no rating before the tournament, your new rating is equal to your Performance Rating.
If you have an old rating, your new rating is computed by (Old Rating) + l*q*(Performance Rating - Expected Rating), where l is an indicator of how long you've been playing (after the first n tournaments, l stabilizes) and q is an indicator of the quality and difficulty of a tournament (as quality and difficulty increase, so does q).
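
The update rule can be sketched in Python as follows. The multipliers l and q are passed in as plain numbers because their values are still open questions; the function and parameter names are hypothetical.

```python
def new_rating(old_rating, performance, expected, length_factor, quality_factor):
    """Old Rating + l*q*(Performance Rating - Expected Rating).

    length_factor is l and quality_factor is q from the post; their values
    are not specified there, so callers must supply them.
    Pass old_rating=None for a player with no prior rating.
    """
    if old_rating is None:
        return performance
    return old_rating + length_factor * quality_factor * (performance - expected)


print(new_rating(50.0, 60.0, 40.0, length_factor=0.5, quality_factor=1.0))  # 60.0
```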

Advantages:
-can track players' improvement (across subjects, even) over an entire career
-shadow effect is now exponential (and if we work with ISRs, then it's segregated by subject) rather than linear
-clearly ranks players across subjects
-rewards both depth (by answering faster than other players) and breadth (by answering questions that go dead in other rooms)
-with ISRs, can be used for subject side tournaments as well since each subject rating is independent
-I think it adequately compensates for field and teammate strength; if anything, it may slightly overcompensate for playing against a stronger field (since your "expected rating" is going to be zilch if you're an average player on a strong team competing against other strong teams)
-can compute an Overall Player Rating using a weighted average of the eight subjects (for instance, weight by percent in the ACF distribution, or weight by best subjects) if such a statistic is desired
-could use ISRs as data for an "eHarmony-like matching program" for side tournaments, to be implemented in BTD

Disadvantages:
-May be quite complicated; if there is not an easy way to account for rebounds, it may be easier to compute TPTH as ((10+Hap)*TU-5*NEG)/(TUH)
-Yet to be tested, though it can't really be tested until the "who buzzed in when" statistic is finalized
-parameter estimation/discussion needed: what are ideal values of n, l, and q?
-Will probably end up requiring a lot of memory (eight subjects for each player? it may be easier to just use one subject, "overall," in an initial test)
Dwight Wynne
socalquizbowl.org
UC Irvine 2008-2013; UCLA 2004-2007; Capistrano Valley High School 2000-2003

"It's a competition, but it's not a sport. On a scale, if football is a 10, then rowing would be a two. One would be Quiz Bowl." --Matt Birk on rowing, SI On Campus, 10/21/03

"If you were my teammate, I would have tossed your ass out the door so fast you'd be emitting Cerenkov radiation, but I'm not classy like Dwight." --Jerry
Kouign Amann
Forums Staff: Moderator
Posts: 1188
Joined: Sat Dec 06, 2008 12:44 am
Location: Jersey City, NJ

Re: Announcing BEeS: A Better Electronic Stats program

Post by Kouign Amann »

^^^ THIS ^^^
Aidan Mehigan
St. Anselm's Abbey School '12
Columbia University '16 | University of Oxford '17 | UPenn GSE '19
The Goffman Prophecies
Quizbowl Detective Extraordinaire
Posts: 1611
Joined: Wed Mar 03, 2004 10:25 pm
Location: Wichita, KS

Re: Announcing BEeS: A Better Electronic Stats program

Post by The Goffman Prophecies »

Dwight Wynne you're my hero.
Dan Goff
HSQB sysadmin

Virginia Tech '13
South Carolina '15
and a couple other places
Not Thomas Dale HS

STAAATS
Oscar Girls Gone Wilde
Lulu
Posts: 9
Joined: Wed Oct 10, 2007 3:22 pm

Re: Announcing BEeS: A Better Electronic Stats program

Post by Oscar Girls Gone Wilde »

A few answers to a few questions:

I am using the Apache POI library. It is kind of a pain and requires several other Apache Java libraries, but it should enable me to open and extract the data from Office files without having Office installed on the computer to begin with.

XML export is certainly a possibility. I haven't gotten particularly far in the way of stats exporting yet, so none of those plans are set in stone. I haven't actually done any XML work, so I haven't decided on anything for certain yet, but it doesn't really make sense to allow exporting of documents meant for the web and not allow some kind of XML export.

Dwight's dream statistic may be more complicated. Within the run of a single tournament, assuming each question is marked with its category in some fashion, calculating performance within a category is not particularly difficult. The issue comes up when you start trying to extend the data outside of the current tournament. This implies that internet access will be available to collect previous stats for each player, the player's name is entered exactly the same as it was in previous tournaments, and that every player in the room will have their stats available or some way of recognizing that they have no recorded stats. My current design works best with internet access as a lot of data can be uploaded and managed by the tournament management computer automatically so long as it has some means of communicating with the reader computers, but I haven't implemented any way to talk to, say, a web database of previous tournaments. At the moment, my statistic-gathering abilities seem to be limited to atomic tournaments. I plan on generalizing things enough so that, hopefully, after the first run I might be able to do an update that handles statistics in a more general, non-tournament-specific way. The actual computation of such a statistic doesn't seem like it would be any kind of barrier and the memory required would be fairly negligible.

I haven't decided completely how protests will work, because I'm not sure how the stat tracking on protests usually works. The computer will do all of the statistics stuff itself, and the protest will be sent to the computer directing the tournament to be resolved. Functionally, a protest should work exactly as it would normally.
Sean Skaar
University of Minnesota
cvdwightw
Auron
Posts: 3291
Joined: Tue May 13, 2003 12:46 am
Location: Southern CA

Re: Announcing BEeS: A Better Electronic Stats program

Post by cvdwightw »

Oscar Girls Gone Wilde wrote:Dwight's dream statistic may be more complicated. Within the run of a single tournament, assuming each question is marked with its category in some fashion, calculating performance within a category is not particularly difficult. The issue comes up when you start trying to extend the data outside of the current tournament. This implies that internet access will be available to collect previous stats for each player, the player's name is entered exactly the same as it was in previous tournaments, and that every player in the room will have their stats available or some way of recognizing that they have no recorded stats. My current design works best with internet access as a lot of data can be uploaded and managed by the tournament management computer automatically so long as it has some means of communicating with the reader computers, but I haven't implemented any way to talk to, say, a web database of previous tournaments. At the moment, my statistic-gathering abilities seem to be limited to atomic tournaments. I plan on generalizing things enough so that, hopefully, after the first run I might be able to do an update that handles statistics in a more general, non-tournament-specific way. The actual computation of such a statistic doesn't seem like it would be any kind of barrier and the memory required would be fairly negligible.
The obvious solution to this would be to allow manual entry of each of the eight ratings for each player when names are entered into the statistics program (presumably such a thing happens at the beginning of Round 1). An online database somewhere would consist of players and their corresponding set of eight ratings, which could then be transcribed by a coach or player the night before the tournament and either sent to the TD or given at registration/beginning of Round 1; this would maybe add five minutes with a competent moderator to the beginning of each tournament, which is more than saved in elimination of other inefficiencies. Then each player would have, instead of just a name, a nine-entry vector of [name LITrating HISTrating SCIrating RMPrating ARTSrating GEOrating SSrating OTHERrating]; "new" players without ratings would probably get default 10's across the board unless it's something like Matt Weiner playing his first tournament since this functionality was implemented, in which case we'd probably allow him to "guess" his own ratings. Once these vectors are entered, computing expected TPTH should be relatively straightforward, and if the computation of everything else isn't computationally or memory intensive, that's awesome. The problem with this is that people may not be diligent enough to find their updated ratings before a tournament (I have enough faith in the community that people won't intentionally misreport their own ratings), TDs may not be diligent enough to report new ratings to whoever's running the online database, and it looks to be awfully time-consuming for whoever's running that database. Thus, no generalization would be necessary (unless Sean generalizes the program to read and update its own ratings database, and whoever's working Round 1 could just match the player to the entry in the ratings database or create a new player to be entered into the database after the tournament).
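
One way to picture the nine-entry vector is the Python sketch below. The class and field names are hypothetical; the eight subject labels and the default rating of 10 come from the paragraph above.

```python
from dataclasses import dataclass, field

SUBJECTS = ("LIT", "HIST", "SCI", "RMP", "ARTS", "GEO", "SS", "OTHER")
DEFAULT_RATING = 10.0  # starting rating for players with no history


def default_ratings():
    """A fresh subject -> rating mapping for a new player."""
    return {subject: DEFAULT_RATING for subject in SUBJECTS}


@dataclass
class PlayerRatings:
    name: str
    ratings: dict = field(default_factory=default_ratings)

    def as_vector(self):
        """[name, LIT, HIST, SCI, RMP, ARTS, GEO, SS, OTHER]."""
        return [self.name] + [self.ratings[s] for s in SUBJECTS]


newcomer = PlayerRatings("New Player")
print(len(newcomer.as_vector()))  # 9
```

A returning player would be created with a transcribed ratings dictionary instead of the defaults.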

That said, I'm ecstatic that many of my concerns are somewhat ill-founded, that at least two people other than myself think that this statistic could be cool and/or useful, and that there looks to be potential for this kind of statistic with the new program.
Dwight Wynne
socalquizbowl.org
UC Irvine 2008-2013; UCLA 2004-2007; Capistrano Valley High School 2000-2003
grapesmoker
Sin
Posts: 6345
Joined: Sat Oct 25, 2003 5:23 pm
Location: NYC

Re: Announcing BEeS: A Better Electronic Stats program

Post by grapesmoker »

Oscar Girls Gone Wilde wrote:XML export is certainly a possibility. I haven't gotten particularly far in the way of stats exporting yet, so none of those plans are set in stone. I haven't actually done any XML work, so I haven't decided on anything for certain yet, but it doesn't really make sense to allow exporting of documents meant for the web and not allow some kind of XML export.
XML exporting is easy; once you have isolated the question parts, it's trivial to write all the structure to a text file. I do this with a Perl script now.

Perhaps this is a good time to mention that I do have said Perl script available if you're interested in looking at it. It relies on the wv suite to work, so maybe it's not a good alternative to Apache POI (which I have no experience with), but it's pretty good at parsing questions once a conversion from Word to clean LaTeX is performed.
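
To illustrate the "write all the structure to a text file" step, here is a small Python sketch using the standard library. It is not the Perl script mentioned above, and the element and attribute names are made up for the example.

```python
import xml.etree.ElementTree as ET


def tossup_to_xml(category, question, answer):
    """Serialize one parsed tossup as an XML string (element names are made up)."""
    tossup = ET.Element("tossup", {"category": category})
    ET.SubElement(tossup, "question").text = question
    ET.SubElement(tossup, "answer").text = answer
    return ET.tostring(tossup, encoding="unicode")


print(tossup_to_xml("Literature", "This author wrote...", "Example Author"))
```

Special characters such as & and < in the question text are escaped by the serializer, which is the main thing hand-written string concatenation tends to get wrong.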
Jerry Vinokurov
ex-LJHS, ex-Berkeley, ex-Brown, sorta-ex-CMU
presently: John Jay College Economics
code ape, loud voice, general nuissance
SepiaOfficinalis
Lulu
Posts: 40
Joined: Thu Oct 09, 2008 3:46 pm

Re: Announcing BEeS: A Better Electronic Stats program

Post by SepiaOfficinalis »

Well, that sounds like pretty pure awesome. I think the only solution to the welter of suggestions is clearly a proprietary plugin format with some ruthless SDK, lol.
Grinnell Quizbowl 05-07
University of Minnesota Quizbowl 08-09.
T.....S...E...O
O.....O...R...L
M......D...H...M
Sima Guang Hater
Auron
Posts: 1958
Joined: Mon Feb 05, 2007 1:43 pm
Location: Nashville, TN

Re: Announcing BEeS: A Better Electronic Stats program

Post by Sima Guang Hater »

Dwight, I like your idea, but I think you're reaching a little far with the descriptive power of the statistic. The idea of doing full parameter estimation for a "universal" rating seems chock full of holes; I do support the idea of rating each player by subject on a tournament-by-tournament basis in the manner that you've suggested.

Additional suggestions:
-Can the program pool data across multiple mirrors?
-Can you include a "simulate game" feature, where you can input two teams and a packet both of them played on and see who'd win? Or simulate an entire tournament with only two teams? (I think this might have been suggested above)
Eric Mukherjee, MD PhD
Brown 2009, Penn Med 2018
Instructor/Attending Physician/Postdoctoral Fellow, Vanderbilt University Medical Center
Coach, University School of Nashville

“The next generation will always surpass the previous one. It’s one of the never-ending cycles in life.”
Support the Stevens-Johnson Syndrome Foundation
Mechanical Beasts
Banned Cheater
Posts: 5673
Joined: Thu Jun 08, 2006 10:50 pm

Re: Announcing BEeS: A Better Electronic Stats program

Post by Mechanical Beasts »

The Quest for the Historical Mukherjesus wrote: -Can you include a "simulate game" feature, where you can input two teams and a packet both of them played on and see who'd win? Or simulate an entire tournament with only two teams? (I think this might have been suggested above)
You'd have to do a lot of extrapolation to do this right. If this packet happens to have a tossup that both teams in the simulation were beaten to, you have zero buzz data to determine who would have buzzed first. You could extrapolate from, I don't know, performance in other categories or (with luck) on the bonus in that category, but that's wonky already. And that doesn't cover the problem that in the simulated game, teams might be getting very different bonuses (or simply more bonuses) than the bonuses they actually got.
Andrew Watkins
The Goffman Prophecies
Quizbowl Detective Extraordinaire
Posts: 1611
Joined: Wed Mar 03, 2004 10:25 pm
Location: Wichita, KS

Re: Announcing BEeS: A Better Electronic Stats program

Post by The Goffman Prophecies »

The Quest for the Historical Mukherjesus wrote:-Can you include a "simulate game" feature, where you can input two teams and a packet both of them played on and see who'd win?
Something about this sounds very goldfish-y.
Dan Goff
cvdwightw
Auron
Posts: 3291
Joined: Tue May 13, 2003 12:46 am
Location: Southern CA

Re: Announcing BEeS: A Better Electronic Stats program

Post by cvdwightw »

The Quest for the Historical Mukherjesus wrote:Dwight, I like your idea, but I think you're reaching a little far with the descriptive power of the statistic. The idea of doing full parameter estimation for a "universal" rating seems chock full of holes; I do support the idea of rating each player by subject on a tournament-by-tournament basis in the manner that you've suggested.
For the moment, let's assume my TPTH -> rating function of RATING = a - b*(e^(d-c*TPTH)-1) has correct, or at least feasible, values of a, b, c, and d (since RATING is an arbitrarily-defined exponential function of TPTH, and all calculations are done in TPTH before converting to an easy-to-read rating, I argue that my function is as good as any). Furthermore, it makes sense to me that the parameters n (number of tournaments before one's multiplier levels out), l (multiplier based on number of tournaments played), and q (multiplier based on difficulty and quality of the tournament) are at least constrained by the values of a, b, c, and d.

Parameter estimation of n, l, and q for a personal rating is not at all feasible at the beginning of this process (i.e. with no initial data). It's probably at least a year off, as one would need at least that much data to get a group of parameters that make sense (e.g. if BEeS is online by start of 2009-2010 season, that year's worth of tournaments would have to roughly predict people's performances at ACF Nats and CO). That said, if such a statistic were to start being calculated on a tournament-by-tournament basis once BEeS is functional, then I think that one could get at least a functional estimate of parameters through analysis of a full year or two's worth of data.

I think that ultimately, with enough data, we can perform parameter estimation for those three parameters. Once that is done (looking two or three years into the future now), we will have a reasonably accurate descriptive stat (Performance Rating = how well each player performed at the tournament itself) and predictive stat (Individualized Subject Rating = how well each player is expected to perform at the next tournament). Collecting at least a year or two's worth of tournament-by-tournament data would also give Sean and others ample time to tweak the program and figure out how to overcome the major problem with any statistic that carries over between tournaments, which is inaccurate reporting of old ratings (either that people misreport them, or that people enter names wrong or differently).
Dwight Wynne
socalquizbowl.org
UC Irvine 2008-2013; UCLA 2004-2007; Capistrano Valley High School 2000-2003
Mike Bentley
Sin
Posts: 6461
Joined: Fri Mar 31, 2006 11:03 pm
Location: Bellevue, WA

Re: Announcing BEeS: A Better Electronic Stats program

Post by Mike Bentley »

Some questions I had about the program:

How is player name entry going to work? Is there going to be some sort of matching algorithm that tries to figure out if "Mike B" is the same as "Bentley" and "Michael Bently"? Are you going to require that names be entered before the tournament starts? If so, how are you going to handle last-minute drops and adds from teams? Drops and adds during the day? The solution of using flash drives to copy this information on the morning of a tournament to all laptops at the tournament seems a bit time-consuming.

On the issue of transferring data, how easy is it going to be to transfer data when there isn't wireless connectivity? Ideally most tournaments have at least some coverage, but pretty much every tournament I've played where people have used laptops to read has had at least a few of those laptops have difficulty accessing the network at points throughout the day. Additionally, I would imagine most high schools have no wireless available. How often is it expected that the tournament database gets information from the clients? If this needs to happen frequently, it's going to add a lot of time to tournaments. Even a one-off transfer of data from 20 or so computers to a central computer would require a lunch break, I would imagine.

What happens if the program crashes in the middle of the round? Is it going to be able to reliably store backups as a round progresses?

I think there were some other issues that people came up with when we talked about this in the IRC last week, but I don't think I logged those conversations. I'll make sure to post them if I think of them.
Mike Bentley
Treasurer, Partnership for Academic Competition Excellence
Adviser, Quizbowl Team at University of Washington
University of Maryland, Class of 2008
theMoMA
Forums Staff: Administrator
Posts: 5993
Joined: Mon Oct 23, 2006 2:00 am

Re: Announcing BEeS: A Better Electronic Stats program

Post by theMoMA »

Thanks for posting, Mike. Sean and I have actually considered all of those issues and are trying to create a program that works around them. One of our biggest goals is to sniff out potential problems and eliminate them before they cause issues at actual tournaments.
Andrew Hart
Minnesota alum
AlphaQuizBowler
Tidus
Posts: 695
Joined: Mon Dec 03, 2007 6:31 pm
Location: Alpharetta, GA

Re: Announcing BEeS: A Better Electronic Stats program

Post by AlphaQuizBowler »

So is this still going to be used at MUT this weekend?
William
Alpharetta High School '11
Harvard '15
theMoMA
Forums Staff: Administrator
Posts: 5993
Joined: Mon Oct 23, 2006 2:00 am

Re: Announcing BEeS: A Better Electronic Stats program

Post by theMoMA »

AlphaQuizBowler wrote:So is this still going to be used at MUT this weekend?
Sean hasn't had time to do his science on this yet, so it will debut at a later date.
Andrew Hart
Minnesota alum
theMoMA
Forums Staff: Administrator
Posts: 5993
Joined: Mon Oct 23, 2006 2:00 am

Re: Announcing BEeS: A Better Electronic Stats program

Post by theMoMA »

A status update: Sean has a lot of commitments this semester, and the current phase of the project is something that he needs to learn from scratch, so that's what's causing the current delay. We expect to have a working program during the summer (we'll certainly be seeking betatesters at some point), and we'll try to have the finished product available by the start of fall semester. Thanks for your patience, and I'll let you know more when I do.
Andrew Hart
Minnesota alum
Auks Ran Ova
Forums Staff: Chief Administrator
Posts: 4295
Joined: Sun Apr 30, 2006 10:28 pm
Location: Minneapolis

Re: Announcing BEeS: A Better Electronic Stats program

Post by Auks Ran Ova »

theMoMA wrote:beeeeeeeeeeeeeeeeeeeeeeeeetatesters
Rob Carson
University of Minnesota '11, MCTC '??, BHSU forever
Member, ACF
Member emeritus, PACE
Writer and Editor, NAQT