(1) I perceive a problem with a particular question, topic, or clue.
(2) I searched a packet database and got n hits.
(3) n is a problematic number.
(4) Therefore, sufficient evidence exists that the problem is as I say it is.
To make it slightly more concrete, I have seen a rise in claims such as the following:
- “I heard a tossup on an appropriate answer line that used <CLUE> towards the middle last weekend. I have only seen one instance of <CLUE> ever being used before, and it was earlier, at a harder tournament. Therefore, that clue seems really hard and probably should have been earlier.”
- “I believe that <TOPIC> is important and needs to come up more. I searched Quinterest and only saw it come up a single-digit number of times. Therefore, it is clear that <TOPIC> doesn’t come up enough relative to how important I believe it is.”
- “This question used a leadin clue from a text which has never been asked about in quizbowl before. That fact alone makes the current leadin too hard for this difficulty level.”
- “Eh, <TOPIC> isn’t actually too hard for this difficulty level. I searched a packet archive and it has come up n times before, so competent teams had a chance to learn it and it's probably fine.”
Database searches do not determine a topic’s difficulty
One can get a rough idea of how previous editors treated difficulty by searching a database, but it’s important to bring outside reasoning and real-world exposure into any judgment of how hard something actually is. Some examples of cases where a database-only search fails to provide an accurate impression:
- A topic has been used twelve times before in high school quizbowl and it was too difficult every time. Reusing that topic in the same way would propagate error forward and result in low conversion rates.
- There are still unasked or underasked topics which are genuinely easy despite rarely, if ever, having come up before. E.g.: a literary bestseller just broke onto the scene, or a scientific breakthrough has just hit the news. It would have been impossible for a clue from Donna Tartt’s The Goldfinch to come up in packets before late 2013, since the book wasn’t published until then. Nonetheless, such a clue was used at the 2014 NSC (and converted) due to the book’s rapid sales and apparent literary merit.
- A question writer who has actual experience with a field deliberately picks a never-before-used clue based on its repeated use and importance in classes they’ve taken, scholarship they’ve read, lab work they’ve done, a show they’ve seen, etc. Attempts to test outside-quizbowl learning through fresh early clues will look artificially difficult if one’s only reference point is past packets, even when such clues are instantly recognizable to people who know the material.
- A topic was “trending” / in the midst of a “canon bubble” several years ago, coming up a lot in a short window, but has since faded from popularity and has appeared a lot less in sets from the past year or two. This makes it less likely that newer players will have internalized that topic from quizbowl exposure, even if a prior generation of players all had to know that topic cold to stay competitive.
As of now, database searches can’t accurately determine a topic’s frequency
As Joelle alluded to in the thread about GLBTQ topics in quizbowl (in which I agree with Vasa, Casey, Alex, and Colin, as I hope some of my past question writing shows), the current options for searchable online databases just aren’t very big. A huge number of sets aren’t on Quinterest yet, and hardly any topic appears more than ten times within that database. What’s more, searches can easily be thrown off by simple mishaps: if a topic has multiple spellings, or appears in some packets in the original language and in others in English translation, the hit counts can shift substantially, even if one is careful.
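To make that last point concrete, here is a minimal sketch in Python, run against a made-up three-line “corpus” rather than Quinterest’s actual data or interface, showing how transliteration variants alone change a hit count:

```python
# Hypothetical mini-corpus standing in for a packet database; the point is
# only that each transliteration of the same name matches different lines.
packets = [
    "This author of Crime and Punishment also wrote The Idiot.",
    "Fyodor Dostoevsky created the Underground Man.",
    "Dostoyevsky's Grand Inquisitor appears in The Brothers Karamazov.",
]

variants = ["Dostoevsky", "Dostoyevsky"]  # common transliteration variants

# Searching one spelling at a time undercounts the topic's real frequency.
for v in variants:
    hits = sum(v.lower() in line.lower() for line in packets)
    print(f"{v}: {hits} hit(s)")

# A more careful search unions all known variants before counting.
combined = sum(
    any(v.lower() in line.lower() for v in variants) for line in packets
)
print(f"any variant: {combined} hit(s)")
```

Each single-spelling search here returns one hit while the topic actually appears twice; a careful searcher has to union every spelling they can think of, and even then can miss translations.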
When new data becomes available from tools such as Quinterest, it’s important to ensure that we don’t draw large mistaken conclusions from small unrepresentative samples, and it’s even more important to ensure that our perceptions of appropriate difficulty, or adequate representation, or current trends in “the canon” don’t get warped through the overuse of a skewed sample.
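To put a rough number on how little a small hit count pins down, here is a back-of-envelope sketch; the 10% coverage figure and the interval formula are illustrative assumptions of mine, not real Quinterest statistics:

```python
# A hypothetical illustration of why small hit counts support only weak
# conclusions. Assume the database indexes some fraction of all sets (the
# 10% figure below is invented for illustration). A topic seen k times in
# the indexed portion implies a "true" count of roughly k / coverage, but
# the statistical noise on k itself dwarfs that estimate when k is small.
import math

def poisson_ci(k: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% interval for a Poisson count k (square-root scale)."""
    lo = max(0.0, (math.sqrt(k) - z / 2) ** 2)
    hi = (math.sqrt(k) + z / 2) ** 2
    return lo, hi

coverage = 0.10  # invented: fraction of all sets the database indexes
for k in (1, 3, 10):
    lo, hi = poisson_ci(k)
    print(f"{k} hits -> plausible true count: "
          f"{lo / coverage:.0f} to {hi / coverage:.0f}")
```

Under these made-up assumptions, a single hit is consistent with anywhere from zero to roughly forty real appearances, which is to say it tells you almost nothing on its own.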
What’s more, even if a searchable database of every publicly accessible question set existed, it would not include a single NAQT set. At least in high school quizbowl, NAQT is by far the dominant producer of questions, and has an outsize role in setting the difficulty and frequency of question topics through its gathering of conversion data. As long as NAQT sets remain for sale rather than publicly available, any attempt to search ALL the things has to be marked with a small asterisk.
Ways to develop a keener sense for difficulty than just database search
(or: how to reinforce those database claims which are worthwhile and correct)
This post is not so secretly a roundabout way of saying the following: If a person wants to make useful constructive points in tournament discussion, they need to develop a sense of difficulty along multiple axes, not just from looking back at old question sets. I have a couple of suggestions of things I’ve done towards that end.
I will go ahead and note, just for the sake of full disclosure: I do think that looking through old sets to see how clues were used or worded in past tournaments is one useful method that editors can use to make their own questions better. But until databases are actually more comprehensive, I strongly recommend doing this by maintaining a large packet archive of every available set on one’s own hard drive, updating it as new sets are publicly released.
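For what it’s worth, once such an archive exists, searching it doesn’t require a web tool at all. Here is a rough sketch of the kind of script I mean, under the hypothetical assumption that one has already converted the sets to plain-text files in a single directory (real archives mix .doc, .docx, and .pdf and would need a conversion pass first):

```python
# A rough sketch for grepping a personal packet archive. The directory path
# and the assumption that packets are plain .txt files are hypothetical.
from pathlib import Path

ARCHIVE = Path("~/quizbowl/packets").expanduser()  # hypothetical location

def search_archive(term: str) -> list[tuple[str, str]]:
    """Return (filename, line) pairs containing the term, case-insensitively."""
    term = term.lower()
    results = []
    for path in sorted(ARCHIVE.rglob("*.txt")):
        for line in path.read_text(errors="ignore").splitlines():
            if term in line.lower():
                results.append((path.name, line.strip()))
    return results

if __name__ == "__main__":
    for name, line in search_archive("Goldfinch"):
        print(f"{name}: {line[:100]}")
```

The advantage of keeping it local is that you control coverage: the search is exactly as complete as the archive you maintain, and you know precisely what it does and doesn’t include.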
Another good way to improve one’s judgments of difficulty is to moderate for real teams as a tournament staffer. Paying attention to where mid-level and weaker teams buzz as you read is a great way to get a sense of how well teams actually do on questions, especially if you can staff often. This is particularly important if you have written for or edited a set and bear ultimate responsibility for how it actually plays out.
Lastly, it’s always a good idea to draw on genuine exposure to and investigation of academic material if one has it, or try to gain some such exposure if one doesn’t have it.
To conclude: The next time a question feels off to you, try to go beyond just a quick search of old packets; doing so will make for a more compelling argument and do more to help you develop a sense of appropriate difficulty for your own writing projects. Happy discussing, everyone.