Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Sun Sep 10, 2017 12:51 pm

I'm very happy to announce the public release of QuizDB!

QuizDB is a new searchable Quizbowl database and the designated modern successor to Quinterest. I'll answer the most important question first:

Why should I use QuizDB instead?

  • A rapidly improving feature set, not limited to search: planned features include stats and a question reader, alongside smaller improvements like multiple filter selection.
  • A continually growing question dataset, already with over 110K questions and quickly improving thanks to a dedicated portal, with all Quinterest admins already switched over.
  • Mobile-first design philosophy, meaning the site and all features will function and look good on any size device. (Except maybe a 2-inch spy phone.)

If you enjoy the site, please consider donating! According to my stats, I've spent a little over 24 hours directly just typing code, and all server hosting fees are paid for out of pocket right now (edit: I have since received initial funding from PACE. Thank you!). And of course, if you have any feedback please post in this thread or PM me through my forum account :)

Here's a quick list of the best features so far:
  • A compiled Resources guide, linking to all the best resources needed to get a new player hitting the ground running.
  • Multiple filter selection, meaning you can now do things as crazy as search for Science AND Literature at the same time! (I know, right?)
  • Native mobile app: if you're visiting QuizDB on certain mobile browsers, you can select the "Add to Home Screen" option to create a native app! When offline questions arrive, that means you can use QuizDB features without internet!
  • Quick search links: all questions have embedded links to search Google, Wikipedia, etc., so you won't even have to laboriously open a new tab!
  • In-question error reporting, so that you(!) can help us improve question coverage!

If you'd like to see future features, you can visit this page. A drive-by list: stats, Moxon, advanced search, offline questions, starred questions, personal site preferences, woah!

Thank you for checking my site out!

<QuizDB is sponsored by PACE. Thank you for your support!>
Last edited by UlyssesInvictus on Sun Sep 10, 2017 6:34 pm, edited 1 time in total.
Raynor Kuang
quizdb.org
Harvard 2017, TJHSST 2013
Ex-Writer for NAQT
User avatar
UlyssesInvictus
Tidus
 
Posts: 507
Joined: Thu Feb 10, 2011 7:38 pm

Re: Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Sun Sep 10, 2017 12:51 pm

Updates

  • 9/10/17: 1.0.1. Updated site meta-info.
  • 9/12/17: 1.0.2. Minor bug fixes.
  • 9/12/17: 1.1.0. Stats page! More minor bug fixes and feature improvements.
  • 9/17/17: 1.1.1. More encoding fixes and appearance improvements.
  • 9/23/17: 1.2.0. Offline/cached settings + more minor appearance improvements.
Last edited by UlyssesInvictus on Sat Sep 23, 2017 8:17 pm, edited 4 times in total.

Re: Announcing QuizDB: "Knowledge is Power"

Postby vinteuil » Sun Sep 10, 2017 2:56 pm

Hey, just popping in here to say that I've been super excited by QuizDB, which looks and works a dream (as both a user and an admin), and Raynor has been fantastically responsive to suggestions as well as putting in a huge coding effort. It's great to know that the Quinterest DB is in a more sustainable and even better setting now.
Jacob Reed
Yale '19
East Chapel Hill '13
"...distant bayings from the musicological mafia"―Denis Stevens
User avatar
vinteuil
Yuna
 
Posts: 980
Joined: Sun Oct 23, 2011 12:31 pm

Re: Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Sun Sep 10, 2017 5:48 pm

Thanks for the praise Jacob! I also want to thank Jacob here for helping me migrate all of Quinterest's questions over to QuizDB. In addition, I'd like to thank Rohit (Quinterest creator) and Jerry (QBDB creator) for their feedback while creating QuizDB, as well as all of the admins who were instrumental in testing during the beta period or who have already signed up since I've announced the site. Thank you!

Re: Announcing QuizDB: "Knowledge is Power"

Postby Neggman » Sun Sep 10, 2017 6:08 pm

Hey Raynor this looks awesome!

I know nothing about coding, so I'm really not sure how simple this is, but the only suggestion I would have is to have the "wikipedia search" button search only for the actual answer line and not the entirety of the answer line (including prompts and whatnot). Having it search the entire thing leads to things like "Great Wall of China [or Wanli Changcheng; or Wan-li Ch'ang-ch'eng; prompt on "Mongolia," "Inner Mongolia," "China," "Northern China," or "Manchuria"]" in the wiki search box, which turns up nothing (I just tried). Obviously this can be resolved easily on the user's part who just has to delete all the extra stuff, but figured I would let you know.

Overall, thanks so much for this! (Also, does Venmo work for donations? I have money in Venmo but not in my account, and I'm feeling especially lazy about transferring it back and waiting a few days.)
Emmett Laurie
East Brunswick '16
Rutgers University '20
Neggman
Lulu
 
Posts: 75
Joined: Wed May 06, 2015 2:09 pm

Re: Announcing QuizDB: "Knowledge is Power"

Postby tksaleija » Sun Sep 10, 2017 6:11 pm

A beautifully-working interface mixed with high question quality AND content is an excellent way to start off the year. Great job!!

On another note, I took a look at the future projects area and noticed the Moxon question reader was a pending idea. How high on your priority list is that update?
Aleija Rodriguez
Monroe County Middle College '19
Monroe County Community College '19
User avatar
tksaleija
Wakka
 
Posts: 111
Joined: Fri Jun 30, 2017 8:27 pm

Re: Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Sun Sep 10, 2017 6:27 pm

Neggman wrote:Hey Raynor this looks awesome!

I know nothing about coding, so I'm really not sure how simple this is, but the only suggestion I would have is to have the "wikipedia search" button search only for the actual answer line and not the entirety of the answer line (including prompts and whatnot). Having it search the entire thing leads to things like "Great Wall of China [or Wanli Changcheng; or Wan-li Ch'ang-ch'eng; prompt on "Mongolia," "Inner Mongolia," "China," "Northern China," or "Manchuria"]" in the wiki search box, which turns up nothing (I just tried). Obviously this can be resolved easily on the user's part who just has to delete all the extra stuff, but figured I would let you know.

Overall, thanks so much for this! (Also, does Venmo work for donations? I have money in Venmo but not in my account, and I'm feeling especially lazy about transferring it back and waiting a few days.)


Okay, first of all, thank you so much for being willing to donate! Yes, Venmo works great, and if you can't find me on Venmo, feel free to PM me so I can give you the details. (Just trying to avoid spreading the username on the public internet.) Anything is super helpful :)

With respect to the Wikipedia thing, this is something I've thought about! This falls under the general header of "figuring out what the answerline really is," which is both easy and hard. (I'd also been thinking about this for Moxon stuff, where it'll be useful for the parser to figure out if your answer is right.)

The easy way to do this is to just parse the specified answerline when there's formatting (e.g. bold, underline) present. This is very easy but depends on humans being good about formatting. I'll probably push an update with this within the week. (And as a side note, the best way for people to get better coverage here is to sign up as admins or report errors so admins can track down these questions :)

The hard way is something similar to how Protobowl does it, which is tokenization of answerlines; this is what will be used as a fallback when the parser can't detect a manually specified answerline. Having taken a brief look at the code, Protobowl has a semi-sophisticated, but not super advanced, method for doing this, which I'll probably borrow as a good start--if Protobowl is okay with that--but will want to supersede with some kind of Natural Language Processing-based approach later. This will be discussed more when Moxon is a thing :) (see next post)
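
To make the "easy way" and its fallback concrete, here is a minimal sketch in Python. It is not QuizDB's or Protobowl's code: the function name `main_answer`, the choice of `<b>`/`<u>`/`<strong>` as the formatting tags, and the rule of dropping bracketed or parenthesized alternates are all assumptions made for illustration.

```python
import re

# Assumed markup: the required part of the answer is wrapped in bold/underline tags.
_FORMATTED = re.compile(r"<(b|u|strong)>(.*?)</\1>", re.IGNORECASE | re.DOTALL)
# Bracketed or parenthesized alternates such as "[or ...; prompt on ...]".
_ALTERNATES = re.compile(r"\[[^\]]*\]|\([^)]*\)")
# Any remaining HTML tag.
_TAGS = re.compile(r"<[^>]+>")


def main_answer(answerline: str) -> str:
    """Return a best guess at the primary answer in an answerline."""
    # Drop alternates and prompts first, so formatting inside them is ignored.
    head = _ALTERNATES.sub(" ", answerline)
    # Easy way: trust the writer's formatting when it is present.
    marked = [_TAGS.sub("", m.group(2)) for m in _FORMATTED.finditer(head)]
    marked_text = " ".join(" ".join(marked).split())
    if marked_text:
        return marked_text
    # Fallback: plain text with tags removed and whitespace collapsed.
    return " ".join(_TAGS.sub("", head).split())
```

With the Great Wall example from the quoted post, the fallback path keeps only the text before the bracket, which is what a Wikipedia search box would want. A real tokenizer would need to handle far messier answerlines than this.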
Last edited by UlyssesInvictus on Sun Sep 10, 2017 6:32 pm, edited 1 time in total.

Re: Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Sun Sep 10, 2017 6:31 pm

tksaleija wrote:A beautifully-working interface mixed with high question quality AND content is an excellent way to start off the year. Great job!!

On another note, I took a look at the future projects area and noticed the Moxon question reader was a pending idea. How high on your priority list is that update?


It's moving up very quickly! I initially didn't really want to do it, but conversations with people who wanted it and the QANTA people about some cool ideas involving an actual AI (the name is a hint toward that :P) have increased my desire for making it.

Nevertheless, it'll still take a while, since I want to do it well. Here's the release schedule I'm currently planning:

- November 1: Basic, solo parsing/playing. Similar to what's on QuizBug right now.
- (sometime in between): Integrated study tools, also similar to what's on QuizBug right now.
- January 1: AI play!
- June 1: Multiplayer play.

It's all a little up in the air since there are other features I also want to work on at the same time and have to balance (and, you know, my day job). The reason the multiplayer stuff is coming out so much later than everything else is that (1) it's actually some of the web tech I'm least familiar with, so I'm saving time to research it, and (2) a lot of the initial backlash against Protobowl was because of bad community moderation, so I'm going to wait until I'm certain that won't be an issue.

As much as people have ragged on Protobowl at times, the code is actually quite good and advanced, so I'm going to try hard to meet their standards and produce something that's both fun to use and a sophisticated study tool.

So tl;dr: stay tuned!

Re: Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Sun Sep 10, 2017 9:30 pm

Just pushed a quick set of hotfixes for various issues I noticed. In the future I'll post these in the updates post (2nd from the top), unless they're major features, in which case I'll probably bring attention to them here.

Re: Announcing QuizDB: "Knowledge is Power"

Postby wcheng » Mon Sep 11, 2017 8:09 am

One minor issue: it looks like some characters with diacritics aren't displaying properly right now. Besides that, though, this looks like an excellent resource and improvement on Quinterest!
Weijia Cheng
Centennial '15
Maryland '18 (Fall)
User avatar
wcheng
Lulu
 
Posts: 70
Joined: Mon May 26, 2014 12:02 pm

Re: Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Mon Sep 11, 2017 9:34 am

The diacritics and other encoding issues are annoying since they actually come directly from the Quinterest database i.e. they're formatted that way in the original information. (I suspect that when people were pasting to Quinterest from PDFs, the copy/paste formatter just guessed on the character instead of actually knowing what it was.)

I'm writing fixes that silently replace known broken special characters with what they should be, so post here if you see more! (I already know about the random Angstrom symbols and some f's being Chinese characters instead.)

Re: Announcing QuizDB: "Knowledge is Power"

Postby tksaleija » Mon Sep 11, 2017 8:46 pm

Overall, I generally like the flashcard txt export but I've noticed that it always exports every question found under the search criteria, even if you only want a limited amount. By this, I mean that when trying to download the set of 10 hs history questions shown, it instead downloads all of the hs history questions in QuizDB. Not a huge issue, but I figured it important to mention in case someone wants to print off small packets.

Re: Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Mon Sep 11, 2017 8:49 pm

Oh, I did that on purpose because I figured there'd never be a case where someone wanted to restrict how many they were downloading. Hmm...

I'll think about a compromise. The reason I'm not just having the download size match the number of search results shown is that you'd then have to load all the search results, and that can actually crash your webpage for queries that are too big. (Too many things on the page.) Until I get a paging system implemented, downloading all the text and then having the user pare it down appropriately might be the compromise.
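
For readers unfamiliar with paging, the idea is that only one slice of the results is ever loaded, so an export could match exactly what is on screen. A minimal sketch in Python, with hypothetical helper names `page_of` and `page_count` (nothing here is taken from QuizDB's code):

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def page_of(results: Sequence[T], page: int, per_page: int = 10) -> List[T]:
    """Return the 1-indexed `page` of `results`, `per_page` items at a time."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    start = (page - 1) * per_page
    # Slicing past the end simply yields a shorter (or empty) page.
    return list(results[start:start + per_page])


def page_count(total: int, per_page: int = 10) -> int:
    """Number of pages needed to show `total` results (ceiling division)."""
    return -(-total // per_page)
```

In a real site the slice would be taken by the database query rather than in memory, but the arithmetic is the same.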

Re: Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Tue Sep 12, 2017 2:11 am

About to push a big new feature: the Stats page! This page is pretty experimental with regard to most of the actual data inside of it, but IMO it's very pretty currently. (And should stay pretty even on mobile: let me know if it's not.)

I'll also be pushing multiple minor new features and bug fixes. The full changelog can be found by parsing the Git log.

As part of the push, the website may become sluggish or even go down for the next hour (around 2-3 AM Tuesday morning) or so. If it's still being weird after that, let me know.

Re: Announcing QuizDB: "Knowledge is Power"

Postby tksaleija » Tue Sep 12, 2017 7:20 am

I am really impressed with the stats page as a whole. Not only does it give you an idea of how common a topic is, but also shows you what phrases come up most often (really good for narrowing down what to study). Overall, great stats page, flows well, and another great addition to an already good site.

Re: Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Tue Sep 12, 2017 10:25 am

Account creation is currently borked--will try to fix it at lunch.

Also, because of some javascript fooey (service workers, if anyone cares for the technical minutiae), you'll either have to hard refresh or use incognito to see updates to the web portal. Eventually your browser should adapt and load updates regularly for you, but I'm working on a better system.

Re: Announcing QuizDB: "Knowledge is Power"

Postby Strongside » Fri Sep 22, 2017 4:14 pm

A few QuizDB items.

1. Some tournaments (specifically 2016 and 2017 ACF Nationals) have numerous questions with messed up characters. Various combinations including the letter f (fi, ft, ff) were garbled when they were entered on Quinterest.

2. These tournaments are on Quinterest but not yet on QuizDB, because they were added after the beta version of QuizDB was released. The plan is for them to get added to QuizDB eventually.

2014 Chicago Open History
2010 Spring Offensive
2013 Urgent Call For Unity
2012 Chicago Open History

2008 Minnesota Open Literature
2007 Illinois Open Literature
2007 Chicago Open Literature

2017 Thought Monstrosity
2012 Questions Concerning Technology

2016 AVOGADROS NUMBER
2016 BONGOS
2014 Lederberg 2

3. It currently takes longer to enter a packet of questions on QuizDB than on Quinterest. I know Raynor is planning to make some changes to make entering questions faster.

4. Submitted errors have been corrected on a regular basis.
Brendan Byrne

Drake University, 2006-2008
University of Minnesota, 2008-2010
Strongside
Rikku
 
Posts: 473
Joined: Thu Apr 14, 2005 8:03 pm

Re: Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Fri Sep 22, 2017 7:19 pm

Just a note on the encoding problem (i.e. the one that's especially egregious with "f" and Chinese chars appearing instead):

I've started using a strategy of quietly replacing these on the client side before they appear to users, rather than bulk updating them repeatedly in the database. I'll probably collect all the cases I know about and then go through at the end of each month and fix them all.

So admins will still see them happening in the Admin portal, but they're going to be OK on the user end. It's still important to report them as errors, though, in case it's a case I haven't seen before.

(There are a lot of permutations of ligatures and Chinese, it seems.)
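
A display-time replacement table of this kind can be sketched in a few lines of Python. The thread doesn't list the actual broken characters, so the entries in `KNOWN_FIXES` below are placeholders (the Unicode ligature code points for "fi" and "ff"), and the function name `clean_for_display` is hypothetical.

```python
# Placeholder entries: broken sequence as stored -> intended text.
# The real table would hold whatever garbled sequences users report.
KNOWN_FIXES = {
    "\ufb01": "fi",  # LATIN SMALL LIGATURE FI
    "\ufb00": "ff",  # LATIN SMALL LIGATURE FF
}


def clean_for_display(text: str, fixes: dict = KNOWN_FIXES) -> str:
    """Return `text` with every known broken sequence replaced."""
    # Replace longer keys first so a multi-character sequence is not
    # partially rewritten by a shorter key it happens to contain.
    for broken in sorted(fixes, key=len, reverse=True):
        text = text.replace(broken, fixes[broken])
    return text
```

Because the stored text is left untouched, a wrong entry in the table can be corrected later without any database migration, which fits the "collect cases, bulk fix monthly" plan described above.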

Re: Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Sat Sep 23, 2017 8:16 pm

Just pushed a new update to QuizDB that should (hopefully, it was somewhat hard to test in development) deal with the issue of old code not being refreshed when updates are released...starting with updates after this one.

For now, if you're having issues with the page loading, you may need to do a hard refresh (shift-f5 in most browsers).

It also adds the beginning idea of locally cached settings, which you can find from the sidebar.

I'm going to move on to doing some parsing work on getting more questions in the DB, and then spending considerable time finishing editing for HFT, so this will probably be the last public website update for a while.

After that I'll get started on the reader everyone wants so bad...

Re: Announcing QuizDB: "Knowledge is Power"

Postby dablasian » Sun Sep 24, 2017 10:47 am

One small bug I found is that the two graphs that are displayed on the stats page now are displayed in a vertical layout, instead of a horizontal layout. This causes the graphs to overlap with the keyphrases section of the stats page. The below image shows this small bug.

[Attachment: QuizDB Bug.png]
Adithyan Sujithkumar
Oak Ridge '18
dablasian
Lulu
 
Posts: 20
Joined: Tue Sep 05, 2017 11:05 am

Re: Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Sun Sep 24, 2017 6:51 pm

whooops, was testing out some appearance changes and totally forgot to revert them

i'll push a fix later tonight

thanks for reporting it!

Re: Announcing QuizDB: "Knowledge is Power"

Postby Mike Bentley » Mon Sep 25, 2017 9:04 pm

I wrote a little script that takes the JSON output from QuizDB bonuses and imports them into Anki (so long as you use the Anki Json Importer plugin available at https://github.com/fxxing/anki-json-importer). If anyone's interested I can get this converted into something consumable.
Mike Bentley
Treasurer, Partnership for Academic Competition Excellence
Adviser, Quizbowl Team at University of Washington
University of Maryland, Class of 2008
User avatar
Mike Bentley
Auron
 
Posts: 5311
Joined: Fri Mar 31, 2006 11:03 pm
Location: Bellevue, WA

Re: Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Mon Sep 25, 2017 11:13 pm

Mike Bentley wrote:I wrote a little script that takes the json output from QuizDB bonuses and imports them into Anki (so long as you use the Anki Json Importer plugin available at https://github.com/fxxing/anki-json-importer). If anyone's interested i can get this converted into something consumable.


I'd love to see this, if only so I can know how to better tool the code to fit people's use cases! So nothing consumable necessary, but a link to the code would be cool.

Re: Announcing QuizDB: "Knowledge is Power"

Postby dablasian » Tue Sep 26, 2017 1:11 pm

Do you know if the CSV export update will occur soon or should it be expected post-HFT editing?

Re: Announcing QuizDB: "Knowledge is Power"

Postby Toystory (bull) » Tue Sep 26, 2017 2:31 pm

Mike Bentley wrote:I wrote a little script that takes the json output from QuizDB bonuses and imports them into Anki (so long as you use the Anki Json Importer plugin available at https://github.com/fxxing/anki-json-importer). If anyone's interested i can get this converted into something consumable.


That would be an awesome thing to use.
Rishik Hombal
Hoover HS 2014-18
Toystory (bull)
Lulu
 
Posts: 30
Joined: Sun Feb 12, 2017 11:27 pm

Re: Announcing QuizDB: "Knowledge is Power"

Postby Progcon » Tue Sep 26, 2017 4:26 pm

This is a great database and I love the statistics. That was a great idea.

One bug I have noticed is that when you add a subcategory to a category list and then click random X questions, you only get questions in the subcategory you added. For example, if you put Literature and Fine Arts as your categories, then added the "Visual" subcategory, all of the resulting questions from, say, 10 random questions would be "Fine Arts Visual".

A minor issue: it's still a great website already!
Harris Bunker
Grosse Pointe North High School '15
Michigan State University 2015-
MSU Academic Competition Club President 2016-
User avatar
Progcon
Rikku
 
Posts: 300
Joined: Fri Dec 20, 2013 8:24 pm
Location: Zürich, CH

Re: Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Tue Sep 26, 2017 4:39 pm

Progcon wrote:This is a great database and I love the statistics. That was a great idea.

One bug I have noticed is that when you add a subcategory to a category list and then click random X questions, you only get questions in the subcategory you added. For example, if you put Literature and Fine Arts as your categories, then added the "Visual" subcategory, all of the resulting questions from, say, 10 random questions would be "Fine Arts Visual".

A minor issue: it's still a great website already!


That's by design, currently: the subcategory always supersedes the category. I can have the category supersede the subcategory, but I want to be a little mean and just enforce user obedience for the sake of simple code. Against every modern design principle, I know, but at least I'm honest about it >:)

Although, I'll probably come up with some sort of better solution so in this particular case you don't have to list every literature subcategory just to scope the fine arts subcategory

edit: fixed this to work the way you expect it to; expect a release at the end of the week (on that note, people can in general expect releases to be during weekends)
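
The "expected" behavior can be stated as a small rule: a selected subcategory narrows only its own parent category, and any category with no selected subcategories still matches in full. Here is a sketch of that rule in Python; the `PARENT` map, the question field names, and the function `matches` are assumptions for illustration, not QuizDB's schema or code.

```python
from typing import Dict, Set

# Hypothetical subcategory -> parent category map.
PARENT: Dict[str, str] = {
    "Fine Arts Visual": "Fine Arts",
    "Fine Arts Auditory": "Fine Arts",
    "Literature American": "Literature",
    "Literature European": "Literature",
}


def matches(question: dict, categories: Set[str], subcategories: Set[str]) -> bool:
    """Decide whether a question passes the category/subcategory filters."""
    # Group the selected subcategories under their parent categories.
    narrowed: Dict[str, Set[str]] = {}
    for sub in subcategories:
        narrowed.setdefault(PARENT[sub], set()).add(sub)

    category = question["category"]
    if category in narrowed:
        # This category was narrowed: only its selected subcategories pass.
        return question["subcategory"] in narrowed[category]
    # Not narrowed: the whole category passes if it was selected.
    return category in categories
```

Under this rule, selecting Literature and Fine Arts plus "Fine Arts Visual" returns every Literature question and only the visual Fine Arts ones, without listing each Literature subcategory.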
Last edited by UlyssesInvictus on Wed Sep 27, 2017 1:36 am, edited 1 time in total.

Re: Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Tue Sep 26, 2017 4:42 pm

dablasian wrote:Do you know if the CSV export update will occur soon or should it be expected post-HFT editing?


I might actually just scrap that feature: most JSON can be easily converted to CSV these days, so I might just push the responsibility for doing the conversion onto the user.

I'll probably just leave it as a WIP feature on the website indefinitely until the mood to do it hits me some day, and it'll be the in-joke of planned QuizDB features :shrug:
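
For anyone who wants to do that conversion themselves, a minimal sketch using only Python's standard library follows. It assumes the export is a JSON array of flat objects; the field names `id`, `text`, and `answer` are hypothetical, since the thread doesn't describe QuizDB's actual export schema.

```python
import csv
import io
import json
from typing import Sequence


def questions_json_to_csv(json_text: str,
                          fields: Sequence[str] = ("id", "text", "answer")) -> str:
    """Convert a JSON array of question objects into CSV text."""
    records = json.loads(json_text)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(fields), lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Keep only the requested fields; missing ones become empty cells.
        writer.writerow({field: record.get(field, "") for field in fields})
    return out.getvalue()
```

The `csv` module takes care of quoting, so question text containing commas or quotation marks stays in one cell. Nested objects in the export would need flattening first.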

Re: Announcing QuizDB: "Knowledge is Power"

Postby Mike Bentley » Sun Oct 01, 2017 6:34 pm

I haven't had a chance to make this a web app, but if you're on Windows you can use this to generate Anki cards from QuizDB exports: http://www.doc-ent.com/quizdb/publish.htm

Instructions are in the program.

Re: Announcing QuizDB: "Knowledge is Power"

Postby acz13 » Sun Oct 08, 2017 2:48 pm

PDF parsing is still screwed up? Almost all of the questions have weird formatting glitches and stuff...
A friendly Noivern

Albert Zhang

State College 2016-present (club member. novice ish.)
User avatar
acz13
Lulu
 
Posts: 7
Joined: Tue Oct 18, 2016 5:52 pm

Re: Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Sun Oct 08, 2017 5:05 pm

acz13 wrote:PDF parsing is still screwed up? Almost all of the questions have weird formatting glitches and stuff...


The parsing happened once, back when Quinterest was originally filled with questions by admins porting questions from the PDFs (with copy-paste) to Quinterest.

QuizDB just uses that data -- it doesn't reparse the data from PDFs every time you search.

The best way to get at all these encoding errors is to just report them one-by-one, and we can gradually go through and figure out what characters they're supposed to be.

Once I finish writing my parser, I can also potentially just re-parse all those questions again, but the data is fairly acceptable already, so I don't really want to go through the risky process of reparsing everything and potentially introducing new errors.

Re: Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Mon Oct 09, 2017 7:36 pm

Happy to announce that Augur, QuizDB's in-house packet parser, is now complete and available here*: https://github.com/UlyssesInvictus/QuizDB/tree/master/lib/parser/augur

*With perhaps some shoddy and incomplete documentation, but I don't expect anyone to really need to use it soon.

Augur is capable of parsing the most common quizbowl packet formats, and gives you the ability to easily modify its parsing recognizers to handle other formats as well. As noted in other posts (here, for example), it doesn't accept PDFs due to the inherent difficulty in parsing PDFs into machine readable code, but it does great with Word files which most writers are using already. (If you have other formats, like TeX, please let me know and we can talk about how to convert them, if we haven't already.)

Augur consists of three basic moving parts:
  • The packet converter--this turns your input into machine readable format (HTML, basically)
  • The packet parser--this figures out the "QB Markup Language" in the packet and turns it into internal questions
  • The categorizer.

The categorizer did surprisingly well, achieving results of 90%+ and 80%+ for categories and subcategories respectively during training; I believe the true figures are actually higher, since some of the test data is likely mislabeled. It's very slow right now, because I train it on the entire QuizDB corpus, but I have a feeling it could use a much smaller training set and still do very well. I see a lot of potential here, as the categorizer basically works on any question text that it's fed. (Really the only annoying part is the high training time, but it's only about 10 minutes per training session, and then you can fit as much data as you want at once.)
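
The post doesn't say which algorithm Augur's categorizer uses. As a stand-in, here is a minimal bag-of-words naive Bayes text classifier in pure Python, which shows the general shape of "train on labeled question text, then predict a category for new text". The class and function names are hypothetical and this is not Augur's implementation.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesCategorizer:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> training questions
        self.vocab = set()

    def train(self, examples):
        """`examples` is an iterable of (question_text, category) pairs."""
        for text, category in examples:
            words = tokenize(text)
            self.word_counts[category].update(words)
            self.doc_counts[category] += 1
            self.vocab.update(words)

    def predict(self, text):
        """Return the most probable category, or None if untrained."""
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each word.
            score = math.log(n_docs / total_docs)
            for word in words:
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best, best_score = category, score
        return best
```

Training here is a single counting pass, so a model like this would train far faster than the ten minutes mentioned above; the trade-off is that it ignores word order and would likely score lower than a more sophisticated approach.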

On a related note, there will be some slowness in the site tonight as I periodically bulk upload as many Word packets as I can find. I've already uploaded 2017 NASAT and 2017 Prison Bowl at the time of this post, and I'll try to upload at least 4 or 5 others tonight. There's still some slowness because I haven't fully assembly line-ified the process--I wanted to insert some human oversight during the initial trial runs--but it's already way faster than doing it by hand, and about as accurate, if not more so. (No more encoding errors! Yippee!)

One last reminder to upload your packets as Word files on the archives if you haven't already, and contact me if you have any questions / see any arising difficulties!

Re: Announcing QuizDB: "Knowledge is Power"

Postby acz13 » Tue Oct 10, 2017 7:30 pm

Oh, and also for some reason all the end power markers look like this:

(*))

for me... I don't think that's how it's supposed to look?

Re: Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Wed Oct 11, 2017 9:13 pm

acz13 wrote:Oh, and also for some reason all the end power markers look like this:

(*))

for me... I don't think that's how it's supposed to look?


Yeah...back when I first imported Quinterest's questions, I ran a batch job to make the power sections bold. Except I messed up the script and added an extra paren everywhere. It takes a good amount of time to run for every question, so once I saw the mistake, I sort of just went :capybara: it and decided to leave it for the future if it really affects people's question reading.

Re: Announcing QuizDB: "Knowledge is Power"

Postby acz13 » Fri Oct 13, 2017 8:37 pm

something like:

Code: Select all
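-- Single-quoted string literals are standard SQL; double quotes denote identifiers in e.g. PostgreSQL.
UPDATE tossups SET formatted_text = replace(formatted_text, '(*)</b>)', '(*)</b>');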


? Well, with correct field and table names of course :razz:

Re: Announcing QuizDB: "Knowledge is Power"

Postby UlyssesInvictus » Sat Oct 14, 2017 2:15 pm

acz13 wrote:something like:

Code: Select all
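-- Single-quoted string literals are standard SQL; double quotes denote identifiers in e.g. PostgreSQL.
UPDATE tossups SET formatted_text = replace(formatted_text, '(*)</b>)', '(*)</b>');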


? Well, with correct field and table names of course :razz:


yeah, i'll do it eventually, i'm just mucho lazy and busy with other things

<on that note, for example!>

In the past two weeks, I've uploaded fully formatted tossups and bonuses for the following tournaments:

  • 2015 GSAC
  • 2015 Prison Bowl
  • 2016 HFT XI
  • 2015 HFT X
  • LIST VI
  • 2015 BISB
  • 2015 Maryland Fall
  • 2014 LIST
  • 2017 Prison Bowl
  • 2017 NASAT
  • 2016 Penn Bowl
  • 2013 LIST III

which either had not been uploaded previously, did not have bonuses, or did not have proper text formatting.

I'll continue uploading the occasional tournament in my free time. More college tournaments are probably up next.

Again, please upload Word packets for existing tournaments in the database if you have them!

Postby dablasian » Sun Oct 15, 2017 2:25 pm

When I download a .json file from QuizDB, only the first 300 questions are downloaded. For example, instead of downloading all of the 460 tossups on Classical History that are college difficulty, only the first 300 tossups are "written" onto the .json file. Is there a way to download all of the questions at once or am I just being dumb?
Adithyan Sujithkumar
Oak Ridge '18
dablasian
Lulu
 
Posts: 20
Joined: Tue Sep 05, 2017 11:05 am

Postby UlyssesInvictus » Sun Oct 15, 2017 9:09 pm

dablasian wrote:When I download a .json file from QuizDB, only the first 300 questions are downloaded. For example, instead of downloading all of the 460 tossups on Classical History that are college difficulty, only the first 300 tossups are "written" onto the .json file. Is there a way to download all of the questions at once or am I just being dumb?


Sigh, I was hoping this issue wouldn't come up.

The issue is that huge downloads simply crash QuizDB's servers because they run out of memory. More $$$ = more servers = solution, but you understand why that's not always possible. I could improve the code slightly as well, but it would be squeezing water from a rock.

Thus the totally silent limit on how much you can download at once. My recommendation for now is to just download smaller files -- maybe download one file of all the easy college, then regular college, then hard college.
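
A minimal sketch of that workaround in Python: download one file per difficulty, then stitch the pieces together locally. It assumes each export has already been parsed into a list of question dicts carrying a unique `id` field; that layout and the function names are guesses for illustration, not QuizDB's documented format.

```python
def merge_question_lists(batches, key="id"):
    """Combine several exported batches into one de-duplicated list.

    Questions that appear in more than one batch are kept once,
    in the order they were first seen.
    """
    merged = {}
    for batch in batches:
        for question in batch:
            merged.setdefault(question[key], question)
    return list(merged.values())


def looks_truncated(batch, limit=300):
    """Heuristic: a batch exactly at the silent cap was probably cut off."""
    return len(batch) >= limit
```

Each batch could come from something like `json.load(open("easy_college.json"))`, adjusted to however the export nests its question list. If `looks_truncated` is true for a batch, narrowing the filter (say, tournament by tournament) and downloading again is the safer move.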

300 is a pretty conservative limit right now, partially to see if anyone actually ran into the issue (and you have), so I'll increase it to 800 later. It's really only catastrophic in the tens of thousands, but it still starts to strain the servers in the thousands.

Sorry about that!

Postby dablasian » Tue Oct 24, 2017 6:49 pm

UlyssesInvictus wrote:
dablasian wrote:When I download a .json file from QuizDB, only the first 300 questions are downloaded. For example, instead of downloading all of the 460 tossups on Classical History that are college difficulty, only the first 300 tossups are "written" onto the .json file. Is there a way to download all of the questions at once or am I just being dumb?


Sigh, I was hoping this issue wouldn't come up.

The issue is that huge downloads simply crash QuizDB's servers because they run out of memory. More $$$ = more servers = solution, but you understand why that's not always possible. I could improve the code slightly as well, but it would be squeezing water from a rock.

Thus the totally silent limit on how much you can download at once. My recommendation for now is to just download smaller files -- maybe download one file of all the easy college, then regular college, then hard college.

300 is a pretty conservative limit right now, partially to see if anyone actually ran into the issue (and you have), so I'll increase it to 800 later. It's really only catastrophic in the tens of thousands, but it still starts to strain the servers in the thousands.

Sorry about that!


I'm sorry if I seem impatient, but is there a time frame for the question download limit to be increased?

Thanks!

Postby UlyssesInvictus » Tue Oct 24, 2017 10:20 pm

Probably this weekend. I try to keep to a discipline of only pushing website updates on the weekend so I have the time to fix things if I mess up tremendously (basically the opposite of what real software offices do), and since this isn't an urgent fix--I think downloading difficulty by difficulty or tournament by tournament should still work well, right?--I'm going to try to stick to that discipline.

Postby dablasian » Thu Oct 26, 2017 10:14 pm

UlyssesInvictus wrote:Probably this weekend. I try to keep to a discipline of only pushing website updates on the weekend so I have the time to fix things if I mess up tremendously (basically the opposite of what real software offices do), and since this isn't an urgent fix--I think downloading difficulty by difficulty or tournament by tournament should still work well, right?--I'm going to try to stick to that discipline.


Yes, that does still work (downloading with filtering by tournament/difficulty). I actually forgot about filtering by tournament, so thanks for reminding me. It's much easier to download larger sets of questions when separating the questions tournament by tournament.

Postby UlyssesInvictus » Wed Nov 08, 2017 12:19 pm

I haven't been able to do any substantive QuizDB work as I put the final touches on HFT, but I just uploaded the training data for Augur here: https://s3.amazonaws.com/quizdb-public/quizdb_classifier_training_data.json

It mainly contains a simplified JSON grouping of all questions with categories and subcategories, so that you don't have to pull it yourself.
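
As a rough sketch of a first pass over data like this, the snippet below tallies questions per category and subcategory. The record layout (a list of dicts with `category` and `subcategory` fields) is an assumption for illustration; the actual structure of the file is not described here and may need a different unpacking step.

```python
from collections import Counter


def count_by_category(records):
    """Count records per (category, subcategory) pair.

    Assumes each record is a dict; missing fields are counted under None.
    """
    counts = Counter()
    for record in records:
        counts[(record.get("category"), record.get("subcategory"))] += 1
    return counts


# Tiny made-up sample in the assumed layout.
sample = [
    {"text": "...", "category": "Science", "subcategory": "Biology"},
    {"text": "...", "category": "Science", "subcategory": "Biology"},
    {"text": "...", "category": "Literature", "subcategory": "European"},
    {"text": "...", "category": "History"},
]
counts = count_by_category(sample)
```

Checking the tallies like this before training makes it easy to spot badly underrepresented subcategories.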

Hopefully it helps someone!

Postby Mike Bentley » Thu Nov 09, 2017 10:55 am

Mike Bentley wrote:I haven't had a chance to make this a web app, but if you're on Windows you can use this to generate Anki cards from QuizDB exports: http://www.doc-ent.com/quizdb/publish.htm

Instructions are in the program.


I uploaded a new version of the program which outputs TSV files instead, which Anki should have an easier time importing.
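
For readers not on Windows, the general idea can be sketched in a few lines of Python. This is not the program linked above, just an illustration of turning questions into tab-separated front/back pairs; the `text` and `answer` field names are assumptions about the export.

```python
import csv
import io


def questions_to_tsv(questions):
    """Render questions as tab-separated lines, one card per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for q in questions:
        # Collapse tabs and newlines so each card stays on a single line.
        front = " ".join(q["text"].split())
        back = " ".join(q["answer"].split())
        writer.writerow([front, back])
    return buf.getvalue()


tsv = questions_to_tsv(
    [{"text": "This author\nwrote  Ulysses.", "answer": "James Joyce"}]
)
```

Writing `tsv` to a `.tsv` or `.txt` file gives something that can be fed to Anki's text import, with the first column as the front of the card and the second as the back.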
Mike Bentley
Treasurer, Partnership for Academic Competition Excellence
Adviser, Quizbowl Team at University of Washington
University of Maryland, Class of 2008
Mike Bentley
Auron
 
Posts: 5311
Joined: Fri Mar 31, 2006 11:03 pm
Location: Bellevue, WA

Postby UlyssesInvictus » Sat Nov 11, 2017 1:23 am

Now that HFT is off to the presses, and to celebrate the tournament tomorrow, I've started uploading packets again. Here are the most recent additions, which either didn't exist before or are newly complete and formatted:

- 2015 HFT X (I forgot a round the last time...)
- 2015 MUT
- 2016 MUT
- 2013 BHSAT
- 2017 BHSAT
- 2017 Chicago Open
- 2013 JAMES (including the trash this time!)

Postby dablasian » Sat Nov 11, 2017 12:25 pm

Are there plans to add a feature to filter questions on QuizDB by when they were added to the database?

EDIT: Grammar is hard.

Postby UlyssesInvictus » Sat Nov 11, 2017 2:26 pm

That should be available on the admin site, but I probably will never add that to the main site since it's not super relevant most of the time.

Postby UlyssesInvictus » Tue Nov 21, 2017 6:38 pm

Minor update on Moxon (the question reader): I got lazy because of a combination of HFT, WAO II, and the holidays, so I'm pushing this back. A conservative estimate for when something usable should be done is Jan. 10, because I won't have work over the end-of-year holidays and can just work on it from start to finish. (Expect something, let's say, about as good as Quizbug currently.)