
jwolle1/jeopardy_clue_dataset


jeopardy_clue_dataset


This dataset contains Jeopardy! clues from Season 1 through Season 40 (July 2024). It does not contain every clue that has appeared on the show. The data source prefers not to be credited.

There are 523,118 clues in total. Most of them can be found in combined_season1-40.tsv.

There are individual files for each season (located in the seasons folder). These files are small enough that you should be able to open them with Microsoft Excel or Google Sheets.

Clues appearing in special matches outside the daily syndicated program are found in extra_matches.tsv. This file has 7,224 clues and they do not appear in the combined dataset.

There is a kids_teen_matches.tsv file which contains only clues that were featured in Kids and Teen Tournament matches. These clues are also in the combined dataset but this file is included for convenience.
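
Since extra_matches.tsv is not part of the combined file while kids_teen_matches.tsv is, a full clue list can be built by reading only the first two. The sketch below is one way to do that with Python's standard library. It assumes each file starts with a header row holding the column labels and is UTF-8 encoded; neither is stated above. `read_clues` and `load_all_clues` are hypothetical helper names, and the sample row is made up for illustration.

```python
import csv
import io
from typing import Iterable, TextIO


def read_clues(stream: TextIO) -> list[dict[str, str]]:
    """Read tab-separated clue rows from an open text stream."""
    # QUOTE_NONE keeps double quotes inside clue text as literal characters
    # instead of treating them as CSV-style field quoting.
    reader = csv.DictReader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


def load_all_clues(paths: Iterable[str]) -> list[dict[str, str]]:
    """Concatenate the rows of several TSV files, in the order given."""
    rows: list[dict[str, str]] = []
    for path in paths:
        # Assumption: the files are UTF-8 encoded.
        with open(path, newline="", encoding="utf-8") as handle:
            rows.extend(read_clues(handle))
    return rows


# Tiny in-memory stand-in for a real file (made-up row, not from the dataset).
sample = io.StringIO(
    "round\tclue_value\tcategory\tanswer\tquestion\n"
    '1\t200\tSAMPLE CATEGORY\tA "sample" prompt\tWhat is a sample response?\n'
)
sample_rows = read_clues(sample)
```

With the repository downloaded, `load_all_clues(["combined_season1-40.tsv", "extra_matches.tsv"])` would return both sets of rows. Leaving kids_teen_matches.tsv out of that list avoids counting those clues twice.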

I've done my best to clean the data and filter out clues that depend on images, video, or audio.


Column Information

| Label | Description |
| --- | --- |
| round | 1 for Single Jeopardy, 2 for Double Jeopardy, or 3 for Final Jeopardy. (Note: These values are different in extra_matches.tsv to account for Triple Jeopardy.) |
| clue_value | The clue's value on the board before any Daily Double wagering. |
| daily_double_value | If the clue is a Daily Double, this column is the amount wagered. Otherwise it's zero. |
| category | The category title, i.e. the top row of the board. |
| comments | The host's comments about a category. |
| answer | The prompt given to contestants. |
| question | The correct response. |
| air_date | The calendar date on which the episode first aired. |
| notes | Misc. information about the clue, e.g. if it's from a special tournament match. |
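
The column meanings above can be turned into typed values with a few lines of Python. This is a minimal sketch under assumptions the table does not confirm: that the numeric columns hold plain integers and that air_date is written as YYYY-MM-DD. `ROUND_NAMES` and `describe_clue` are hypothetical names, the round mapping applies only to files other than extra_matches.tsv, and the example row is made up.

```python
from datetime import date

# Round codes as described in the column table (not valid for extra_matches.tsv).
ROUND_NAMES = {1: "Single Jeopardy", 2: "Double Jeopardy", 3: "Final Jeopardy"}


def describe_clue(row: dict[str, str]) -> dict:
    """Convert the string fields of one clue row into typed values."""
    wager = int(row["daily_double_value"])
    return {
        "round_name": ROUND_NAMES.get(int(row["round"]), "unknown"),
        "clue_value": int(row["clue_value"]),
        # A nonzero daily_double_value marks the clue as a Daily Double.
        "is_daily_double": wager != 0,
        "wager": wager,
        # Assumption: dates are ISO formatted (YYYY-MM-DD).
        "air_date": date.fromisoformat(row["air_date"]),
    }


# Made-up row for illustration only; it is not taken from the dataset.
example_row = {
    "round": "2",
    "clue_value": "800",
    "daily_double_value": "1500",
    "air_date": "2024-07-01",
}
described = describe_clue(example_row)
```

If the real files use a different date format, `date.fromisoformat` raises a `ValueError`, which makes a wrong assumption easy to spot.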

Other Data

A file with contestant scoring data can be found in the other_data folder. There are columns for each contestant's score after the Single, Double, and Final Jeopardy rounds. Most but not all episodes from the clue dataset are included.


FAQ

How do I download the dataset?

If you're new to GitHub and aren't sure what's going on, click the green Code button near the top of the page, then click Download ZIP.

What is a .TSV file?

A TSV (tab-separated values) file is plain text organized like a spreadsheet, with a TAB character separating the cells in each row. You can open the files with applications like Microsoft Excel or Google Sheets.
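
To make the format concrete, here is a small Python sketch that splits two lines of tab-separated text by hand. The text is a made-up sample, not data from this repository, and it assumes the first line is a header row.

```python
# "\t" is the TAB character that separates cells; "\n" ends each row.
raw_text = (
    "category\tanswer\tquestion\n"
    "SAMPLE CATEGORY\tA sample prompt\tWhat is a sample response?\n"
)

lines = raw_text.splitlines()
header = lines[0].split("\t")  # column labels from the first row
cells = lines[1].split("\t")   # cell values from the second row

# Pair each label with the cell in the same position.
record = dict(zip(header, cells))
```

For whole files, Python's built-in `csv` module with `delimiter="\t"` does the same splitting one row at a time.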


All data is property of Jeopardy Productions, Inc. and protected under law. I am not affiliated with the show. Please don't use the data to make a public-facing website, app, or any other product.

About

A dataset containing 523,000 Jeopardy! clues (1984–2024).
