Need a high quality corpus that is great for demos #29

Open
schue opened this issue Jul 17, 2017 · 1 comment

schue commented Jul 17, 2017

No description provided.

@schue schue changed the title Need a high quality classification set that is great for demos Need a high quality corpus that is great for demos Jul 17, 2017

biancadanforth commented Oct 4, 2017

From your FilterBubbler Demo screencast: A corpus is a collection of webpages that are tagged into topical groups.

In the screencast above, you add a corpus for "Boring or Awesome" and tag five pages as either Boring or Awesome. When that corpus is then applied to a Recipe, subsequent webpages are classified as either Boring or Awesome in the Matches tab, based on those five initial classifications.

@schue

  • After the corpus is made, we would need to upload that corpus to the FilterBubbler server, right? (http://filterbubbler.org).
  • What kind of corpus would make for a great demo? My best guess:
    • Classifications that are generally agreed upon/not contentious
    • There can be more than two categories (e.g., classifying movie pages on IMDb by rating as G, PG, PG-13, R...)
    • A corpus that might have interesting side effects, like an unexpected classification of a webpage that could be good for discussion -- do you have a good example that you have come across?
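
To make the corpus idea concrete, here is a minimal sketch of "tag a few pages, then classify new ones" with more than one possible label. It is not FilterBubbler's actual code or data format: the `TinyCorpus` class, the sample page texts, and the choice of a naive Bayes classifier with add-one smoothing are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class TinyCorpus:
    """Hypothetical corpus: page texts tagged with a label (illustration only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.page_counts = Counter()             # label -> number of tagged pages

    def tag(self, text, label):
        """Add one page's text to the corpus under the given label."""
        self.word_counts[label].update(tokenize(text))
        self.page_counts[label] += 1

    def classify(self, text):
        """Return the most likely label, or None if nothing has been tagged."""
        vocab = {w for counts in self.word_counts.values() for w in counts}
        total_pages = sum(self.page_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior: share of tagged pages carrying this label.
            score = math.log(self.page_counts[label] / total_pages)
            denom = sum(counts.values()) + len(vocab)
            for word in tokenize(text):
                if word in vocab:  # ignore words never seen in the corpus
                    # Add-one smoothing keeps unseen (label, word) pairs non-zero.
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Five made-up "pages", mirroring the five tags in the screencast.
corpus = TinyCorpus()
corpus.tag("spreadsheet quarterly tax forms filing", "Boring")
corpus.tag("tax audit paperwork forms", "Boring")
corpus.tag("rocket launch space telescope", "Awesome")
corpus.tag("dinosaur fossil discovery rocket", "Awesome")
corpus.tag("space station robot launch", "Awesome")

print(corpus.classify("new rocket launch to the space station"))  # Awesome
print(corpus.classify("quarterly tax filing"))                    # Boring
```

Nothing here is limited to two labels, so a G / PG / PG-13 / R corpus would work the same way. With only five tagged pages, a single shared word can flip the result, which is one way the "unexpected classification" mentioned above could come about.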
