SheepTester/ucsd-sunset


SunSET logo

SunSET for UCSD

A crowdsourced dataset of grade distributions submitted by students from their academic histories, to replace CAPEs' "Grades Students Received," which SETs no longer publish.

Usage

The crowdsourced dataset is available to view on the live website. Click on the "Contribute" button for instructions on how to contribute your classes' grade distributions from your Academic History.

The raw dataset is also available as a spreadsheet. If you want to load it into a program, you can download it as a CSV or TSV file to perform your own data analysis and visualization.
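As a starting point for loading the TSV export into a program, here is a minimal sketch of a parser. It assumes the first line is a header row and that no cell contains a tab or a line break; the function name `parseTsv` and the column names in the test data are made up for illustration and may not match the real spreadsheet.

```typescript
// Hypothetical row shape: header name -> cell text. The real spreadsheet's
// columns are not assumed here; whatever the header row says is used as keys.
type Row = Record<string, string>;

/** Parse TSV text whose first non-empty line is a header row. */
function parseTsv(text: string): Row[] {
  // Accept both \n and \r\n line endings and skip empty lines.
  const lines = text.split(/\r?\n/).filter((line) => line.length > 0);
  const [headerLine, ...body] = lines;
  if (headerLine === undefined) return [];
  const headers = headerLine.split("\t");
  return body.map((line) => {
    const cells = line.split("\t");
    const row: Row = {};
    headers.forEach((header, i) => {
      // Rows with fewer cells than the header get empty strings.
      row[header] = cells[i] ?? "";
    });
    return row;
  });
}
```

Every value comes back as a string, so grade counts need to be converted with `Number(...)` before doing arithmetic on them.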

Setup and design

Background

For the past decade, UCSD has used CAPEs to allow students to evaluate their courses and professors sometime before finals. The results of these evaluations were published together with the actual grade distribution received by the class (shown next to a distribution of the grades students expected when filling out the evaluation). Apparently, CAPEs started out as a student project but were later adopted by the university to evaluate instructors.

UCSD recently switched to a new system for evaluations, SETs, which has a new set of questions better tailored for evaluating instructors rather than aiding future students. As part of this change, the question "Would you recommend this instructor/class?" and the distribution of grades received were removed.

Fortunately, Academic History still publishes grade distributions, but only for the sections you were enrolled in. It is also more fine-grained, including signed letter grades (A+ vs A-) and other letters (P, NP, W). Ideally, we could crowdsource these grade distributions from all sections by having students submit them voluntarily into a public dataset for anyone to look through.

Goals

The main purpose of this project is data collection, rather than presentation. I assume that other students can make better analyses of the collected data. Therefore, the goals are:

  • Allow students to easily contribute grade distribution data.
  • Publish the collected data for any student to use.
  • Make the collected data immediately presentable to help spread the word.
  • Ensure the project can continue to work without active maintenance (i.e. does not require keeping a server running).
  • Make attempts to preserve student privacy, and make any privacy concerns clear.

And non-goals:

  • Clean up data from malicious contributors. This will be left to data consumers to filter out and process the contributed data.

  • Keep contributors anonymous. We wouldn't lose too much from omitting hashed student emails from the dataset, but it might help with processing data (e.g. filtering out malicious contributors or removing duplicate submissions), and I don't think the list of courses taken by a student is too sensitive.

  • Ask if students recommend a professor or class, at least for now. r/UCSD and RMP are probably good enough, but it wouldn't be too much work to ask contributors for their recommendations.

    I would like to add this to the project because it would help with sorting classes by recommended professor. However, a major issue is that contributions aren't particularly anonymous, and professor recommendations are more sensitive than the list of courses taken.
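To illustrate how the hashed emails could help a data consumer remove duplicate submissions, here is a minimal sketch. The `Submission` shape and its field names are assumptions made for this example, not the dataset's actual columns, and it assumes rows appear in submission order so that a later row should replace an earlier one from the same contributor.

```typescript
// Hypothetical record shape; the real spreadsheet columns may differ.
interface Submission {
  contributor: string; // hashed student email
  course: string;
  term: string;
  grades: Record<string, number>;
}

/**
 * Keep one row per (contributor, course, term), preferring the latest.
 * Assumes `rows` is ordered from oldest to newest submission.
 */
function dropResubmissions(rows: Submission[]): Submission[] {
  // Maps a composite key to that row's position in `kept`.
  const position: Record<string, number> = {};
  const kept: Submission[] = [];
  for (const row of rows) {
    // JSON.stringify of an array gives an unambiguous composite key.
    const key = JSON.stringify([row.contributor, row.course, row.term]);
    const index = position[key];
    if (index === undefined) {
      position[key] = kept.length;
      kept.push(row);
    } else {
      kept[index] = row; // a later submission replaces the earlier one
    }
  }
  return kept;
}
```

This only collapses resubmissions by the same contributor. Two different students reporting the same section still produce two rows, which is useful as corroboration.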

SunSET itself does not focus on good data presentation or cleaning up data. I'm hoping that other students can make a project like Seascape to visualize the data better. Rather, the website is just a starting point so that there are results to show right away to get more students to contribute their data.

Design

Here is an overview of the user flow for SunSET:

User flow diagram

And here's an architectural diagram:

Architectural diagram

  1. Bookmarklet: To collect data, students run a bookmarklet on Academic History. Bookmarklets are the easiest way for students to run our data collection script, as opposed to a browser extension or userscript.

    • Weakness: Bookmarklets don't work on mobile, but I don't think there's any alternative that's as user-friendly.

    • Weakness: Academic History doesn't seem to provide the UCSD email anywhere, so I can't look up the contributor's prior submissions on the spreadsheet. If they submit multiple times, there will be duplicate rows in the spreadsheet.

  2. Google Form: To store the data, students submit the JSON output of the bookmarklet into a Google Form, which requires a UCSD Google account to access. There is a single question for students to paste the data in, and it collects their email to discourage students from misusing the service.

    The Google Form has a Google Apps Script connected that adds the student-submitted grade distributions into a spreadsheet. This spreadsheet is published to the web, so anyone can download it as a CSV or TSV file.

    • Weakness: The Apps Script might have race conditions if it's adding rows from two submissions at the same time. I haven't tested it, so hopefully it doesn't overwrite rows. Rows per submission may not be contiguous, but that's fine.
  3. Website: The spreadsheet is accessed directly from the browser rendering the page, and it's parsed into data for a React app to display.

    • Weakness: The spreadsheet might get very large because duplicate rows aren't removed, and this cost is passed on to end users downloading massive TSVs on every page load. At 18 contributions, the spreadsheet already takes 1.67 seconds to load 19.9 kB of data (but I think 1.4 seconds of this is a fixed cost from Google being Google).
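The middle step of the pipeline above, turning a pasted JSON string into spreadsheet rows, can be sketched roughly as follows. This is not the project's actual Apps Script: the JSON shape (`CourseEntry`), the grade column order, and the function names are all assumptions made for illustration. The sketch validates each entry and silently skips anything malformed.

```typescript
// Hypothetical shape of one entry in the bookmarklet's JSON output.
interface CourseEntry {
  course: string;
  term: string;
  grades: Record<string, number>;
}

// Hypothetical column order for grade counts in the spreadsheet.
const GRADE_COLUMNS = ["A+", "A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D", "F", "P", "NP", "W"];

/** Check that an unknown value has the assumed CourseEntry shape. */
function isCourseEntry(value: unknown): value is CourseEntry {
  if (typeof value !== "object" || value === null) return false;
  const entry = value as Record<string, unknown>;
  if (typeof entry["course"] !== "string" || typeof entry["term"] !== "string") return false;
  const grades = entry["grades"];
  if (typeof grades !== "object" || grades === null) return false;
  const counts = grades as Record<string, unknown>;
  // Every count must be a non-negative whole number.
  return Object.keys(counts).every((grade) => {
    const n = counts[grade];
    return typeof n === "number" && n >= 0 && Math.floor(n) === n;
  });
}

/** Turn pasted JSON into one row of cells per valid course entry. */
function toSheetRows(pasted: string, contributor: string): (string | number)[][] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(pasted);
  } catch {
    return []; // the pasted text was not valid JSON
  }
  if (!Array.isArray(parsed)) return [];
  const rows: (string | number)[][] = [];
  for (const item of parsed) {
    if (!isCourseEntry(item)) continue; // skip malformed entries
    const counts = GRADE_COLUMNS.map((grade) => item.grades[grade] ?? 0);
    rows.push([contributor, item.course, item.term, ...counts]);
  }
  return rows;
}
```

Validating on the way in keeps obviously broken submissions out of the spreadsheet, but it does not detect plausible-looking fake data, which remains a job for data consumers.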

Caveats

Academic History does not include the section codes that students are enrolled in, but it only shows the grade distribution for the enrolled section. This means that there's no way to tell which section a grade distribution belongs to, or to distinguish changes to a section's grades (e.g. blank grades being resolved) from other sections and faked data. This problem is left as an exercise for data consumers.

The SunSET website currently naïvely considers each unique grade distribution as its own section, so it's listed separately. Duplicate grade distributions are combined. If there's a grade change, it's listed as a separate grade distribution. Most grade changes are due to blank grades being resolved, so it could be possible to figure out if one grade distribution became another, but this website doesn't do that yet.
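The two ideas in this section, combining identical distributions and guessing whether one distribution is a later version of another, can be sketched as below. This is not the website's code. The `Distribution` type, the function names, and the `"Blank"` label for unresolved grades are assumptions for illustration.

```typescript
// Hypothetical shape: grade label -> number of students.
type Distribution = Record<string, number>;

/** Canonical string for a distribution: zero counts dropped, grades sorted. */
function canonicalKey(dist: Distribution): string {
  const pairs = Object.keys(dist)
    .filter((grade) => (dist[grade] ?? 0) > 0)
    .sort()
    .map((grade) => [grade, dist[grade] ?? 0]);
  return JSON.stringify(pairs);
}

/** Combine identical distributions, counting how often each was submitted. */
function countUnique(dists: Distribution[]): { dist: Distribution; count: number }[] {
  const position: Record<string, number> = {};
  const groups: { dist: Distribution; count: number }[] = [];
  for (const dist of dists) {
    const key = canonicalKey(dist);
    const index = position[key];
    const group = index === undefined ? undefined : groups[index];
    if (group === undefined) {
      position[key] = groups.length;
      groups.push({ dist, count: 1 });
    } else {
      group.count += 1;
    }
  }
  return groups;
}

/**
 * Heuristic: could `after` be `before` with some blank grades resolved?
 * True when the totals match, there are strictly fewer blanks, and no
 * other grade's count went down.
 */
function couldBeResolvedFrom(before: Distribution, after: Distribution, blank = "Blank"): boolean {
  const total = (d: Distribution): number =>
    Object.keys(d).reduce((sum, grade) => sum + (d[grade] ?? 0), 0);
  if (total(before) !== total(after)) return false;
  if ((after[blank] ?? 0) >= (before[blank] ?? 0)) return false;
  const grades = Object.keys(before).concat(Object.keys(after));
  return grades.every((grade) => grade === blank || (after[grade] ?? 0) >= (before[grade] ?? 0));
}
```

The resolution check is only a heuristic: two unrelated sections of the same size can satisfy it by coincidence, and it does not recognize grade changes other than blanks being filled in.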

Development

This project requires Node. If you want to use Yarn, you can install it with npm install -g yarn.

# Start a development server
$ yarn build:bookmarklet
$ yarn dev
# Build into dist/
$ yarn build
