GradCafe Scraper

GradCafe Scraper is a command-line tool written in Python that automatically extracts admission decision data from The GradCafe. It uses requests and BeautifulSoup to navigate and parse result pages, and saves the collected data into a structured CSV file.

Main Features

Parametric search: filters by institution, program, and degree type.
Automatic pagination: navigates through all result pages and handles pagination dynamically.
Detailed extraction: extracts school, program, study level, submission date, decision, GPA and GRE scores (V, Q, AW), plus any comments.
Interactive status bar: displays a colored progress bar with percentage, current page, and number of entries collected.
CSV output: saves data into a CSV file automatically named based on parameters and execution date.

Requirements

Python 3.7 or higher
Internet connection

The program requires the following Python libraries (installable via requirements.txt):

requests>=2.0
beautifulsoup4>=4.0
pyfiglet>=0.8
colorama>=0.4

Installation

Clone this repository:

git clone https://github.com/your-username/gradcafe-scraper.git
cd gradcafe-scraper

Create and activate a virtual environment (optional but recommended):

python -m venv venv
# Windows
venv\Scripts\activate
# macOS / Linux
source venv/bin/activate

Install dependencies:
```
pip install -r requirements.txt
```

Usage

Run the script from terminal:

python gradcafe_scraper.py

Follow the interactive prompts:

Enter the name of the institution (leave blank for no filter).
Enter the program name (leave blank for no filter).
Select the degree type from the menu.

The tool will display a summary of your input and start downloading the result pages; you'll see a progress bar. When done, you'll find the generated CSV in:

CSVs/GradCafe/<institution>_<program>_<degree>_<DDMMYYYY>.csv

CSV Structure

The CSV file will contain the following columns:

Column	Description
School	Institution name
Program	Program name
Level	Level of study (e.g., PhD, Masters, etc.)
Added on	Submission date
Decision	Decision outcome (first word only)
GPA	Reported GPA
GRE V	GRE Verbal score
GRE Q	GRE Quantitative score
GRE AW	GRE Analytical Writing score
Comment	Any additional comment

Example CSV row

MIT,Computer Science,PhD,March 15,2024,Accepted,3.9,160,170,5.0,"Excited to join!"

Sample Dataset

In the csv_output folder, you’ll find two sample datasets previously extracted, based on master’s programs of personal interest. These CSV files can be used as a reference for the data structure or to test analyses without having to run the scraper right away.

Customization

You can adjust the sleep interval (currently 0.5 s) to balance speed and server load.
The user agent is configured in the headers variable; feel free to change it if needed.

Contributions

Contributions are welcome! Feel free to open a pull request or report an issue for bugs or feature suggestions.

License

This project is licensed under the MIT License. See the LICENSE file for more details.

Last updated: July 2, 2025

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
output_csv		output_csv
.gitattributes		.gitattributes
README.md		README.md
requirements.txt		requirements.txt
scraper_gradcafe.py		scraper_gradcafe.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

GradCafe Scraper

Main Features

Requirements

Installation

Usage

CSV Structure

Example CSV row

Sample Dataset

Customization

Contributions

License

About

Uh oh!

Languages

Cruciat/GradCafe_Scraper

Folders and files

Latest commit

History

Repository files navigation

GradCafe Scraper

Main Features

Requirements

Installation

Usage

CSV Structure

Example CSV row

Sample Dataset

Customization

Contributions

License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Languages