This project's requirements on the user's machine are fairly minimal, but a few baselines must be met:
- The project requires Google Chrome to work.
- The project requires ChromeDriver, maintained by the Chromium team, to be installed in the root directory of the project in order to enable scraping (see Step 2 under Installation Instructions, below).
- The project requires a working installation of Python to scrape new course content. The `requirements.txt` file lists the packages necessary for the script to run. If you plan to scrape new course content into the project's ElasticSearch index, please ensure your Python environment satisfies these requirements. (TODO - Create requirements.txt file for Python packages)
- As the extension is not deployed to the Google Chrome Web Store, it requires a local copy of the codebase on the user's computer (see Step 1 under Installation Instructions, below).
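Until `requirements.txt` exists, a quick pre-flight check can at least confirm which dependencies are importable in your environment. This is only a sketch: the package names below are guesses based on the project's description (Selenium-driven scraping pushed to ElasticSearch), not confirmed project requirements.

```python
import importlib.util

def missing_packages(names):
    """Return the subset of `names` that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Hypothetical dependency list; replace with the real one once
# requirements.txt is added to the repository.
print(missing_packages(["selenium", "elasticsearch"]))
```

Anything printed by the last line would need to be installed (e.g. with `pip`) before running the scraper.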
Installing the extension is quite simple: download the code from GitHub, then activate the extension in Chrome. A step-by-step guide is below:
- Pull the code from GitHub to `desiredDirectory` using your shell:
  ```
  cd desiredDirectory
  git clone https://github.com/christianopperman/CS410_Fall2023_CourseProject_TeamCAHJ.git
  ```
- Install the appropriate ChromeDriver for your computer's environment from this link, unzip it, and move the `Google Chrome for Testing` application to the `CS410_Fall2023_CourseProject_TeamCAHJ` directory created in Step 1, above.
- Open Google Chrome.
- Go to the Extensions page on Google Chrome by following this link.
- Activate Developer Mode by toggling the switch in the upper right corner labeled `Developer mode`.
- Load the extension from the codebase pulled to your computer in Step 1 by clicking the `Load unpacked` button in the top left corner.
- Select the `desiredDirectory/CS410_Fall2023_CourseProject_TeamCAHJ/ChromeExtension` directory in the popup and click `Select`.
- The extension should now be available to you in your Google Chrome Extensions list.
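Before moving on to scraping, you can sanity-check that a ChromeDriver binary actually landed in the project root as Step 2 requires. This is a minimal sketch, assuming the driver keeps its default filename (official builds ship as `chromedriver` / `chromedriver.exe`):

```python
from pathlib import Path

def chromedriver_present(project_root):
    """Return True if a file named like 'chromedriver*' exists in project_root."""
    root = Path(project_root)
    # Assumes the unzipped driver binary keeps its default name prefix.
    return any(p.name.startswith("chromedriver") for p in root.iterdir() if p.is_file())
```

If this returns `False` for your clone of `CS410_Fall2023_CourseProject_TeamCAHJ`, revisit Step 2 of the installation instructions.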
As mentioned in Requirements above, in order to scrape your own Coursera course transcripts into the extension, you will need a working version of Python that satisfies the required packages outlined in the `CourseraTranscriptScraper/requirements.txt` file.
Once you have that, scraping a new course into ElasticSearch is very easy:
- Navigate to `desiredDirectory/CS410_Fall2023_CourseProject_TeamCAHJ/CourseraTranscriptScraper` in your shell.
- Call the course scraper script with the following command line arguments:
  ```
  python scrape_coursera_course.py -c "course_url" -u "coursera_username" -p "coursera_password" [-e] [-o output_path]
  ```
- Required Arguments:
  - `-c` : The link to the landing page of the Coursera course you'd like to scrape
  - `-u` : The username of the Coursera account with access to the course you'd like to scrape
  - `-p` : The password of the Coursera account with access to the course you'd like to scrape
- Optional Arguments:
  - `-e` : A boolean flag. If included, the script will automatically push the scraped course transcriptions to ElasticSearch after saving them to disk; if not included, the transcriptions will be saved to disk but not pushed to ElasticSearch.
  - `-o` : The output path to write the transcriptions to, if you would like to save them under a specific filename.
- Once you run the above command, a window will pop up and automatically log you into Coursera. It is likely that you will be required to complete a CAPTCHA.
- Once you complete the CAPTCHA, return to your shell and press Enter, as prompted.
- The script will begin scraping, as evidenced by the pop-up window navigating between video pages in the course and the `Retrieved` messages in the shell window.
- The script will write any scraped transcriptions to the filepath specified by the `-o` command line argument, if present, and to `subtitles.json` if not.
- If the `-e` flag was passed to the script, the script will automatically push the scraped course's transcriptions to ElasticSearch.
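The command line interface described above can be sketched with `argparse`. The flags and the `subtitles.json` default come from the documentation above, but the long option names and help strings are illustrative assumptions, not taken from the actual script:

```python
import argparse

def build_parser():
    """Build a parser mirroring the documented scrape_coursera_course.py flags."""
    parser = argparse.ArgumentParser(description="Scrape Coursera course transcripts")
    # Required arguments, per the list above.
    parser.add_argument("-c", "--course", required=True,
                        help="Landing page URL of the Coursera course to scrape")
    parser.add_argument("-u", "--username", required=True,
                        help="Username of the Coursera account with course access")
    parser.add_argument("-p", "--password", required=True,
                        help="Password of the Coursera account")
    # Optional arguments: -e pushes to ElasticSearch, -o overrides the output path.
    parser.add_argument("-e", "--elasticsearch", action="store_true",
                        help="Also push scraped transcriptions to ElasticSearch")
    parser.add_argument("-o", "--output", default="subtitles.json",
                        help="Output path for the scraped transcriptions")
    return parser

# Example invocation (hypothetical course URL and credentials):
args = build_parser().parse_args(
    ["-c", "https://www.coursera.org/learn/some-course", "-u", "user", "-p", "pw", "-e"]
)
```

With no `-o` flag, `args.output` falls back to `subtitles.json`, matching the behavior described in the steps above.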