This repo contains useful scripts I have developed over the years (since 2015!) to support students' submissions for homework and projects. They include tools to manage Git and GitHub repositories, Google Workspace (Drive, spreadsheets), and shell/filesystem scripts.
They are almost all Python-based scripts, using the following API libraries:
- GitPython (scripts git_xxx.py): to perform Git operations on the filesystem (e.g., cloning, committing and pushing changes, reverting).
- PyGithub (scripts gh_xxx.py): to perform API calls against the GitHub REST API.
- Various libraries for accessing Google resources (spreadsheets, Drive, etc.). These scripts are named gg_xxx.py.
Feel free to use them as desired. No guarantees, and I am sure they will have bugs or outdated code! Open an issue or PR as needed. 😉
- Teaching scripts
- Setup
- GitHub Scripts
- Manual testing at development
- gh_classroom_collect.py: collect repos from a GH Organization
- gh_authors_collect.py: extract commits per author
- gh_create_wiki.py: push Wiki template to a list of repos
- gh_member_bulk_team.py: add/delete GH username to GH teams
- gh_pr_merge.py: bulk merge of PRs
- gh_pr_feedback_create.py: create Feedback PRs
- gh_pr_check_merged_forced.py: check for merged PRs and forced pushes
- gh_pr_post_result.py: push comments to repos' PRs
- gh_pr_post_comment.py: push a message to PRs
- gh_workflow.py: run automarking workflow
- gh_commit_after.py: get commits after a date
- gh_tag_after.py: get repos with a given tag after a date
- ghc_build_reporter.py: build YAML classroom reporter
- gh_user_access.py: get repos and accesses of an org
- gh_issue_labels.py: get/update issue labels in a GH repo
- gh_refresh_invite.sh: refresh pending repo invitations
- Git Tools
- Google Scripts
- Useful shell commands
- Contributors
To install all requirements run this in your Python virtual environment:
```shell
$ pip install -r requirements.txt
```

The libraries used are:
- GitHub REST documentation: https://docs.github.com/en/rest
- PyGithub: https://github.com/PyGithub/PyGithub
- gitpython: http://www.legendu.net/misc/blog/hands-on-GitPython/
- gh API CLI tool: https://github.com/cli/cli (manual)
These gh_xxx.py scripts mostly use PyGithub. The scripts require a GitHub access token that allows access to the corresponding repos/organization.
Another tool that one can consider is the gh API CLI tool; see the manual.
We can run interactively first at development time; for example:
```python
>>> import util
>>> g = util.open_gitHub(token_file="/home/ssardina/.ssh/keys/gh-token-ssardina.txt")
>>> repo = g.get_repo("RMIT-COSC2780-2973-IDM25/workshop-5-ssardina")
>>> ws = repo.get_workflows()
>>> ws[0].create_dispatch(ref="main")
```

The script gh_classroom_collect.py produces a CSV file with all the repos in a given GitHub Classroom for a particular assignment, using its corresponding prefix.
For example, to get all the repos submitted for AI24 project with prefix p3-prolog into a CSV file repos.csv:
```shell
python ./gh_classroom_collect.py -t ~/.ssh/keys/gh-token-ssardina.txt RMIT-COSC1127-1125-AI24 p3-prolog repos.csv |& tee -a repos.log
```

Note
The CSV file repos.csv will be used for many later tasks, including cloning the repos locally using the script git_clone_submissions.py, extracting collaborators, etc.
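For illustration, a resulting repos.csv might look roughly as follows (the exact columns are an assumption here; the cloning script described below expects at least a TEAM and a GIT-URL column):

```csv
TEAM,GIT-URL
lanadavies,git@github.com:RMIT-COSC1127-1125-AI24/p3-prolog-lanadavies.git
longcent,git@github.com:RMIT-COSC1127-1125-AI24/p3-prolog-longcent.git
```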
The script gh_authors_collect.py produces a CSV file with how many commits each author has done per repo. For example:
```shell
$ python ./gh_authors_collect.py -t ~/.ssh/keys/github-token-ssardina.txt --tag submission -- repos.csv authors.csv
```

The --tag option restricts the history considered to the commits up to the given tag. If no tag is given, the whole repo is parsed up to the head of main.
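Under the hood this amounts to iterating over the commit history and grouping by author; a minimal PyGithub sketch (with a hypothetical repo name and token path, not the actual script) could look like:

```python
from collections import Counter
from github import Github

# Hypothetical sketch: count commits per author up to the "submission" tag.
token = open("/home/ssardina/.ssh/keys/gh-token-ssardina.txt").read().strip()
g = Github(token)
repo = g.get_repo("RMIT-COSC1127-1125-AI24/p3-prolog-lanadavies")  # assumed repo name

tag = next((t for t in repo.get_tags() if t.name == "submission"), None)
head = tag.commit.sha if tag else repo.default_branch

counts = Counter()
for c in repo.get_commits(sha=head):
    # prefer the GitHub login; fall back to the git author name
    counts[c.author.login if c.author else c.commit.author.name] += 1

for author, n in counts.most_common():
    print(f"{author}: {n}")
```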
This script will push a template Wiki to each repo:
```shell
python ./gh_create_wiki.py repos.csv ./wiki-template
```

See wiki-template for an example of a Wiki template.
This script will add a GH username to GH teams in an organization. For example, to add Axel to all the teams except teams teachers and headtutor:
```shell
$ python gh_member_bulk_team.py RMIT-COSC2780-2973-IDM24 axelahmer --nteams "teachers" "headtutor"
Running the script on: 2024-05-18-00-35-27
Sat, 18 May 2024 00:35:27 INFO Getting organization RMIT-COSC2780-2973-IDM24...
Sat, 18 May 2024 00:35:27 INFO Getting GH user for axelahmer...
Teams available: ['AI NPCs', 'ASP Dads', 'Galacticos', 'gASP', 'Harry Ron and Hermoine', 'IDM Project', 'Intellect Realm', 'Inter-Dimensional Masochists (IDM)', 'Logic Nexus', 'Lorem Ipsum', 'Mister World Wide', 'Prolog nightmares again', 'sajeevan', 'Super awesome team name', 'teachers', 'TRY']
Adding user **axelahmer** to team AI NPCs
Adding user **axelahmer** to team ASP Dads
Adding user **axelahmer** to team Galacticos
Adding user **axelahmer** to team gASP
Adding user **axelahmer** to team Harry Ron and Hermoine
Adding user **axelahmer** to team IDM Project
Adding user **axelahmer** to team Intellect Realm
Adding user **axelahmer** to team Inter-Dimensional Masochists (IDM)
Adding user **axelahmer** to team Logic Nexus
Adding user **axelahmer** to team Lorem Ipsum
Adding user **axelahmer** to team Mister World Wide
Adding user **axelahmer** to team Prolog nightmares again
Adding user **axelahmer** to team sajeevan
Adding user **axelahmer** to team Super awesome team name
Adding user **axelahmer** to team TRY
```

This script can be used to merge PRs in a set of repos. This is useful when a PR has been opened in each student repo to implement updates to the project spec via the GitHub Classroom sync feature (new in 2024).
For example, to merge the PR with title Sync from the 40th repo in repos.csv:
```shell
python ./gh_pr_merge.py repos.csv -t ~/.ssh/keys/gh-token-ssardina.txt --title Sync --start 40 |& tee -a merge_pr.log
```

This script checks which repos are missing the expected Feedback PR #1 from GitHub Classroom, and creates it as needed. This may be needed because sometimes GH Classroom fails to create the PRs in some repos.
For example:
```shell
python ./gh_pr_create.py -t ~/.ssh/keys/gh-token-ssardina.txt repos.csv a7b3d7aee55d00d55ee29b8a505d17fc8283e9f8 |& tee pr_create.log
```

Notice that the script needs the list of repos to consider (repos.csv) and the base sha to create the Feedback branch and corresponding PR. Because the repos were created by GH Classroom, the first commit should have the same sha as the original staff template.
This script will check for PRs that have been merged and for forced pushes:
```shell
python ../git-hw-submissions.git/gh_pr_check_merged_forced.py -t ~/.ssh/keys/gh-token-ssardina.txt repos.csv
```

It will leave two CSV files, pr_merged.csv and pr_forced.csv, with the corresponding repos' ids.
This tool will push comments (e.g., homework feedback and results) to PRs in GH repositories. This may be useful to provide feedback and results to students after automarking. It requires:
- A CSV file with the list of all relevant repos to process (e.g., student's projects).
- A CSV file with the marking results (points, marks, comments, etc).
- A folder with the automarking reports as text files.
- A Python file defining the functions and constants used to process each row in the marking results (a minimal sketch is shown after this list):
  - check_submission: can be used to check if the row contains a legal/successful submission. It will return whether the row/submission needs to be skipped and a string message to be posted to the PR, if any (e.g., the reason why the submission was not marked and skipped).
  - report_feedback: produces the actual feedback text to be posted in the PR.
  - FEEDBACK_MESSAGE: message to post after a report.
  - get_repo() [OPTIONAL]: returns a list of repos to process.
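A minimal sketch of such a file is below (column names and message format are assumptions; the actual hooks expected by gh_pr_post_result.py may differ):

```python
# feedback_p2.py -- hypothetical marking-hooks module for gh_pr_post_result.py
FEEDBACK_MESSAGE = "Please reply in this PR if you have questions about your marks."

def check_submission(row):
    """Return (skip, message) for a row of the marking CSV."""
    if not row.get("COMMIT"):                      # assumed column name
        return True, "No certified submission found; nothing was marked."
    return False, None

def report_feedback(row):
    """Build the feedback text to post in the Feedback PR."""
    return (
        f"# Results for {row['TEAM']}\n\n"         # assumed column names
        f"- Total marks: {row['TOTAL']} / 100\n"
        f"- Comments: {row.get('COMMENTS', 'none')}\n"
    )

def get_repo():
    """Optional: restrict posting to these repos (None means all repos)."""
    return None
```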
Now push all feedback to their pull requests from the first row (1) to row 5:
```shell
python ./gh_pr_post_result.py repos.csv marking.csv feedback_p2.py reports -t ~/.ssh/keys/gh-token-ssardina.txt
```

It is best to use --dry-run first to test it.
Check the report builder examples:
- gh_pr_post_result_example_marking.py: build a full message for an assignment result.
- gh_pr_post_result_example_message.py: simple message to post to some repos.
This script pushes a message to the PRs of a list of repos. For example:

```shell
python ../tools/git-hw-submissions.git/gh_pr_post_comment.py -t ~/.ssh/keys/gh-token-ssardina.txt repos.csv message_pr.py
```

File message_pr.py defines the constant MESSAGE, with placeholder ghu for the GH username, and the function get_repo(), which returns the list of relevant repos to post to (None if all).
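A minimal sketch of such a message_pr.py file (the wording and the placeholder syntax are assumptions):

```python
# message_pr.py -- hypothetical message module for gh_pr_post_comment.py
MESSAGE = """Dear {ghu},

Your repo has been updated with the latest project spec. Please pull the changes.
"""

def get_repo():
    """Repos to post to; None means all repos in the CSV."""
    return None
```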
This script can do several operations on workflows:
- Start a run of a workflow (create a dispatch) in the repository of students. This is usually an automarking workflow that is connected and reports to GitHub Classroom.
- In this way, we can decide when the workflow should run, rather than on each push (which would consume all the budget quickly!).
- Extract the URL to the HTML page of a job for a run of a workflow. This URL would be the automarker report in the repo (showing the automarking process and table of results).
- Note this URL is still accessible even if the actions are disabled.
- Delete workflow runs.
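These operations map roughly onto PyGithub calls like the following sketch (hypothetical repo name and token path; the actual script adds filtering by date, repo and run name):

```python
from github import Github

# Hypothetical sketch of the three workflow operations via PyGithub.
token = open("/home/ssardina/.ssh/keys/gh-token-ssardina.txt").read().strip()
g = Github(token)
repo = g.get_repo("RMIT-COSC2780-2973-IDM25/workshop-5-ssardina")  # assumed repo

wf = next(w for w in repo.get_workflows() if w.name == "Autograding")

# 1. start a run (workflow dispatch) on main
wf.create_dispatch(ref="main")

# 2. extract the HTML URLs of the jobs of each run (the automarker reports)
for run in wf.get_runs():
    for job in run.jobs():
        print(run.run_number, job.html_url)

# 3. delete runs (e.g., to clean up after re-marking)
for run in wf.get_runs():
    run.delete()
```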
Examples:
```shell
# start dispatching workflows with name "Autograding" on the last commit before a date
$ python ../../tools/git-hw-submissions.git/gh_workflow.py -t ~/.ssh/keys/gh-token-ssardina.txt --name Autograding --until 2025-04-08T12:00 --run-name "Automarking up April 8 12pm" -- start repos.csv |& tee -a autograde-2025-04-08T1200.log

# get all the HTML URLs to workflow job reports
$ python ../../tools/git-hw-submissions.git/gh_workflow.py -t ~/.ssh/keys/gh-token-ssardina.txt --name Autograding --run-name "Autograding Test" --repos baoly19,anurag060197,minhphamhuy -- jobs repos.csv

# delete all workflow runs after April 8, 2025 - 12pm
$ python gh_workflow.py -t ~/.ssh/keys/gh-token-ssardina.txt --name Autograding --until 2025-04-08T12:00 --repos ssardina -- delete repos.csv
```

To get the commits after a date (and the one just before):
```shell
python ../../tools/git-hw-submissions.git/gh_commits_after.py -t /home/ssardina/.ssh/keys/gh-token-ssardina.txt --ignore ssardina --since 2025-04-08T12:00 --repos minhphamhuy ssardina DeltaEchoVictor101 -- repos.csv
```

This was used to revert back to a previous commit before a deadline when a student has (illegally) pushed more changes after the deadline.
To get the repos that have a tag commit after a given date:
```shell
$ python ./tools/git-teaching-tools.git/gh_tags_after.py repos.csv submission -t ~/.ssh/keys/gh-token-ssardina.txt --since 2025-10-15T23:00
$ cat tags-repos-2025-10-23_21-35.csv
REPO_ID_SUFFIX,TAG,COMMIT,DATE
lanadavies,submission,5dbac2b,2025-10-19T21:30:33+11:00
longcent,submission,5cd116c,2025-10-17T22:57:16+11:00
pugdev17,submission,b0568af,2025-10-15T23:53:25+11:00
rishabh-despacito,submission,0451595,2025-10-16T14:50:11+11:00
```

Then we can just clone those ones into a special folder (e.g., late submissions after the main marking has been done).
This script builds the section for the classroom-resources/autograding-grading-reporter@v1 runner from the definition of the tests. It is too cumbersome to do manually! 😉
```yaml
  - name: Autograding Reporter
    uses: classroom-resources/autograding-grading-reporter@v1
    env:
      LIVE_RESULTS: "${{steps.live.outputs.result}}"
      MESSI_RESULTS: "${{steps.messi.outputs.result}}"
      MAP-SOUND_RESULTS: "${{steps.map-sound.outputs.result}}"
      MAP-OPTIMAL_RESULTS: "${{steps.map-optimal.outputs.result}}"
    with:
      runners: live,messi,map-sound,map-optimal
```

It also reports the total marks in the automarking.
An example of a run:
```shell
$ python ../../tools/git-hw-submissions.git/ghc_build_reporter.py workshop-4-ssardina.git/.github/workflows/classroom.yml
Total marks: 100

  - name: Autograding Reporter
    uses: classroom-resources/autograding-grading-reporter@v1
    env:
      SHOP_4_RESULTS: ${{steps.shop_4.outputs.result}}
      SHOPS_FOR_ITEM_2_RESULTS: ${{steps.shops_for_item_2.outputs.result}}
      SHOPS_FOR_ITEMS_2_RESULTS: ${{steps.shops_for_items_2.outputs.result}}
      INTERSECTION_3_RESULTS: ${{steps.intersection_3.outputs.result}}
      DIFF_3_RESULTS: ${{steps.diff_3.outputs.result}}
      UNION_3_RESULTS: ${{steps.union_3.outputs.result}}
      WHERE-LIVE_RESULTS: ${{steps.where-live.outputs.result}}
      MAP_COLORING_RESULTS: ${{steps.map_coloring.outputs.result}}
      CLASS_SCHEDULING_RESULTS: ${{steps.class_scheduling.outputs.result}}
      DIET_RESULTS: ${{steps.diet.outputs.result}}
      DIET-Q1_RESULTS: ${{steps.diet-q1.outputs.result}}
      DIET-Q2_RESULTS: ${{steps.diet-q2.outputs.result}}
      DIET-Q3_RESULTS: ${{steps.diet-q3.outputs.result}}
    with:
      runners: shop_4,shops_for_item_2,shops_for_items_2,intersection_3,diff_3,union_3,where-live,map_coloring,class_scheduling,diet,diet-q1,diet-q2,diet-q3
```

Then one can copy and paste this in the classroom.yml workflow file.
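The generation itself is easy to script; a rough sketch of the idea (hypothetical, not the actual ghc_build_reporter.py, which parses the test steps from classroom.yml) is:

```python
# Hypothetical sketch: build the reporter env/runners block from a list of test step ids.
def build_reporter(step_ids):
    lines = [
        "  - name: Autograding Reporter",
        "    uses: classroom-resources/autograding-grading-reporter@v1",
        "    env:",
    ]
    for sid in step_ids:
        # one RESULTS env variable per autograding test step
        lines.append(f"      {sid.upper()}_RESULTS: ${{{{steps.{sid}.outputs.result}}}}")
    lines += ["    with:", f"      runners: {','.join(step_ids)}"]
    return "\n".join(lines)

print(build_reporter(["live", "messi", "map-sound", "map-optimal"]))
```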
This script gets all the repos in an organization and their contributors with their accesses (read/write/admin).
```shell
python ../tools/git-hw-submissions.git/gh_user_access.py -t ~/.ssh/keys/gh-token-ssardina.txt list RMIT-COSC2780-2973-IDM25 ssardina -s 1 -e 10 |& tee -a 2025.04.13.repo-org.log
```

This can be useful to later inspect and handle users who have dropped the course (one may want to remove their access to the repos).
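A minimal PyGithub sketch of the underlying idea (hypothetical token path; the actual script adds CSV output and repo ranges):

```python
from github import Github

# Hypothetical sketch: list collaborators and their permission on each repo of an org.
token = open("/home/ssardina/.ssh/keys/gh-token-ssardina.txt").read().strip()
g = Github(token)
org = g.get_organization("RMIT-COSC2780-2973-IDM25")

for repo in org.get_repos():
    for user in repo.get_collaborators():
        perm = repo.get_collaborator_permission(user)   # admin / write / read / none
        print(f"{repo.name},{user.login},{perm}")
```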
This script gets the current list of issue labels from a repo as a JSON file, or pushes a given list of labels, including replacing the existing labels completely.
```shell
# get the current list of labels into labels.json
$ python gh_issue_labels.py get harry-honours-2025/honours-software -t jasdlakjsdlj1223d --file labels.json

# push completely the set of labels
$ python gh_issue_labels.py push harry-honours-2025/honours-software -tf ~/.ssh/keys/gh-token-ssardina.txt --file labels.json --replace
```

When using GitHub Classroom, it seems some students get the repo created but have no access, and their invites do not work: they keep showing as "pending invitation". There is a discussion on this here. It seems the problem is that GitHub Classroom invitations expire before students can accept them 😕. While we all hope GitHub resolves this soon, I built a temporary workaround that has already helped our students regain access quickly.
The suggestion is to refresh the invitation: delete it and re-invite. Doing it manually (for many students) becomes time consuming, so the script gh_refresh_invite.sh allows one to quickly do the refresh.
The script is a bash script and requires installing and authenticating the GitHub CLI. Then you can run:
```shell
$ refresh_invite.sh RMIT-COSC1127-3117-AI25 p2-prolog-garimlaw
Processing invitation for user: garimlaw
-> Deleting invitation ID 291199199 from RMIT-COSC1127-3117-AI25/p2-prolog-garimlaw...
-> Re-creating invitation for user: garimlaw
```

These tools use the GitPython module to access the Git API from Python.
This tool clones a set of student/team repositories listed in a CSV file at a given tagged commit. The CSV file should contain the team name (under column name TEAM) and a Git SSH link (under column name GIT-URL).
If a repository already exists, it will be updated automatically:
- if the tag changed to a different commit, the new commit will be pulled;
- if the repo does not have the tag anymore (the student has withdrawn the submission), the local copy will be removed from disk.
At the end, the script produces a CSV file with the information of each repo successfully cloned, including commit id (commit), time of the commit (submitted_at), and time of the tagging (tagged_at).
For example, to clone Project 0 at commit with tag "final" using the database of repos repos.csv:
```shell
python ./git_clone_submissions.py --file-timestamps timestamps.csv repos.csv final submissions/ |& tee clone.log
```

All repos will be cloned within folder submissions/, and the file timestamps.csv will contain the timestamps and commits of each repo cloned successfully. The file will contain the date of the commit linked to the tag. If the tag is an annotated tag (and not just a lightweight tag), it will also include the date of tagging (otherwise they are assumed to be the same); see annotated vs lightweight tags.
To just clone the last commit in the default (main) branch, use main as the tag.
The timezone used is defined by constant TIMEZONE in the script (defaults to Australia/Melbourne time zone).
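For reference, the core cloning logic with GitPython looks roughly like the following sketch (assuming a tag name and an SSH URL; the real script also handles updates, errors, timezones and the CSV output):

```python
from git import Repo

# Hypothetical sketch: clone a repo and check out the commit pointed to by a tag.
url = "git@github.com:RMIT-COSC1127-1125-AI24/p3-prolog-lanadavies.git"  # assumed URL
repo = Repo.clone_from(url, "submissions/lanadavies")

tag = next((t for t in repo.tags if t.name == "final"), None)
if tag is None:
    raise SystemExit("Tag not found: submission is discarded")

repo.git.checkout(tag.commit.hexsha)              # detach at the tagged commit
submitted_at = tag.commit.committed_datetime       # date of the commit
# annotated tags also carry the date when the tag itself was created
tagged_at = tag.tag.tagged_date if tag.tag is not None else None
print(submitted_at, tagged_at)
```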
This script allows one to commit and push changes to a collection of repos that already exist in a folder. This is useful to make edits to students' repos after they have been created.
Sometimes we want to revert back to some previous commit, for example, when a student has done late work that has already been automarked by the workflow.
```shell
python ../../tools/git-hw-submissions.git/git_revert.py submissions/deltaechovictor101/ b81a363 --keep .github
```

Note
Generally, you get the commit to go back to by finding the last commit before a certain date (the deadline). You can use the script gh_commits_after.py to get that for a collection of repos.
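One way to do such a revert with plain git (via GitPython) is sketched below; this is an illustration of the technique only, not necessarily what git_revert.py does internally:

```python
from git import Repo

# Hypothetical sketch: revert a cloned submission back to commit b81a363,
# while keeping the current .github folder (e.g., the automarking workflows).
repo = Repo("submissions/deltaechovictor101")
repo.git.revert("--no-commit", "b81a363..HEAD")   # undo everything after b81a363
repo.git.checkout("HEAD", "--", ".github")        # restore .github from current HEAD
repo.git.commit("-m", "Revert to b81a363 (keeping .github)")
repo.remote("origin").push()
```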
To access Google Workspace resources via the Google API, one needs to enable API access and get proper authentication credentials. Different APIs are provided for the different resources (Drive, Gmail, Sheets, etc.).
For authentication to Google Workspace one needs the application configuration file from the APIs Console. Check the Google Sheet API setup process to obtain a client_secret.json (same as credentials.json) file for your project. PyDrive2 also explains how to get the file here.
All access to Google API requires authentication; usually the workflow is as follows:
- Go to the Google API Access Panel.
- Create a project.
- Create an OAuth consent screen.
- Create credentials of type "OAuth Client ID".
- Download the JSON file of such credentials and name it client_secrets.json.
- Place the file in the same directory as the scripts.
Tip
Read Integrating Google Drive API with Python: A Step-by-Step Guide.
- Google Sheet API access:
- Google Drive API access:
- PyDrive2 package for more abstract access to the Google Drive API.
- Google Forms API:
This script allows one to download a Google Sheet from the cloud. It relies on the gsheets package to access the Google Sheet API.
The sheet to download is often a marking sheet. For example, to get the sheet MARKING from Google Spreadsheet with id 1kX-fa3_DMNDQROUr1Y-cG89UksTUUqlYdrNcV1yN6NA:
```shell
python ./gg_get_worksheet.py 1kX-fa3_DMNDQROUr1Y-cG89UksTUUqlYdrNcV1yN6NA MARKING -c ~/.ssh/keys/credentials.json -o marking.csv
```

The credentials.json file was obtained via the Google Sheet API setup process. The first time it is used, permission will be requested. A certificate file storage.json will then be left that allows direct access from then on (until the certificate expires).
The authentication will be done via console. Use --webserver to open an actual browser.
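The underlying access via the gsheets package is roughly as follows (a sketch assuming client_secrets.json and storage.json are in the working directory):

```python
from gsheets import Sheets

# Hypothetical sketch: download one worksheet of a spreadsheet to CSV via gsheets.
sheets = Sheets.from_files("client_secrets.json", "storage.json")
ss = sheets["1kX-fa3_DMNDQROUr1Y-cG89UksTUUqlYdrNcV1yN6NA"]

ws = next(w for w in ss.sheets if w.title == "MARKING")
ws.to_csv("marking.csv", encoding="utf-8", dialect="excel")
```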
This script can process Google Sheets produced by Google Forms, and download the uploaded files linked in each submission. Files will be placed in folders identifying each submission, for example with the student number or email associated with the submission.
For example, to download the files recorded in column G in the worksheet W4-resolution of a Google Sheet with id 1ogCRbx..., and save each in a folder submission/XXXX where XXXX is the student number recorded in column C:
```shell
python ./gg_sheet_submissions.py 1ogCRbxB3seVhoqhD7iBmVZ8EdpvGEB94oLowDIs5s2g W4-resolution tea --output test/submissions --file-name test.cnf --column-id C --column-file G
```

This tool allows downloading file submissions in a Google Drive folder, usually submitted via a Google Form by students. It uses the PyDrive2 package for more abstract access to the Google Drive API.
```shell
python ./gg_drive_download.py 1mttf61NwuFNY25idwWw5AKzV3tQQbwlMp980i-9vnkK3PIV4o7ZOtykWvjM-VLqmHuYJ0jX4 -c ~/.ssh/keys/credentials.json --output test/submissions --file-name submissions.cnf
```

This script is a new and simpler version of the one in this repo.
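With PyDrive2, the core of such a download is roughly as follows (a sketch assuming client_secrets.json is present and using the folder id from the example above):

```python
import os
from pydrive2.auth import GoogleAuth
from pydrive2.drive import GoogleDrive

# Hypothetical sketch: download every file in a Google Drive folder.
folder_id = "1mttf61NwuFNY25idwWw5AKzV3tQQbwlMp980i-9vnkK3PIV4o7ZOtykWvjM-VLqmHuYJ0jX4"
out_dir = "test/submissions"
os.makedirs(out_dir, exist_ok=True)

gauth = GoogleAuth()
gauth.CommandLineAuth()          # or LocalWebserverAuth() to open a browser
drive = GoogleDrive(gauth)

query = f"'{folder_id}' in parents and trashed=false"
for f in drive.ListFile({"q": query}).GetList():
    print("Downloading", f["title"])
    f.GetContentFile(os.path.join(out_dir, f["title"]))
```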
Once all git repos have been cloned in git-submissions/, one can build zip files from the submissions into directory zip-submissions/ as follows:
```shell
for d in git-submissions-p2/*; do echo "============> Processing ${d}" ; zip -q -j "./zip-submissions-p2/`basename "$d.zip"`" "${d}"/p2-multiagent/* ; done
```

or for the final CTF project:
```shell
for d in git-submissions-p4/*; do echo "============> Processing ${d}" ; zip -q -j "./zip-submissions-p4/`basename "$d.zip"`" "${d}"/pacman-contest/* ; done
```

To count the number of commits between dates:
```shell
git log --after="2018-03-26T00:00:00+11:00" --before="2018-03-28T00:00:00+11:00" | grep "Date:" | wc -l
```

To copy just the new zip files:
```shell
rsync -avt --ignore-existing zip-submissions-p4/*.zip AI18-assessments/project-4/zip-submissions/
```

- Prof. Sebastian Sardina (ssardina@gmail.com)
- Andrew Chester (head tutor in AI - 2020-2024)
- Jonathon Belotti (helped in AI-2018)