Skip to content

ssardina-teaching/git-hw-submissions

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Teaching scripts

This repo contains useful scripts I developed over the years (since 2015!) to support student's submissions for homeworks and projects. They include tools to manage Git and GitHub repositories, Google Workspaces (drive, spreadsheets), and shell/filesystem scripts.

They are almost all Python-based script, using two API libraries:

  • GitPython (scripts git_-_xxx.py): to perform GIT operations on the filesystem (e.g., cloning, commiting and pushing changes, reverting)
  • PyGitHub (scripts gh_xxx.py): to perform API calls to GitHub REST API.
  • Various libraries for accessing Google resorces (spreadsheets, Drive, etc.). These scripts are named gg_xxx.py

Feel free to use them as desired. No guarantees and I am sure they will have bugs or out-dated code! Open an issue or PR as needed. 😉

Setup

To install all requirements:

$ sudo pip install -r requirements.txt

The libraries used are:

GitHub Scripts

These gh_xxx.py scripts mostly use PyGithub. Scripts will require a GitHub access token that allows access the corresponding repos/organization.

Another tool that one can consider is gh API CLI tool; see the manual.

Manual testing at development

We can run interactively first at development time; for example:

>>> import util
>>> g = util.open_gitHub(token_file="/home/ssardina/.ssh/keys/gh-token-ssardina.txt")
>>> repo = g.get_repo("RMIT-COSC2780-2973-IDM25/workshop-5-ssardina")
>>> ws = repo.get_workflows()
>>> ws[0].create_dispatch(ref="main")

gh_classroom_collect.py: collect repos from a GH Organizations

Produces a CSV file with all the repos in a given GitHub Classroom for a particular assignment, using its corresponding prefix.

For example, to get all the repos submitted for AI24 project with prefix p3-prolog into a CSV file repos.csv:

$  python ./gh_classroom_collect.py -t ~/.ssh/keys/gh-token-ssardina.txt RMIT-COSC1127-1125-AI24  p3-prolog repos.csv |& tee -a repos.log

Note

The CSV repo.csv file will be used for many later tasks, including cloning the repos locally using script git_clone_submissions.py, extracting collaborators, etc.

gh_authors_collect.py: extract commits per author

Produce a CSV file with how many commits each author has done per repo. For example:

$ python ./gh_authors_collect.py -t ~/.ssh/keys/github-token-ssardina.txt 
    --tag submission -- repos.csv authors.csv

The --tag option restricts to tags finishing in a given tag. If no tag is given, the whole repo is parsed to the head of main.

gh_create_wiki.py: push Wiki template to list of repos

This script will push a template Wiki to each repo:

$ python ./gh_create_wiki.py repos.csv ./wiki-template

See wiki-template for an example of a Wiki template.

gh_member_bulk_team.py: add/delete GH username to GH teams

This script will add a GH username to GH teams in an organization. For example, to add Axel to all the teams except teams teachers and headtutor:

$ python gh_member_bulk_team.py RMIT-COSC2780-2973-IDM24  axelahmer  --nteams  "teachers" "headtutor"
Running the script on: 2024-05-18-00-35-27
Sat, 18 May 2024 00:35:27 INFO     Getting organization RMIT-COSC2780-2973-IDM24...
Sat, 18 May 2024 00:35:27 INFO     Getting GH user for axelahmer...
Teams available: ['AI NPCs', 'ASP Dads', 'Galacticos', 'gASP', 'Harry Ron and Hermoine', 'IDM Project', 'Intellect Realm', 'Inter-Dimensional Masochists (IDM)', 'Logic Nexus', 'Lorem Ipsum', 'Mister World Wide', 'Prolog nightmares again', 'sajeevan', 'Super awesome team name', 'teachers', 'TRY']
Adding user **axelahmer** to team AI NPCs
Adding user **axelahmer** to team ASP Dads
Adding user **axelahmer** to team Galacticos
Adding user **axelahmer** to team gASP
Adding user **axelahmer** to team Harry Ron and Hermoine
Adding user **axelahmer** to team IDM Project
Adding user **axelahmer** to team Intellect Realm
Adding user **axelahmer** to team Inter-Dimensional Masochists (IDM)
Adding user **axelahmer** to team Logic Nexus
Adding user **axelahmer** to team Lorem Ipsum
Adding user **axelahmer** to team Mister World Wide
Adding user **axelahmer** to team Prolog nightmares again
Adding user **axelahmer** to team sajeevan
Adding user **axelahmer** to team Super awesome team name
Adding user **axelahmer** to team TRY

gh_pr_merge.py: bulk merge of PRs

This script can be used to merge PRs in a set of repos. This is useful when a PR have been opened in each student repo to implement updates in the project spec via GitHub Classroom sync feature (new in 2024).

For example, to merge PR with title Sync from the 40th repo in repo.csv:

$ python ./gh_pr_merge.py repos.csv -t ~/.ssh/keys/gh-token-ssardina.txt --title Sync --start 40 |& tee -a merge_pr.log

gh_pr_feedback_create.py: create Feedback PRs

Check which repos are missing an expected Feedback PR #1 from GitHub Classroom; and create them as needed. This may be needed because sometimes GH Classroom failed to create the PRs in some repos.

For example:

$ python ./gh_pr_create.py -t ~/.ssh/keys/gh-token-ssardina.txt repos.csv a7b3d7aee55d00d55ee29b8a505d17fc8283e9f8 |& tee pr_create.log

Notice that the script needs the list of repos to consider ()repos.csv) and the base sha to create the Feedback branch and corresponding PR. Because the repos were created by GH Classroom, the first commit should have the same sha than the original staff template.

gh_pr_check_merged_forced.py: check for merged PR and forced pushes

This script will check for PRs that have been merged and

$ python ../git-hw-submissions.git/gh_pr_check_merged_forced.py  -t ~/.ssh/keys/gh-token-ssardina.txt   repos.csv

It will leave two CSV files pr_merged.csv and pr_forced.csv with the corresponding repos' ids.

gh_pr_post_result.py: push comments to repo's PRs

This tool will push comments (e.g., homework feedback and results) to PRs in GH repositories. This may be useful to provide feedback and results to students after automarking. It requires:

  • A CSV file with the list of all relevant repos to process (e.g., student's projects).
  • A CSV file with the marking results (points, marks, comments, etc).
  • A folder with the automarking reports as text files.
  • A Python file defining two functions that process a row in the marking results:
    • check_submission: can be used to check if the row contains a legal/successful submission. It will return whether the row/submission needs to be skipped and a string message to be posted to the PR, if any (e.g., the reason why the submission was not marked and skipped).
    • report_feedback: produce the actual feedback text to be posted in the PR.
    • FEEDBACK_MESSAGE: message to post after a report.
    • get_repo() [OPTIONAL]: returns a list of repos to process.

Now push all feedback to their pull requests from fist row (1) to row 5:

$ python ./gh_pr_post_result.py repos.csv marking.csv feedback_p2.py reports  -t ~/.ssh/keys/gh-token-ssardina.txt

It is best to use --dry-run first to test it.

Check repor builder examples:

  • gh_pr_post_result_example_marking.py: build a full message for an assignment result.
  • gh_pr_post_result_example_message.py: simple message to post to some repos.

gh_pr_post_comment.py: push a message to PRs

$ python ../tools/git-hw-submissions.git/gh_pr_post_comment.py -t ~/.ssh/keys/gh-token-ssardina.txt repos.csv message_pr.py

File message_pr.py defines constant MESSAGE with placement ghu fr GH username, and function get_repo() which is a list of the relevant repo to posts (None if all).

gh_workflow.py: run automarking workflow

This script can do several operations on workflows:

  1. Start a run of a workflow (create a dispatch) in the repository of students. This is usually an automarking workflow that is connected and reports to GitHub Classroom.
    • In this way, we can decide when the workflow should run, rather than in each push (which will consume all the budget quickly!)
  2. Extract the URL to the HTML page of a job for a run of a workshop. This URL would be the automarker report in the repo (showing the automarking process and table of results).
    • Note this URL is still accessible even if the actions are disabled.
  3. Delete workflow runs.

Examples:

# start dispatching workflows with name "Autograding" on the last commit before a date
$ python ../../tools/git-hw-submissions.git/gh_workflow.py -t ~/.ssh/keys/gh-token-ssardina.txt --name Autograding --until 2025-04-08T12:00 --run-name "Automarking up April 8 12pm" -- start repos.csv |& tee -a autograde-2025-04-08T1200.log

# get all the HTML URL to workflow job reports
$ python ../../tools/git-hw-submissions.git/gh_workflow.py -t ~/.ssh/keys/gh-token-ssardina.txt --name Autograding --run-name "Autograding Test" -repos baoly19,anurag060197,minhphamhuy -- jobs repos.csv                    

# delete all worflow runs after April 8, 2025 - 12pm
$ python gh_workflow.py -t ~/.ssh/keys/gh-token-ssardina.txt --name Autograding --until 2025-04-08T12:00 --repos ssardina -- delete repos.csv

gh_commit_after.py: get commits after a date

To get the commits after a date (and the one just before):

$ python ../../tools/git-hw-submissions.git/gh_commits_after.py -t /home/ssardina/.ssh/keys/gh-token-ssardina.txt  --ignore ssardina --since 2025-04-08T12:00 --repos minhphamhuy ssardina DeltaEchoVictor101  -- repos.csv

This was used to revert back to a previous commit before a deadline when the student has (illegally) push more changes after a deadline.

ghc_build_reporter.py: build YAML classroom reporter

Builds the section for the classroom-resources/autograding-grading-reporter@v1 runner from the definition of the tests. It is too cumbersome to do it manually! 😉

    - name: Autograding Reporter
      uses: classroom-resources/autograding-grading-reporter@v1
      env:
        LIVE_RESULTS: "${{steps.live.outputs.result}}"
        MESSI_RESULTS: "${{steps.messi.outputs.result}}"
        MAP-SOUND_RESULTS: "${{steps.map-sound.outputs.result}}"
        MAP-OPTIMAL_RESULTS: "${{steps.map-optimal.outputs.result}}"
      with:
        runners: live,messi,map-sound,map-optimal

Also reports the total marks in the automarking.

An example of a run:

$ python ../../tools/git-hw-submissions.git/ghc_build_reporter.py workshop-4-ssardina.git/.github/workflows/classroom.yml
Total marks:  100                                                                                                     │* 2732b6e - (HEAD -> main, origin/main, origin/HEAD) Much better automarking; run-name and sha inputs (18 hours ago) <
                                                                                                                      │ssardina>
- name: Autograding Reporter                                                                                          │* 7f89ba2 - solving diet puzzle (4 days ago) <Dev Bakshi>
  uses: classroom-resources/autograding-grading-reporter@v1                                                           │* 77e99c8 - implementing class scheduling (4 days ago) <Dev Bakshi>
  env:                                                                                                                │* fa68569 - using set for predicates (4 days ago) <Dev Bakshi>
        SHOP_4_RESULTS: ${{steps.shop_4.outputs.result}}                                                              │* 1c283a1 - adding shop for iten predicate (4 days ago) <Dev Bakshi>
        SHOPS_FOR_ITEM_2_RESULTS: ${{steps.shops_for_item_2.outputs.result}}                                          │* b81a363 - add deadline (3 weeks ago) <github-classroom[bot]>
        SHOPS_FOR_ITEMS_2_RESULTS: ${{steps.shops_for_items_2.outputs.result}}                                        │* 726a5f8 - Setting up GitHub Classroom Feedback (3 weeks ago) <github-classroom[bot]>
        INTERSECTION_3_RESULTS: ${{steps.intersection_3.outputs.result}}                                              │* 4ec8cc2 - (origin/feedback) GitHub Classroom Feedback (3 weeks ago) <github-classroom[bot]>
        DIFF_3_RESULTS: ${{steps.diff_3.outputs.result}}                                                              │* 1c26843 - Initial release IDM25 (3 weeks ago) <ssardina>
        UNION_3_RESULTS: ${{steps.union_3.outputs.result}}                                                            │❯ gitlog
        WHERE-LIVE_RESULTS: ${{steps.where-live.outputs.result}}                                                      │* b98e5ff - (HEAD -> main, origin/main, origin/HEAD) Revert "adding shop for iten predicate" (6 seconds ago) <ssardina
        MAP_COLORING_RESULTS: ${{steps.map_coloring.outputs.result}}                                                  │>
        CLASS_SCHEDULING_RESULTS: ${{steps.class_scheduling.outputs.result}}                                          │* 2732b6e - Much better automarking; run-name and sha inputs (19 hours ago) <ssardina>
        DIET_RESULTS: ${{steps.diet.outputs.result}}                                                                  │* 7f89ba2 - solving diet puzzle (4 days ago) <Dev Bakshi>
        DIET-Q1_RESULTS: ${{steps.diet-q1.outputs.result}}                                                            │* 77e99c8 - implementing class scheduling (4 days ago) <Dev Bakshi>
        DIET-Q2_RESULTS: ${{steps.diet-q2.outputs.result}}                                                            │* fa68569 - using set for predicates (4 days ago) <Dev Bakshi>
        DIET-Q3_RESULTS: ${{steps.diet-q3.outputs.result}}                                                            │* 1c283a1 - adding shop for iten predicate (4 days ago) <Dev Bakshi>
  with:                                                                                                               │* b81a363 - add deadline (3 weeks ago) <github-classroom[bot]>
    runners: shop_4,shops_for_item_2,shops_for_items_2,intersection_3,diff_3,union_3,where-live,map_coloring,class_sch│* 726a5f8 - Setting up GitHub Classroom Feedback (3 weeks ago) <github-classroom[bot]>
eduling,diet,diet-q1,diet-q2,diet-q3          

Then one can copy and paste this in the classroom.yaml workflow file.

gh_user_access.py: get repos and accesses of org

This script allows to get all repos in an organization and its contributors with their acceses (read/write/admin).

$ python ../tools/git-hw-submissions.git/gh_user_access.py -t ~/.ssh/keys/gh-token-ssardina.txt list RMIT-COSC2780-2973-IDM25 ssardina -s 1 -e 10 |& tee -a 2025.04.13.repo-org.log

This could be good to later inspect and handle users who have dropped the course (may want to remove their access to the repos).

Git Tools

These tools use GitPython module to have Git API in Python.

We can run interactively first at development time; for example:

>>> import util
>>> g = util.open_gitHub(token_file="/home/ssardina/.ssh/keys/gh-token-ssardina.txt")
>>> repo = g.get_repo("RMIT-COSC2780-2973-IDM25/workshop-5-ssardina")
>>> ws = repo.get_workflows()
>>> ws[0].create_dispatch(ref="main")

git_clone_submissions.py: batch git cloning

This tool clones a set of student/team repositories listed in a CSV file at a given tagged commit. The CVS file should contain the team name (under column name TEAM) and a GIT ssh link (under column name GIT-URL).

If a repository already exists, it will be updated automatically:

  • if the tag changed to a different commit, the new commit will be pulled;
  • if the repo does not have the tag anymore (the student has withdraw the submission), the local copy will be removed from disk.

At the end, the script produces a CSV file with the information of each repo successfully cloned, including commit id (commit), time of the commit (submitted_at), and time of the tagging (tagged_at).

For example, to clone Project 0 at commit with tag "final" using the database of repos repos.csv:

$ python ./git_clone_submissions.py --file-timestamps timestamps.csv repos.csv final submissions/ &| tee clone.log

All repos will be cloned within folder submissions/ and the file timestamps.csv will contain the timestamps and commits of each repo cloned successfully. The file will contain the date of the commit linked to the tag. If the tag is an annotated tag (and not just lightweight tag), it will also include the date tagged (otherwise they will be assumed the same); see annotated vs lightweight tags.

To just clone the last commit in the master branch, use main as the tag.

The timezone used is defined by constant TIMEZONE in the script (defaults to Australia/Melbourne time zone).

git_batch_commit.sh: bulk commit and push to repos

This script allows to commit and push changes to a collection of repos that already exist in a folder. This is useful to make edits to students' repos after they have been created.

git_revert.py: revert commits done late

Sometime we want to revert back to some previous commit, for example, if the student has done late work which has already been autograder by the workflow.

$ python ../../tools/git-hw-submissions.git/git_revert.py submissions/deltaechovictor101/ b81a363 --keep .github

Note

Generally you get the commit to go back by getting the last commit until a certain date (deadline). You can use script gh_commits_after.py to get that for a collection of repos.

Google Scripts

To access Google Workspaces resources via the Google API, one needs to enable the API access and get a proper authentication credentials. Different APIs are provided for teh different resources (drive, gmail, sheet, etc.).

For authentication to Google Workspaces one needs the the application configurations file from APIs Console. Check the Google Sheet API setup process to obtain a client_secret.json (same as credentials.json) file for your project. PyDrive2 also explains how to get the file here.

All access to Google API requires authentication; usually the workflow is as follows:

  • Go to the Google API Access Panel.
  • Create a project.
  • Create an OAuth consent screen.
  • Create credentials of type "OAuth Client ID".
  • Download the JSON file of such credentials and name it client_secrets.json
  • Place the file in the same directory as the scripts.

gg_get_worksheet.py: download Google Sheet worksheet as CSV file

This script allows to download a Google Sheet from the cloud. It relies on the gsheets package to access the Google Sheet API.

The sheet to download is often a marking sheet. For example, to get the sheet MARKING from Google Spreadsheet with id 1kX-fa3_DMNDQROUr1Y-cG89UksTUUqlYdrNcV1yN6NA:

$ python ./gg_get_worksheet.py 1kX-fa3_DMNDQROUr1Y-cG89UksTUUqlYdrNcV1yN6NA MARKING -c ~/.ssh/keys/credentials.json -o marking.csv

The credentials.json was obtained via the Google Sheet API setup process. The first time that is used, a permission will be required. A certificate file storage.json will be left then that will allow direct access from there on (until certificate expires).

The authentication will be done via console. Use --webserver to open an actual browser.

gg_sheet_submissions.py: download submissions from Google Sheets

This script can process Google Sheets produced by Google Forms, and download links to uploaded files in each submission. Files will be placed in folders identifying each submission, for example with the student number or email associated to the submission.

For example, to download the files recorded in column G in the worksheet W4-resolution of a Google Sheet with id 1ogCRbx..., and save each in a folder submission/XXXX where XXXX is the student number recorded in column C:

$ python ./gg_sheet_submissions.py 1ogCRbxB3seVhoqhD7iBmVZ8EdpvGEB94oLowDIs5s2g W4-resolution tea --output test/submissions --file-name test.cnf --column-id C --column-file G

gg_drive_download.py: download files in Drive folder

This tool allows downloading file submissions in a Google Drive folder, usually submitted via a Google Form by students. It uses PyDrive2 package for more abstract access to the Google Drive API.

$ python ./gg_drive_download.py 1mttf61NwuFNY25idwWw5AKzV3tQQbwlMp980i-9vnkK3PIV4o7ZOtykWvjM-VLqmHuYJ0jX4 -c ~/.ssh/keys/credentials.json --output test/submissions --file-name submissions.cnf

This script is a new and simpler version of the one in this repo.

Useful shell commands

Once all git repos have been cloned in git-submissions/, one can build zip files from the submissions into directory zip-submissions/ as follows:

for d in git-submissions-p2/*; do echo "============> Processing ${d}" ; zip -q -j "./zip-submissions-p2/`basename "$d.zip"`" "${d}"/p2-multiagent/* ;done

or for the final CTF project:

for d in git-submissions-p4/*; do echo "============> Processing ${d}" ; zip -q -j "./zip-submissions-p4/`basename "$d.zip"`" "${d}"/pacman-contest/* ;done

To count the number of commits between dates:

git log --after="2018-03-26T00:00:00+11:00" --before="2018-03-28T00:00:00+11:00" | grep "Date:" | wc -l

To copy just the new zip files:

rsync  -avt --ignore-existing  zip-submissions-p4/*.zip AI18-assessments/project-4/zip-submissions/

Contributors