This notebook describes the function used to fetch GitHub repositories that match a keyword and fall within a given time frame, counted from the day of fetching.
- Generate your own GitHub token: Creating a personal access token
- Save your token as an environment variable named 'GITHUBTOKEN': Configuring Environment Variables
- Make sure the following Python packages are available: requests, math, datetime, dateutil, csv, pandas, json, os, time. Of these, math, datetime, csv, json, os, and time ship with Python's standard library; requests, pandas, and dateutil (distributed as python-dateutil) must be installed separately. Installation instructions can be found on the Python website
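As a minimal sketch of how the token saved above might be picked up from Python, the helper below reads the 'GITHUBTOKEN' environment variable and builds request headers from it. The function name `github_headers` is hypothetical and is not part of GRF; GRF's own code may read the token differently.

```python
import os


def github_headers(env_var="GITHUBTOKEN"):
    """Build HTTP headers for the GitHub REST API from an environment variable.

    `github_headers` is a hypothetical helper for illustration only.
    """
    token = os.environ.get(env_var)
    if not token:
        # Fail early with a clear message instead of sending unauthenticated requests.
        raise RuntimeError(f"Environment variable {env_var!r} is not set")
    return {
        "Authorization": f"token {token}",
        "Accept": "application/vnd.github+json",
    }
```

Calling `github_headers()` before any fetch is a quick way to confirm that the variable is visible to the notebook's kernel; a kernel started before the variable was defined usually needs a restart.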
GRF collects data in four separate steps, each carried out by its own function: find_repo, export_repo_list, save_column, and save_dt. How these functions work together is illustrated in the following diagram:
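Independently of the diagram, the sketch below shows one plausible way a keyword and a time frame could be turned into parameters for GitHub's repository search endpoint, using the `created:>=` search qualifier. This is an assumption about how such a search can be expressed, not GRF's actual implementation: `build_search_params` and its `days_back` argument are hypothetical and do not correspond to the arguments of `grf`.

```python
from datetime import date, timedelta


def build_search_params(keyword, days_back, today=None, per_page=100, page=1):
    """Return query parameters for a repository search (hypothetical helper).

    Assumes the time frame means "repositories created in the last
    `days_back` days, counted back from `today`".
    """
    today = today or date.today()
    since = today - timedelta(days=days_back)
    return {
        # Keyword plus a creation-date qualifier, e.g. "python created:>=2021-10-11"
        "q": f"{keyword} created:>={since.isoformat()}",
        "per_page": per_page,  # the search API returns at most 100 items per page
        "page": page,
    }


params = build_search_params("python", 3, today=date(2021, 10, 14))
print(params["q"])
```

The resulting dictionary could be passed as `params=` to `requests.get("https://api.github.com/search/repositories", ...)` together with authentication headers. Keeping query construction separate from the network call makes the date arithmetic easy to check without touching the API.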
A sample dataset was obtained by using the following command:
grf("python", 3, 8)
- A detailed datasheet can be found here: Datasheet for GRF sample dataset
- Below is a preview of the final dataset:
import pandas as pd
pd.read_csv("data/dt.csv", delimiter=";", nrows=10)
| | id | name | url | language | created | stars | watch | forks | readme |
---|---|---|---|---|---|---|---|---|---|
0 | 416977797 | AI_Project | https://github.com/pbl4team/AI_Project | Python | 2021-10-14T03:37:01Z | 0 | 0 | 4 | Project AI Systeam - Computer Vision with pyth... |
1 | 416995331 | python | https://github.com/Cam0411/python | Python | 2021-10-14T05:06:00Z | 0 | 0 | 0 | python |
2 | 416908634 | Python | https://github.com/psplendid61/Python | NaN | 2021-10-13T21:53:43Z | 0 | 0 | 0 | NaN |
3 | 416963376 | python | https://github.com/iAMSe/python | NaN | 2021-10-14T02:28:55Z | 0 | 0 | 0 | python |
4 | 416996896 | python | https://github.com/rakeshk67/python | NaN | 2021-10-14T05:13:42Z | 0 | 0 | 0 | python |
5 | 416961346 | python | https://github.com/colddie/python | NaN | 2021-10-14T02:19:55Z | 0 | 0 | 0 | NaN |
6 | 416990435 | Python | https://github.com/mahdidahmani/Python | Python | 2021-10-14T04:41:11Z | 0 | 0 | 0 | NaN |
7 | 416952467 | python | https://github.com/grace-th3/python | Python | 2021-10-14T01:38:42Z | 0 | 0 | 0 | NaN |
8 | 416928589 | Python | https://github.com/Cheung-man/Python | NaN | 2021-10-13T23:36:27Z | 0 | 0 | 0 | NaN |
9 | 416935834 | python | https://github.com/mygithuang/python | NaN | 2021-10-14T00:15:25Z | 0 | 0 | 0 | python mygithuang |
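To illustrate how the columns in this preview might be summarized, here is a small sketch that uses only the standard library. It assumes the CSV file is semicolon-delimited with a header row matching the column names shown above; the exact file layout is an assumption, and `summarize` is a hypothetical helper. The in-memory sample reuses two rows from the preview so the example runs without the data file.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Two rows taken from the preview above; empty fields stand in for NaN.
SAMPLE = (
    "id;name;url;language;created;stars;watch;forks;readme\n"
    "416995331;python;https://github.com/Cam0411/python;Python;2021-10-14T05:06:00Z;0;0;0;python\n"
    "416908634;Python;https://github.com/psplendid61/Python;;2021-10-13T21:53:43Z;0;0;0;\n"
)


def summarize(csv_file):
    """Count rows per language and find the creation-date range (hypothetical helper)."""
    languages = Counter()
    created = []
    for row in csv.DictReader(csv_file, delimiter=";"):
        # Repositories without a detected language have an empty field.
        languages[row["language"] or "unknown"] += 1
        created.append(datetime.strptime(row["created"], "%Y-%m-%dT%H:%M:%SZ"))
    return {
        "rows": len(created),
        "languages": dict(languages),
        "earliest": min(created) if created else None,
        "latest": max(created) if created else None,
    }


summary = summarize(io.StringIO(SAMPLE))
print(summary)
```

To run the same summary on the real file, an open file handle can be passed instead, for example `open("data/dt.csv", newline="", encoding="utf-8")`, assuming the file uses that encoding.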
Source code and detailed function documentation are available at: GRF Source code