A simple Python script that loads data from the GDELT dataset into an SAP HANA DB table.
- Add the path to the Python that is shipped with "SAP HANA Client" to the PATH variable.
- Copy the 3 files from the 'hdbcli' folder to the 'Lib' folder of the Python shipped with "SAP HANA Client".
- Copy the 2 files (pyhdbcli.pdb, pyhdbcli.pyd) from the 'hdbclient' folder to the same 'Lib' folder.
- Additional (
- The table structure is taken from the GDELT table definition.
- To create the table in SAP HANA, use the script 'gdelt_dailyupdates.hdbtable'.
- The "data" directory contains a Python script that fetches daily data updates (the interval can be specified) from the GDELT website, then stores and unzips them on your PC.
- Move to the 'data' subdirectory.
- Run the bat file, or run one of the following from the command line.
- For daily updates only:
python fetch_from_date -d "../zipped" -U -du "../unzipped"
- With a date range (from-date: option -F and a date in 'YYYYMMDD' format; to-date: option -T and a date in 'YYYYMMDD' format):
python fetch_from_date -d "../zipped" -U -du "../unzipped" -F 20140321
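The -F and -T options above take dates in 'YYYYMMDD' format. As a minimal sketch of what that format implies (this is not the fetch script's own code, and the function names `parse_yyyymmdd` and `daily_dates` are hypothetical), the helper below validates such a date and lists every day in an inclusive range:

```python
from datetime import datetime, timedelta


def parse_yyyymmdd(text):
    """Parse a 'YYYYMMDD' string into a date; raise ValueError if malformed."""
    # strptime alone accepts some shorter inputs, so check the shape first.
    if len(text) != 8 or not text.isdigit():
        raise ValueError("expected a date in 'YYYYMMDD' format: %r" % (text,))
    return datetime.strptime(text, "%Y%m%d").date()


def daily_dates(from_date, to_date):
    """Return every day from from_date to to_date (inclusive) as 'YYYYMMDD' strings."""
    start, end = parse_yyyymmdd(from_date), parse_yyyymmdd(to_date)
    if start > end:
        raise ValueError("from_date must not be later than to_date")
    days = (end - start).days
    return [(start + timedelta(days=i)).strftime("%Y%m%d") for i in range(days + 1)]
```

For example, `daily_dates("20140321", "20140323")` yields one entry per day, which is the granularity at which the daily updates are fetched.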
- Create file ''
- Copy and paste the code below, then insert your credentials
# Server
SERVER = '<server>'
PORT = <port>
# User Credentials
USER = '<user>'
PASSWORD = '<password>'
The <port> should be 3<instance number>15. For example, 30015 if the instance number is 00.
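The port rule above can be sketched in a few lines. The snippet below is only an illustration under stated assumptions: the helper names `hana_sql_port` and `connect_to_hana` are hypothetical, and the connection call assumes the `hdbcli` driver from "SAP HANA Client" is importable, as set up in the steps at the top.

```python
def hana_sql_port(instance_number):
    """Return the port 3<instance number>15 for a two-digit instance number."""
    instance = int(instance_number)
    if not 0 <= instance <= 99:
        raise ValueError("instance number must be between 00 and 99")
    return int("3{:02d}15".format(instance))


def connect_to_hana(server, port, user, password):
    """Open a connection using the hdbcli driver shipped with SAP HANA Client."""
    # Imported lazily so the port helper works even without the driver installed.
    from hdbcli import dbapi
    return dbapi.connect(address=server, port=port, user=user, password=password)
```

With the credentials file in place, a call such as `connect_to_hana(SERVER, PORT, USER, PASSWORD)` would use the values defined above; `hana_sql_port("00")` gives the port for instance 00.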
To run the main Python script on a Windows machine, you can use 'run.bat'. Note: all configuration steps must be completed before the script can be executed properly.
- [FIXED] Not all events from daily updates are parsed properly (some shift in data is possible);
- [FIXED] All fields (if generated from the SAP HANA .hdbtable) are generated with the 'NVARCHAR' data type;
- Code for downloading zip files with GDELT data was written by John Beieler ( and forked from (
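Regarding the first [FIXED] item above (shifted data in parsed events): GDELT event files are tab-delimited, and a generic way to catch a column shift before loading is to check the field count of each record. The sketch below is an illustration only, not the fix that was applied in this repository; `split_gdelt_line` is a hypothetical name, and the expected column count is passed in by the caller rather than assumed.

```python
def split_gdelt_line(line, expected_fields):
    """Split one tab-delimited record and verify its column count."""
    # Split on tabs only, so empty fields are kept and later columns do not shift.
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != expected_fields:
        raise ValueError(
            "expected %d fields, got %d" % (expected_fields, len(fields))
        )
    return fields
```

Splitting on the tab character alone (rather than on arbitrary whitespace) preserves empty values, so each remaining value stays in its own column.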