Python Pandas extensions for pandas dataframes
import pandas_ext as px
By default, pandas can natively read from S3 but cannot write to S3.
px.read_csv
px.to_csv
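The internals of px.to_csv are not shown here. As a rough sketch of what writing a CSV to S3 involves, the snippet below splits an S3 URL into bucket and key and serializes rows to an in-memory buffer. The helper names `split_s3_url` and `rows_to_csv_bytes` and the bucket name are hypothetical, and the actual upload step is left out because it depends on the S3 client in use.

```python
import csv
import io
from urllib.parse import urlparse


def split_s3_url(url):
    """Split 's3://bucket/path/to/file.csv' into (bucket, key)."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URL: {url!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def rows_to_csv_bytes(header, rows):
    """Serialize rows to CSV in memory, ready to be used as an object body."""
    # With a DataFrame, df.to_csv(buffer) would fill the same kind of buffer.
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


# Hypothetical bucket and key, for illustration only.
bucket, key = split_s3_url("s3://example-bucket/reports/daily.csv")
body = rows_to_csv_bytes(["id", "name"], [[1, "a"], [2, "b"]])
```

Building the payload in memory avoids a temporary file, which is convenient in environments such as lambda functions.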
To write to xls:
pip install pandas_ext[xls]
To write xlsx:
pip install pandas_ext[xlsx]
By default, pandas does not read/write to Gdrive.
At the organization level, the gdrive lambda service must be installed and deployed for this to work.
You will have to create a service account that shares your G Apps domain with the following APIs enabled:
- Google Drive
- Google Sheets
From there you must download the p12 credentials file and reference it in your settings.yml when deploying the gdrive service.
Once that is complete, you must share the folder you want to read from or write to with the service account email you received from your administrator.
Locally, for client access, you must set the GDRIVE_URL and GDRIVE_KEY environment variables in your project in order to talk to the gdrive lambda service.
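A minimal sketch of checking these two variables before making any Gdrive calls, so that a missing setting fails early with a clear message. The helper `gdrive_settings` is hypothetical and not part of pandas_ext; the URL and key values are placeholders.

```python
import os


def gdrive_settings(environ=os.environ):
    """Return (url, key) from the environment, failing early if either is unset."""
    required = ("GDRIVE_URL", "GDRIVE_KEY")
    missing = [name for name in required if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return environ["GDRIVE_URL"], environ["GDRIVE_KEY"]


# Placeholder values passed explicitly; in a real project, call
# gdrive_settings() with no arguments to read the process environment.
url, key = gdrive_settings(
    {"GDRIVE_URL": "https://gdrive.example.invalid", "GDRIVE_KEY": "secret"}
)
```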
px.read_gdrive
px.to_gdrive
By default, older versions of pandas do not read/write Parquet. Native support is available as of pandas 0.24, and these methods will eventually be updated to use it while still allowing writes to S3.
px.read_parquet
px.to_parquet
to_spectrum is unique to pandas_ext.
px.to_spectrum
Salesforce methods are unique to pandas_ext.
px.read_sfdc
px.sfdc_metadata
px.patch_sfdc
px.async_patch_sfdc
px.read_sql
px.list_backends
## XML
Pandas doesn't natively support writing to XML format.
px.to_xml
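To illustrate the kind of conversion involved, here is a sketch that turns row records into an XML string using only the standard library. It is not the implementation of px.to_xml; the function `records_to_xml` and the tag names `data` and `row` are assumptions made for this example. The input has the shape produced by `df.to_dict("records")`.

```python
import xml.etree.ElementTree as ET


def records_to_xml(records, root_tag="data", row_tag="row"):
    """Build one <row> element per record, with one child element per column."""
    root = ET.Element(root_tag)
    for record in records:
        row = ET.SubElement(root, row_tag)
        for column, value in record.items():
            # Column names are used as tag names, so they must be valid XML names.
            cell = ET.SubElement(row, str(column))
            if value is not None:
                cell.text = str(value)
    return ET.tostring(root, encoding="unicode")


xml_text = records_to_xml([{"id": 1, "name": "a"}, {"id": 2, "name": "b"}])
```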