Description
For the final stage of ETL is to commit changes to remote (Google Sheets), no method was found to work.
There are no exceptions, just a function call to remote
and no data state update.
For the save/commit to the remote (i..e the stateful data source): 3 attempts at three approaches to get the local dataset to either|
- INTEGRATE Overwrite using the specialist gspread_dataframe library as per https://gspread-dataframe.readthedocs.io/en/latest/#gspread_dataframe.set_with_dataframe & PerplexityAI
https://www.perplexity.ai/search/2a256f73-47dc-4117-bbab-87e4c0a7cbe1?s=c
2 UPDATE: Overwrite using the basic gspread sheet.update as per
https://docs.gspread.org/en/latest/user-guide.html#using-gspread-with-pandas
- INJECTION: Inject a series data, which is the main data object for ETL, AND with support from Perplexity
https://www.perplexity.ai/search/33fb1a34-54aa-49d4-84aa-45b5d846eba8?s=c
[pycriteria] controller.py (Lines 2031-2032)
def save(self, saved: pd.DataFrame, series: pd.Series, index: int, action: str, debug: bool = False) -> None:
""" Saves the dataframe and commits it to the remote source"""
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2033-2041)
# 1. Prompt the user to save the updated DataFrame
if click.confirm("Are you ready to commit changes?"):
sheet: gspread.Worksheet = Controller.load_wsheet()
# 2. Check for Validation Client and Worksheet ID presence
if sheet.client is not None and \
isinstance(sheet.client, gspread.Client):
if sheet.id is not None and \
sheet.get_all_records():
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2042-2064)
# 3. Convert sheet to a target DataFrame
target: pd.DataFrame = \
DataController.load_dataframe_wsheet(sheet)
# 4. SAVE ATTEMPT 1: INTEGRATE a single record into the target
integratedframe: pd.DataFrame = self.integrate(
single=saved,
source=target,
index=index)
# 4. SAVE ATTEMPT 1b: switch based on TEST Saving mode
# s1 = Use integratedframe &
# gspread_dataframe & set_with_dataframe => set_remote
# This is not commment out code, it is annotation
if action == 's1' and debug is False:
saving = integratedframe.astype(str)
set_remote(worksheet=sheet,
dataframe=saving,
row=1,
col=1,
include_index=False,
include_column_header=True,
resize=False,
allow_formulas=False,
string_escaping='default')
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2065-2075)
# 5. SAVE ATTEMPT 2: Commit the updated DataFrame to remote
# Use the integratedframe as a overwrit: sheet.update
# Using https://docs.gspread.org/en/latest/user-guide.html#using-gspread-with-pandas
elif action == 's2' and debug is False:
sheet.update([saving.columns.values.tolist()] \
+ saving.values.tolist())
# Apparently, this is not working as documented.
# Expected type 'list' (matched generic type 'list[_T]')
# , got 'object' instead
pass
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2077-2084)
# 6. SAVE ATTEMPT 3: Inject the updated series into the remote
# source, via the series row and index parameters.
# BY matching on the Position column, primary key
elif action == 's3' and debug is True:
self.injection(series=series,
sheet=sheet,
row=index,
debug=debug)
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2110-2128)
@staticmethod
def integrate(single: pd.DataFrame,
source: pd.DataFrame,
index: int,
reset=True,
debug: bool = False) -> pd.DataFrame:
""" Merges the source and target DataFrames"""
# merge the single-row DataFrame into
# the source DataFrame at the same index
# https://www.perplexity.ai/search/2a256f73-47dc-4117-bbab-87e4c0a7cbe1?s=c
# drop any NaN values from the single-row DataFrame
single = single.dropna()
if debug is True:
rich.inspect(single)
return pd.concat([source.iloc[:index],
single,
source.iloc[index:]]).reset_index(drop=reset)
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2130-2136)
@staticmethod
def convertto(lt):
""" Converts the list of lists to a list of strings without nan vals
"""
return [[str(element) for element in sublist if str(element) != 'nan'] for sublist in lt]
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2089-2108)
@staticmethod
def injection(series: pd.Series, sheet, row: int, debug: bool = False) -> None:
""" Injects the updated series into the remote source, via the row"""
records = sheet.get_all_records()
row_id = None
for record in records:
if record['Position'] == series['Position']:
row_id = int(record[row])
break
rich.inspect(row_id)
# Update the row with the values from the series
if row_id is not None:
values = [series[key] for key in series.keys()]
if debug is True:
rich.inspect(values)
if debug is False:
sheet.update(str(row_id), values)
Created from JetBrains using CodeStream