Skip to content

Conversation

@shahar1
Copy link
Contributor

@shahar1 shahar1 commented Feb 11, 2023


replaces: #28142
closes: #29552
^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@boring-cyborg boring-cyborg bot added area:providers provider:google Google (including GCP) related issues labels Feb 11, 2023
@eladkal eladkal changed the title Add folder_id param to upload_file GoogleDriveHook: Add folder_id param to upload_file Feb 11, 2023
@eladkal
Copy link
Contributor

eladkal commented Feb 15, 2023

@shahar1 can you check #29552 would be good to add the parameter to LocalFilesystemToGoogleDriveOperator

@aru-trackunit
Copy link
Contributor

aru-trackunit commented Feb 15, 2023

This comment is in regards to this issue:
#29552

@shahar1 I pulled your code and only added folder_id assignments

    file_upload = LocalFilesystemToGoogleDriveOperator(
        task_id="uploading-file"
        gcp_conn_id="gc-conn-id",
        drive_folder="Folder A",
        folder_id="Folder_A_ID",
        local_paths=[f"/tmp/file_to_upload.csv"]
    )

The good news is that file was uploaded to a shared drive, however path's are not matching what I meant.
I wanted the file to be uploaded to /Folder A/file.csv, but it resulted in /Folder_A_ID/Folder A/file.csv

That was the case when name of the folder_id was equal to drive_folder. But maybe I expected that because I knew what the behaviour was previously.

In order to get what I wanted I had to run this:

    file_upload = LocalFilesystemToGoogleDriveOperator(
        task_id="uploading-file"
        gcp_conn_id="gc-conn-id",
        drive_folder="/",
        folder_id="Folder_A_ID",
        local_paths=[f"/tmp/file_to_upload.csv"]
    )

so it solves the problem, but then logs message is quite misleading.

[2023-02-15, 11:25:52 UTC] {drive.py:238} INFO - File /tmp/file_to_upload.csv uploaded to gdrive:///file_to_upload.csv.

Would be nice to get folder_id real path. - Not sure if possible

In case this goes forward the documentation would have to be improved & maybe rename folder_id to parent_folder_id in this case and drive_folder to path. But if someone doesn't fill folder_id then it gets cumbersome.

When I run:

    file_upload = LocalFilesystemToGoogleDriveOperator(
        task_id="uploading-file"
        gcp_conn_id="gc-conn-id",
        drive_folder="Folder A",
        folder_id="root", #I could skip that param, but added to be clear.
        local_paths=[f"/tmp/test-file-csv.csv"]
    )

then it's placed on the user's own drive under Folder A - it's a correct behaviour

@eladkal
But yeah! Overall it's almost there!

lucasfcnunes and others added 2 commits February 17, 2023 18:22
@shahar1 shahar1 force-pushed the folder_id-when-upload_file branch from e0a0b09 to 1bccfbd Compare February 17, 2023 16:22
… folder_id-when-upload_file

# Conflicts:
#	airflow/providers/google/suite/hooks/drive.py
@shahar1 shahar1 force-pushed the folder_id-when-upload_file branch from 02cd44d to 0cb9ea6 Compare February 17, 2023 16:50
@shahar1
Copy link
Contributor Author

shahar1 commented Feb 17, 2023

This comment is in regards to this issue: #29552

@shahar1 I pulled your code and only added folder_id assignments

    file_upload = LocalFilesystemToGoogleDriveOperator(
        task_id="uploading-file"
        gcp_conn_id="gc-conn-id",
        drive_folder="Folder A",
        folder_id="Folder_A_ID",
        local_paths=[f"/tmp/file_to_upload.csv"]
    )

The good news is that file was uploaded to a shared drive, however path's are not matching what I meant. I wanted the file to be uploaded to /Folder A/file.csv, but it resulted in /Folder_A_ID/Folder A/file.csv

That was the case when name of the folder_id was equal to drive_folder. But maybe I expected that because I knew what the behaviour was previously.

In order to get what I wanted I had to run this:

    file_upload = LocalFilesystemToGoogleDriveOperator(
        task_id="uploading-file"
        gcp_conn_id="gc-conn-id",
        drive_folder="/",
        folder_id="Folder_A_ID",
        local_paths=[f"/tmp/file_to_upload.csv"]
    )

so it solves the problem, but then logs message is quite misleading.

[2023-02-15, 11:25:52 UTC] {drive.py:238} INFO - File /tmp/file_to_upload.csv uploaded to gdrive:///file_to_upload.csv.

Would be nice to get folder_id real path. - Not sure if possible

In case this goes forward the documentation would have to be improved & maybe rename folder_id to parent_folder_id in this case and drive_folder to path. But if someone doesn't fill folder_id then it gets cumbersome.

When I run:

    file_upload = LocalFilesystemToGoogleDriveOperator(
        task_id="uploading-file"
        gcp_conn_id="gc-conn-id",
        drive_folder="Folder A",
        folder_id="root", #I could skip that param, but added to be clear.
        local_paths=[f"/tmp/test-file-csv.csv"]
    )

then it's placed on the user's own drive under Folder A - it's a correct behaviour

@eladkal But yeah! Overall it's almost there!

Hey, thanks for the suggestion - I just added the folder_id as a parameter of the operator with a default 'root' value.

@shahar1 shahar1 force-pushed the folder_id-when-upload_file branch 3 times, most recently from 088224f to 8a4aa24 Compare February 18, 2023 15:50
@shahar1 shahar1 force-pushed the folder_id-when-upload_file branch from 8a4aa24 to 83641f1 Compare February 18, 2023 18:47
@eladkal eladkal merged commit f37772a into apache:main Feb 18, 2023
@shahar1 shahar1 deleted the folder_id-when-upload_file branch June 12, 2024 13:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:providers provider:google Google (including GCP) related issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Google provider doesn't let uploading file to a shared drive

5 participants