Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Azure Storage Sample Automation #39944

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

Yionse
Copy link
Member

@Yionse Yionse commented Mar 5, 2025

By changing the current Storage samples, can enable them to run automatically in the pipeline.

For the latest pipeline results: https://dev.azure.com/azure-sdk/internal/_build/results?buildId=4713055&view=results

The supported packages are as follows:

  • azure-storage-blob
  • azure-storage-file-datalake
  • azure-storage-file-share
  • azure-storage-queue

The specific modifications are as follows:

  • Skipped azure-storage-blob/blob_samples_proxy_configuration.py because it uses a proxy and is not suitable for running on pipelines.

  • Change the environment variable name in samples to be consistent with that exported in test-resources.json.

  • Change the path of the local files used in the samples to an absolute address.
    Update

    SOURCE_FOLDER = "./sample-blobs/"
    

    to

    current_dir = os.path.dirname(os.path.abspath(__file__))
    SOURCE_FOLDER = os.path.join(current_dir, "./sample-blobs/")
    
  • Comment out the following code snippet in blob_samples_container_access_policy.py, as setting the allowBlobPublicAccess attribute in test-resources.json is unstable and may not always take effect.

    container_client.set_container_access_policy(signed_identifiers=identifiers, public_access=public_access)
    
  • Create required resources and add output in test-resources.json.

@xiangyan99 for notification.

@github-actions github-actions bot added the Storage Storage Service (Queues, Blobs, Files) label Mar 5, 2025
@Yionse Yionse force-pushed the feat/automation_storage branch 2 times, most recently from 000080f to 6e560df Compare March 5, 2025 03:23
@azure-sdk
Copy link
Collaborator

API change check

API changes are not detected in this pull request.

@Yionse Yionse force-pushed the feat/automation_storage branch 3 times, most recently from ecb648c to e3c413c Compare March 6, 2025 02:53
@Yionse Yionse force-pushed the feat/automation_storage branch 2 times, most recently from d79c18d to 91e72c1 Compare March 7, 2025 06:47
@Yionse Yionse force-pushed the feat/automation_storage branch from 4cae2c5 to 0a3561d Compare March 14, 2025 02:10
@Yionse Yionse marked this pull request as ready for review March 14, 2025 09:25
@Copilot Copilot bot review requested due to automatic review settings March 14, 2025 09:25
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR automates Azure Storage samples for pipeline execution by standardizing environment variable names, adjusting local file paths to absolute paths, and commenting out unstable code sections.

  • Environment variables have been renamed from AZURE_STORAGE_* to STORAGE_* in most samples.
  • File paths have been converted from relative to absolute paths in several samples.
  • Unstable container access policy code has been commented out in async samples.

Reviewed Changes

Copilot reviewed 67 out of 67 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
sdk/storage/azure-storage-blob/samples/blob_samples_client_side_encryption_keyvault.py Updated env var names and added key generation logic for client-side encryption.
sdk/storage/azure-storage-blob/samples/blob_samples_enumerate_blobs_async.py Renamed environment variable and updated container name for async enumeration.
sdk/storage/azure-storage-blob/samples/blob_samples_enumerate_blobs.py Renamed environment variable and updated container name for blob enumeration.
sdk/storage/azure-storage-blob/samples/blob_samples_directory_interface*.py Renamed environment variable and hardcoded container name for directory interface samples.
scripts/devops_tasks/test_run_samples.py Added proxy configuration sample to the list of skipped samples.
sdk/storage/azure-storage-blob/samples/blob_samples_container_access_policy_async.py Renamed environment variable and commented out unstable access policy setting.
sdk/storage/azure-storage-blob/samples/blob_samples_copy_blob*.py Updated env var names and container names; added container creation call.
sdk/storage/azure-storage-blob/samples/blob_samples_containers*.py Renamed environment variable, updated file path for source file, and changed container names.
sdk/storage/azure-storage-blob/samples/blob_samples_authentication* Updated environment variable names for connection string and account keys.
sdk/storage/azure-storage-blob/samples/blob_samples_client_side_encryption.py Updated environment variable name in sample header and connection string retrieval.
sdk/storage/azure-storage-blob/samples/blob_samples_common*.py Renamed env var and updated local file path for sample source file.
Comments suppressed due to low confidence (2)

sdk/storage/azure-storage-blob/samples/blob_samples_common.py:32

  • The environment variable used here is 'STORAGE_CONNECTION_STRING_SOFT', which is inconsistent with other samples that reference 'STORAGE_CONNECTION_STRING'. Please unify the variable naming.
connection_string = os.getenv("STORAGE_CONNECTION_STRING_SOFT")

sdk/storage/azure-storage-blob/samples/blob_samples_common_async.py:31

  • The use of 'STORAGE_CONNECTION_STRING_SOFT' here is inconsistent with other parts of the code that expect 'STORAGE_CONNECTION_STRING'. Consider using the same environment variable name across all samples.
connection_string = os.getenv("STORAGE_CONNECTION_STRING_SOFT")

@ChenxiJiang333 ChenxiJiang333 force-pushed the feat/automation_storage branch from 2767729 to b6292f7 Compare March 18, 2025 04:25
@@ -64,7 +63,7 @@ async def get_and_set_container_access_policy():
public_access = PublicAccess.CONTAINER

# Set the access policy on the container
await container_client.set_container_access_policy(signed_identifiers=identifiers, public_access=public_access)
# await container_client.set_container_access_policy(signed_identifiers=identifiers, public_access=public_access)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? Is this is not useful, remove. If this useful but can't be executed, then put extra context to explain to the customer why he may want to uncomment it

Copy link
Member

@ChenxiJiang333 ChenxiJiang333 Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmazuel The storage account we created using arm template is somehow unstable (we tried to set its property allowblobpublicaccess as true, yet sometimes it works, sometimes it doesn't follow the template), that would cause failure to this sample. The sample itself does not have any issue, so is that ok to not make any changes on it but just to skip it in the pipeline?
f5a69f4#diff-f960dc3f6bb80a3355f294b17a15868dc7721a421cee036bf9f8cf2f4136bc58R158

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChenxiJiang333 we have public blob disabled by policy in our test subscriptions which could be what you are seeing, unless you use our TME based subscription for testing: https://dev.azure.com/azure-sdk/internal/_wiki/wikis/internal.wiki/206/Subscription-and-Tenant-Usage?anchor=azure-sdk-test-resources-tme

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe when they ran the sample in test pipeline, it was running under TME? @benbp

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blob Public Access is off by default for new accounts and is not turned on explicitly in our ARM template accounts because it is not allowed in Storage's test subscription. (We use TME for Live tests where it would be permitted but not for local testing).

I am okay if we leave the sample as-is and just ignore the samples that rely on public access as we do not want to enable that in our ARM template.

@v-xuto v-xuto requested review from lmazuel and mccoyp March 24, 2025 05:31
@ChenxiJiang333 ChenxiJiang333 force-pushed the feat/automation_storage branch 2 times, most recently from e8d8727 to ff171e3 Compare March 24, 2025 07:25
@ChenxiJiang333 ChenxiJiang333 force-pushed the feat/automation_storage branch from ff171e3 to f5a69f4 Compare March 24, 2025 07:28
"blob_samples_proxy_configuration.py",
"blob_samples_container_access_policy.py",
"blob_samples_container_access_policy_async.py"
],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add some comments why we ignore these samples?

e.g. are they flaky code?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proxy likely cannot be done in the pipeline, container access policies can be flakey but curious on what specifically is going wrong.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this is probably because of container public access? I am okay to ignore those.

Copy link
Member

@ChenxiJiang333 ChenxiJiang333 Apr 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @xiangyan99, thanks for @jalauzon-msft's explanation, proxy cannot be done in the pipeline and somehow access policy samples are flakey possibly because of subscription policy. Should we put these comments into the code?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should have anything commented in code. I think we should try and understand the flakiness and maybe address it, remove the sample entirely, or leave it disabled here.

Copy link
Member

@xiangyan99 xiangyan99 Apr 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment in the PR is good enough. :)

"dependsOn": [
"[concat('Microsoft.Storage/storageAccounts/', variables('primaryAccountName'))]"
]
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to create containers in the ARM template? Could the samples create any containers they need to use so we can remove these?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have removed them from template and create containers in the samples. Thanks!

aiohttp>=3.0
azure-keyvault
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not add KeyVault as a dev dependency of Blob. Could we just ignore the one sample that uses KeyVault instead? That sample is for client-side encryption which is not a scenario many customers use and we do not encourage the use of it at all. I think it would be okay if that sample is ignored.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @xiangyan99, is that ok to ignore this sample?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Let's remove it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiangyan99 should we be clear in the sample that we don't encourage this scenario?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather delete it than leaving the code and adding comments this is discouraged.

Leaving it may confuse both customers and AI.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in the email thread, I would like to keep the sample, just disable it in test_run_samples.py so we can remove this dependency.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed with Jacob offline. @ChenxiJiang333 let's remove the keyvault dependency and disable the samples that rely on keyvault. (Keep the sample but not run them in the pipeline)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, updated the code.

KEYVAULT_URL = 'KEYVAULT_URL'
AZURE_CLIENT_ID = 'ACTIVE_DIRECTORY_APPLICATION_ID'
AZURE_CLIENT_SECRET = 'ACTIVE_DIRECTORY_APPLICATION_SECRET'
AZURE_TENANT_ID = 'ACTIVE_DIRECTORY_TENANT_ID'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep this sample, can we remove AZURE_CLIENT_ID, AZURE_CLIENT_SECRET, and AZURE_TENANT_ID? I don't think they are used in the sample at all. I think we missed cleaning these up a while ago.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has removed these unused codes in the sample, thanks!

Copy link
Member

@jalauzon-msft jalauzon-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! Thanks for your work on this.

I will start a run of the test pipeline on this PR and assuming that looks good, this should be good to go.

@jalauzon-msft
Copy link
Member

/azp run python - storage - tests

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Storage Storage Service (Queues, Blobs, Files)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants