configure_azure_monitor() takes abnormally long time #34902

Open

Description

  • Package Name: azure-monitor-opentelemetry
  • Package Version: 1.3.0
  • Operating System: MacOS
  • Python Version: 3.10.13

Describe the bug

With the default configuration (no explicit settings), configure_azure_monitor() takes an abnormally long time to run: about 10 seconds.

To Reproduce

long.py:

import time
from azure.monitor.opentelemetry import configure_azure_monitor

start = time.monotonic()
configure_azure_monitor()
print(time.monotonic() - start)

APPLICATIONINSIGHTS_CONNECTION_STRING="..." python long.py

Expected behavior

Reasonable time to configure (< 1 s).

Additional context

Stepping through the call in a debugger, I found two places in the code that account for most of the delay. Both check whether the process is running in an Azure VM.

  1. Resource detection.

Location: https://github.com/open-telemetry/opentelemetry-python-contrib/blob/37aba928d45713842941c7efc992726a79ea7d8a/resource/opentelemetry-resource-detector-azure/src/opentelemetry/resource/detector/azure/vm.py#L77

The way the code gets there:

[image: call stack]

Then in https://github.com/open-telemetry/opentelemetry-python-contrib/blob/main/resource/opentelemetry-resource-detector-azure/src/opentelemetry/resource/detector/azure/vm.py:

[image]

  2. Statsbeat metrics.

Location:

request_url = "{0}?{1}&{2}".format(
    _AIMS_URI, _AIMS_API_VERSION, _AIMS_FORMAT)
response = requests.get(
    request_url, headers={"MetaData": "True"}, timeout=5.0)

Call stack:

[image]

In both cases the delay comes from requests to this endpoint:

http://169.254.169.254/metadata/instance/compute

although each place uses a different API version. The first request has a timeout of 4 seconds and the second a timeout of 5 seconds; together they account for almost the entire startup delay.
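A quick way to check whether a given machine will run into these timeouts is to probe the metadata address directly. The following is a minimal sketch, not code from either library: `probe_endpoint` is a hypothetical helper, and it only checks whether a TCP connection to port 80 can be opened within a timeout, which is a rougher test than the HTTP requests described above.

```python
import socket
import time

# Link-local address of the metadata endpoint quoted above.
IMDS_HOST = "169.254.169.254"
IMDS_PORT = 80


def probe_endpoint(host=IMDS_HOST, port=IMDS_PORT, timeout=1.0):
    """Try to open a TCP connection; return (reachable, seconds_elapsed)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            reachable = True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        reachable = False
    return reachable, time.monotonic() - start
```

Calling `probe_endpoint()` with no arguments on a machine where nothing answers at that address is expected to return `False` as the first element, after at most roughly `timeout` seconds. If it does, the two requests described above will most likely wait out their full 4 and 5 second timeouts.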

Workarounds

  1. Exclude the Azure resource detectors by setting the OTEL_EXPERIMENTAL_RESOURCE_DETECTORS=otel environment variable. If it is not set, the library applies a default value that includes the App Service and Azure VM detectors.
  2. Disable statsbeat by setting APPLICATIONINSIGHTS_STATSBEAT_DISABLED_ALL=TRUE.

These tweaks bring the configuration time down to ~0.8 s (and with OTEL_PYTHON_DISABLED_INSTRUMENTATIONS set to azure_sdk,django,fastapi,flask,psycopg2,requests,urllib,urllib3, it completes in under 30 ms).
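For anyone who prefers to apply the two workarounds from Python rather than from the shell, here is a minimal sketch. `apply_workarounds` and `configure_fast` are hypothetical helper names, not part of the library. The sketch assumes the variables only need to be present in the process environment before `configure_azure_monitor()` runs; setting them before the import is the conservative choice.

```python
import os

# Variable names and values taken from the workarounds above.
WORKAROUND_ENV = {
    "OTEL_EXPERIMENTAL_RESOURCE_DETECTORS": "otel",
    "APPLICATIONINSIGHTS_STATSBEAT_DISABLED_ALL": "TRUE",
}


def apply_workarounds(environ=os.environ):
    """Set the workaround variables unless they are already defined."""
    for name, value in WORKAROUND_ENV.items():
        environ.setdefault(name, value)
    return environ


def configure_fast():
    """Apply the workarounds, then configure Azure Monitor."""
    apply_workarounds()
    # Imported lazily so the variables are set before any library code runs.
    from azure.monitor.opentelemetry import configure_azure_monitor

    configure_azure_monitor()
```

Using `setdefault` keeps any value the deployment has already chosen, so the same entry point can still run unchanged on an Azure VM that sets its own detectors. OTEL_PYTHON_DISABLED_INSTRUMENTATIONS could be added to the dictionary in the same way if the instrumentations are not needed.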

It took me hours to find these options for fixing the startup time without touching the code. I think the library needs to be friendlier to non-Azure environments.

Labels: Client, Monitor - Exporter, Service Attention, customer-reported, feature-request, needs-team-attention, question
