@suii2210
Contributor

This PR fixes an issue in PowerBIHook where relative Power BI REST API paths (for example, myorg/groups) were passed directly to the async request adapter.
In Apache Airflow 3.x, this resulted in the following error during execution:

Request URL is missing an http:// or https:// protocol

Root Cause

PowerBIHook.run() forwarded relative endpoints directly to KiotaRequestAdapterHook.run().
The async request adapter requires a fully qualified URL, causing failures when standard Power BI endpoints were used.

Solution

Normalize Power BI REST API endpoints inside PowerBIHook.run()
Prepend the official Power BI REST API base URL for relative paths:
https://api.powerbi.com/v1.0/
Preserve absolute URLs without modification
Keep all existing Power BI operators unchanged
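A minimal sketch of the normalization described in the list above, using only the standard library. This is not the code from this PR: the helper name `normalize_powerbi_url` and the constant `POWERBI_BASE_URL` are hypothetical, and only the base URL itself comes from the description.

```python
from urllib.parse import urlsplit

# Base URL as quoted in the PR description; the constant name is an assumption.
POWERBI_BASE_URL = "https://api.powerbi.com/v1.0/"


def normalize_powerbi_url(url: str) -> str:
    """Return an absolute URL, prepending the base URL to relative paths."""
    parts = urlsplit(url)
    if parts.scheme in ("http", "https") and parts.netloc:
        # Already absolute: return it unchanged.
        return url
    # Relative path: drop any leading slash so the result has no "//" in the path.
    return POWERBI_BASE_URL + url.lstrip("/")


print(normalize_powerbi_url("myorg/groups"))
print(normalize_powerbi_url("https://example.com/custom"))
```

Checking for both a scheme and a network location keeps the "absolute" test strict, so a bare path is never mistaken for a full URL.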

Testing
Added unit tests to verify relative Power BI URLs are expanded correctly
Added unit tests to ensure absolute URLs remain unchanged
Existing Power BI hook tests continue to pass

Backward Compatibility

No breaking changes introduced

Compatible with both sync and async execution paths

Related Issue

Fixes #60573

@suii2210 force-pushed the fix-powerbi-relative-url branch from 87a4b76 to 220f7cf on January 17, 2026 15:31
@dabla
Contributor

dabla commented Jan 17, 2026

This is weird, as I would expect the host URL defined in your Power BI connection to be automatically prepended to the relative URL. The PowerBIHook extends the KiotaRequestAdapterHook, so I would expect it to behave the same way.

Also, why did you fully replace the code in the hook and the test? This makes the change very difficult to review.

Could you post an example DAG showing how the hook is used, along with an example connection?

@suii2210 closed this on Jan 17, 2026
@dabla
Contributor

dabla commented Jan 17, 2026

As I suspected, the PowerBIHook already hard-codes the host, so the relative URL should normally be resolved correctly:

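from airflow.providers.microsoft.azure.hooks.msgraph import KiotaRequestAdapterHook
from msgraph_core import APIVersion


class PowerBIHook(KiotaRequestAdapterHook):
    """
    An async hook to interact with Power BI.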

    :param conn_id: The connection Id to connect to PowerBI.
    :param timeout: The HTTP timeout being used by the `KiotaRequestAdapter` (default is None).
        When no timeout is specified or set to None then there is no HTTP timeout on each request.
    :param proxies: A dict defining the HTTP proxies to be used (default is None).
    :param api_version: The API version of the Microsoft Graph API to be used (default is v1).
        You can pass an enum named APIVersion which has 2 possible members v1 and beta,
        or you can pass a string as `v1.0` or `beta`.
    """

    conn_type: str = "powerbi"
    conn_name_attr: str = "conn_id"
    default_conn_name: str = "powerbi_default"
    hook_name: str = "Power BI"

    def __init__(
        self,
        conn_id: str = default_conn_name,
        proxies: dict | None = None,
        timeout: float = 60 * 60 * 24 * 7,
        api_version: APIVersion | str | None = None,
    ):
        super().__init__(
            conn_id=conn_id,
            proxies=proxies,
            timeout=timeout,
            host="https://api.powerbi.com",
            scopes=["https://analysis.windows.net/powerbi/api/.default"],
            api_version=api_version,
        )

@dabla
Contributor

dabla commented Jan 17, 2026

Quoting @suii2210:

Regarding the URL handling: I had the same expectation initially. In practice, however, PowerBIHook.run() ends up passing relative paths (e.g. myorg/groups) directly to the async request adapter, which then fails with:

Request URL is missing an 'http://' or 'https://' protocol

This suggests the host configured in the hook is not being prepended for these calls. The issue is reproducible with a standard Power BI connection and occurs consistently when calling get_workspace_list() and other Power BI hook methods.

The actual logic change in this PR is intentionally minimal: it only normalizes relative Power BI REST endpoints before delegating to super().run(). Absolute URLs are left untouched. The larger diff is due to formatter and import reordering enforced by CI after rebasing; no additional behavior beyond URL normalization was changed.

Reply:

The relative URL is automatically concatenated with the host by the RequestAdapter's RequestInformation class.

@suii2210 reopened this on Jan 17, 2026
@suii2210
Contributor Author

Thanks for the clarification - that makes sense. I agree that by design the RequestInformation should combine the relative path with the configured host, and that was my initial assumption as well.

What I’m observing in practice (Airflow 3.x, async execution path) is that PowerBIHook.run() forwards relative endpoints like myorg/groups to the async adapter, and the request fails before the URL is fully resolved, with:

Request URL is missing an 'http://' or 'https://' protocol

This happens consistently when calling methods such as get_workspace_list() using a standard Power BI connection. That’s what led me to believe that, at least in this async path, the URL normalization is not being applied as expected.
I agree that adding normalization in PowerBIHook may not be the right long-term solution if the base KiotaRequestAdapterHook is expected to handle this. If you think this should instead be addressed (or investigated) at the KiotaRequestAdapterHook / request adapter level, I’m happy to follow up in that direction.
Regarding the diff size: the functional change itself is limited to URL normalization. The larger diff is due to prek/ruff enforcing import order and formatting after rebasing; no additional behavior was intentionally changed.
Below is a minimal example of how the hook is used where the issue reproduces.

from airflow import DAG
from airflow.providers.microsoft.azure.hooks.powerbi import PowerBIHook
from airflow.operators.python import PythonOperator
from datetime import datetime


def list_workspaces():
    hook = PowerBIHook(conn_id="powerbi_default")
    return hook.get_workspace_list()


with DAG(
    dag_id="powerbi_workspace_list_example",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    PythonOperator(
        task_id="list_powerbi_workspaces",
        python_callable=list_workspaces,
    )

@dabla
Contributor

dabla commented Jan 17, 2026

OK, now I understand why it isn't working: you're using async code in the PythonOperator, which is not possible until Airflow 3.2 is released.

@potiuk Here is also a nice example where the async PythonOperator will be helpful for people wanting to interact directly with the hooks.

I would advise you to use the PowerBIOperator for this, as it will handle the async code for you in deferred mode behind the scenes. Otherwise, you have to write your method like this:

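import asyncio

from airflow.providers.microsoft.azure.hooks.powerbi import PowerBIHook


def list_workspaces():
    hook = PowerBIHook(conn_id="powerbi_default")
    # get_workspace_list() is a coroutine function; asyncio.run() drives it to completion.
    return asyncio.run(hook.get_workspace_list())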

From Airflow 3.2+ you'll be able to do this:

from airflow.sdk import task


@task
async def list_workspaces():
    hook = PowerBIHook(conn_id="powerbi_default")
    return await hook.get_workspace_list()

This explains why you are encountering issues: your example won't work because the KiotaRequestAdapterHook and PowerBIHook are asynchronous hooks, just like the HttpAsyncHook.
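To illustrate the sync/async mismatch in general terms, here is a small standard-library sketch. `FakeAsyncHook` is a hypothetical stand-in, not an Airflow class, and the example does not reproduce the exact URL error from this thread. It only shows that calling an async method from synchronous code yields a coroutine object rather than a result, and that asyncio.run() is one way to drive it to completion.

```python
import asyncio
import inspect


class FakeAsyncHook:
    """Hypothetical stand-in for an asynchronous hook; not part of Airflow."""

    async def get_workspace_list(self) -> list[str]:
        await asyncio.sleep(0)  # pretend to wait on a network call
        return ["workspace-a", "workspace-b"]


def list_workspaces_unawaited():
    # Calling an async method from sync code only creates a coroutine object.
    return FakeAsyncHook().get_workspace_list()


def list_workspaces_sync() -> list[str]:
    # asyncio.run() starts an event loop and runs the coroutine to completion.
    return asyncio.run(FakeAsyncHook().get_workspace_list())


pending = list_workspaces_unawaited()
print(inspect.iscoroutine(pending))  # True: no work has been done yet
pending.close()  # avoid a "coroutine was never awaited" warning

print(list_workspaces_sync())
```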

@suii2210
Contributor Author

suii2210 commented Jan 17, 2026

The issue is not related to URL handling in PowerBIHook. It occurs because PowerBIHook (and KiotaRequestAdapterHook) are asynchronous hooks, and they were being called from a synchronous PythonOperator, which is not supported until Airflow 3.2.
Using PowerBIOperator (deferred mode) or wrapping the call with asyncio.run() works correctly, and no changes are required in the provider code itself.

@suii2210
Contributor Author

After further investigation, this turns out not to be an issue in PowerBIHook or the URL handling logic itself. The hook already correctly configures the Power BI host, and relative URLs are expected to be resolved by the underlying Kiota RequestInformation.
The error (Request URL is missing an 'http://' or 'https://' protocol) occurs when the asynchronous PowerBIHook methods are invoked from a synchronous PythonOperator. In Airflow 3.1.x, async hooks cannot be awaited directly in PythonOperator, which causes the request to fail before the URL is fully resolved.

I’m therefore closing this PR. Thanks for the guidance.

@suii2210 closed this on Jan 17, 2026
@dabla
Contributor

dabla commented Jan 17, 2026

@suii2210 Thank you for sharing the example DAG and explanation; they helped in understanding the issue.


Development

Successfully merging this pull request may close these issues.

apache-airflow-providers-microsoft-azure Request URL issue

2 participants