Add the optional max_parallelism parameter to the PowerBi class
RadekBuczkowski committed Aug 29, 2024
1 parent ce29709 commit f169ddf
Showing 2 changed files with 25 additions and 2 deletions.
14 changes: 12 additions & 2 deletions docs/power_bi/README.md
@@ -267,6 +267,11 @@ The "number_of_retries" parameter only works with enhanced API requests
be ignored otherwise.
Default is 0 (no retries). E.g. 1 means two attempts in total.

+Additionally, you can set the optional "max_parallelism" parameter to
+set the maximum number of threads that can run processing commands
+in parallel during the refresh in PowerBI. Default is None, which
+corresponds to 10 threads (according to Microsoft).
+
All parameters can only be specified in the constructor.


@@ -322,6 +327,11 @@ and unlike in the start_refresh() method, it will work both with normal
refreshes (i.e. when "table_names" is not specified) and with enhanced
refreshes (i.e. when "table_names" is specified).

+Additionally, you can set the optional "max_parallelism" parameter to
+set the maximum number of threads that can run processing commands
+in parallel during the refresh in PowerBI. Default is None, which
+corresponds to 10 threads (according to Microsoft).
+
You can also specify the optional "local_timezone_name" parameter to
show the last refresh time of the PowerBI dataset in a local time zone.
It is only used for printing timestamps. The default time zone is UTC.
@@ -421,12 +431,12 @@ OnDemand | The refresh was triggered interactively through the Power BI portal.
OnDemandTraining | The refresh was triggered interactively through the Power BI portal with automatic aggregations training.
Scheduled | The refresh was triggered by a dataset refresh schedule setting.
ViaApi | The refresh was triggered by an API call, e.g. by using this class without the "table_names" parameter specified.
-ViaEnhancedApi | The refresh was triggered by an enhanced API call, e.g. by using this class with the "table_names" parameter specified.
+ViaEnhancedApi | The refresh was triggered by an enhanced API call, e.g. by using this class with the "table_names" or "max_parallelism" parameter specified.
ViaXmlaEndpoint | The refresh was triggered through Power BI public XMLA endpoint.

Only "ViaApi" and "ViaEnhancedApi" refreshes can be triggered by this class.
"ViaApi" are refreshes without the "table_names" parameter specified,
-and "ViaEnhancedApi" are refreshes with the "table_names" parameter specified.
+and "ViaEnhancedApi" are refreshes with the "table_names" or "max_parallelism" parameter specified.

To see what tables were specified with each completed refresh marked as
"ViaEnhancedApi", you can use the show_history_details() and get_history_details()
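
The rule stated in the README change above (a refresh is reported as "ViaEnhancedApi" when either "table_names" or "max_parallelism" is specified, and as "ViaApi" otherwise) can be sketched as a small function. This is only an illustration: `expected_refresh_type` is a hypothetical helper that does not exist in the PowerBi class, and the table name "Sales" is made up.

```python
from typing import List, Optional


def expected_refresh_type(
    table_names: Optional[List[str]] = None,
    max_parallelism: Optional[int] = None,
) -> str:
    """Return the refresh type the history is expected to report.

    Hypothetical helper for illustration; not part of the PowerBi class.
    """
    # Per the README text, specifying either parameter makes the refresh "enhanced".
    if table_names is not None or max_parallelism is not None:
        return "ViaEnhancedApi"
    return "ViaApi"


print(expected_refresh_type())                       # ViaApi
print(expected_refresh_type(table_names=["Sales"]))  # ViaEnhancedApi
print(expected_refresh_type(max_parallelism=4))      # ViaEnhancedApi
```

Note that setting only "max_parallelism" is enough to change how the refresh appears in the history, even when the whole dataset is refreshed.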
13 changes: 13 additions & 0 deletions src/spetlr/power_bi/PowerBi.py
@@ -26,6 +26,7 @@ def __init__(
         max_minutes_after_last_refresh: int = 12 * 60,
         timeout_in_seconds: int = 15 * 60,
         number_of_retries: int = 0,
+        max_parallelism: int = None,
         mail_on_failure: bool = False,
         mail_on_completion: bool = False,
         exclude_creators: List[str] = None,
@@ -60,6 +61,10 @@ def __init__(
             errors when calling refresh() and start_refresh().
             Default is 0 (no retries). (E.g. 1 means two attempts in total.)
             Used only when the timeout_in_seconds parameter allows it!
+        :param int max_parallelism: Specifies the maximum number of threads
+            that can run processing commands in parallel during a refresh
+            in PowerBI. Default is None, which corresponds to 10 threads.
+            Used only with the refresh() and start_refresh() methods!
         :param str local_timezone_name: The timezone to use when parsing
             timestamp columns. The default timezone is UTC.
             If the timezone is UTC, all timestamp columns will have a suffix "Utc".
@@ -94,6 +99,10 @@ def __init__(
                 "The 'number_of_retries' parameter "
                 "must be greater than or equal zero!"
             )
+        if max_parallelism is not None and max_parallelism < 0:
+            raise ValueError(
+                "The 'max_parallelism' parameter must be greater than or equal zero!"
+            )
 
         if (mail_on_failure or mail_on_completion) and table_names is not None:
             raise ValueError(
@@ -110,6 +119,7 @@ def __init__(
         self.max_minutes_after_last_refresh = max_minutes_after_last_refresh
         self.timeout_in_seconds = timeout_in_seconds
         self.number_of_retries = number_of_retries
+        self.max_parallelism = max_parallelism
         self.mail_on_failure = mail_on_failure
         self.mail_on_completion = mail_on_completion
         self.exclude_creators = (
@@ -965,6 +975,9 @@ def _get_refresh_argument_json(
"The 'number_of_retries' parameter is ignored in "
"start_refresh() if 'table_names' is not specified!"
)
+        if self.max_parallelism is not None:
+            result["maxParallelism"] = self.max_parallelism
+
         return result if result else None
 
     def _trigger_new_refresh(self, *, with_wait: bool = True) -> bool:
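
To make the combined effect of the two code changes easier to follow (the constructor check and the extra "maxParallelism" key in the request body), here is a minimal standalone sketch. It is not the actual implementation: `build_refresh_body` is a hypothetical stand-in for `_get_refresh_argument_json()`, the `"objects"` entry shape is an assumption about the enhanced refresh request, and only the `"maxParallelism"` key and the `result if result else None` return are taken from the diff.

```python
from typing import Any, Dict, List, Optional


def build_refresh_body(
    table_names: Optional[List[str]] = None,
    max_parallelism: Optional[int] = None,
) -> Optional[Dict[str, Any]]:
    """Hypothetical stand-in for _get_refresh_argument_json()."""
    # Same check the commit adds to the constructor: None or any value >= 0 passes.
    if max_parallelism is not None and max_parallelism < 0:
        raise ValueError("'max_parallelism' must be greater than or equal to zero")

    result: Dict[str, Any] = {}
    if table_names:
        # Assumed shape: one {"table": name} entry per requested table.
        result["objects"] = [{"table": name} for name in table_names]
    if max_parallelism is not None:
        # Key name taken from the diff above.
        result["maxParallelism"] = max_parallelism

    # An empty dict collapses to None, i.e. a request without a body.
    return result if result else None


print(build_refresh_body())                   # None
print(build_refresh_body(max_parallelism=4))  # {'maxParallelism': 4}
```

Because the key is added whenever the value is not None, `max_parallelism=0` still produces a non-empty body in this sketch, so the final `result if result else None` check does not discard it.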