Commit c706759

feat: create deploy_remote_function and deploy_udf functions to immediately deploy functions to BigQuery (#1832)
* Refactor function deployment to avoid code duplication

This commit refactors the implementation of immediate deployment for remote functions and UDFs to eliminate code duplication introduced in a previous commit.

Changes:
- The `remote_function` and `udf` methods in `bigframes.functions._function_session.FunctionSession` now accept an optional `deploy_immediately: bool` parameter (defaulting to `False`). The previous `deploy_remote_function` and `deploy_udf` methods in `FunctionSession` have been removed, and their logic is now incorporated into the unified methods.
- The public API functions `bigframes.pandas.deploy_remote_function` and `bigframes.pandas.deploy_udf` now call the corresponding `FunctionSession` methods with `deploy_immediately=True`.
- The public API functions `bigframes.pandas.remote_function` and `bigframes.pandas.udf` call the `FunctionSession` methods with `deploy_immediately=False` (relying on the default).
- Unit tests in `tests/unit/functions/test_remote_function.py` have been updated to patch the unified `FunctionSession` methods and verify the correct `deploy_immediately` boolean is passed based on which public API function is called.

Note: The underlying provisioning logic in `FunctionSession` currently deploys functions immediately regardless of the `deploy_immediately` flag. This flag serves as an indicator of intent and allows for future enhancements to support true lazy deployment if desired, without further API changes.

* Refactor function deployment to use distinct methods

This commit corrects a previous refactoring attempt to eliminate code duplication and properly separates immediate-deployment functions from standard (potentially lazy) functions.

Changes:
- `bigframes.functions._function_session.FunctionSession` now has distinct methods: `remote_function`, `udf`, `deploy_remote_function`, and `deploy_udf`. The `deploy_immediately` flag has been removed from this class.
- `deploy_remote_function` and `deploy_udf` methods in `FunctionSession` are responsible for ensuring immediate deployment by calling the underlying provisioning logic directly. The standard `remote_function` and `udf` methods in `FunctionSession` also currently call this provisioning logic, meaning all functions are deployed immediately as of now, but the structure allows for future lazy evaluation for standard functions without changing the deploy variants' contract.
- Public API functions in `bigframes.pandas` (`remote_function`, `udf`, `deploy_remote_function`, `deploy_udf`) now correctly delegate to their corresponding distinct methods in `FunctionSession` (via the `Session` object).
- Unit tests in `tests/unit/functions/test_remote_function.py` have been updated to mock and verify calls to the correct distinct methods on `bigframes.session.Session`.

This resolves the issue of using a boolean flag to control deployment type and instead relies on calling specific, dedicated methods for immediate deployment, aligning with your request.

* Simplify internal deploy_remote_function and deploy_udf calls

This commit simplifies the implementation of `deploy_remote_function` and `deploy_udf` within `bigframes.functions._function_session.FunctionSession`.

Given that the standard `remote_function` and `udf` methods in `FunctionSession` already perform immediate deployment of resources (as the underlying provisioning logic they call is immediate), the `deploy_remote_function` and `deploy_udf` methods in the same class are simplified to directly call `self.remote_function(...)` and `self.udf(...)` respectively.

This change makes the distinction between the `deploy_` variants and the standard variants in `FunctionSession` primarily a matter of semantic clarity and intent at that level; both paths currently result in immediate deployment.

The public API in `bigframes.pandas` continues to offer distinct `deploy_` functions that call these `FunctionSession.deploy_` methods, preserving your user-facing API and its documented behavior of immediate deployment.

No changes were needed for the public API in `bigframes.pandas` or the unit tests, as they were already aligned with calling distinct methods on the `Session` object, which in turn calls the now-simplified `FunctionSession` methods.

* add tests and use kwargs

* add missing func argument to bpd

---------

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
1 parent c06d8db commit c706759

5 files changed: +214 -4 lines changed

bigframes/core/global_session.py

Lines changed: 2 additions & 2 deletions
@@ -110,8 +110,8 @@ def get_global_session():
 _T = TypeVar("_T")
 
 
-def with_default_session(func: Callable[..., _T], *args, **kwargs) -> _T:
-    return func(get_global_session(), *args, **kwargs)
+def with_default_session(func_: Callable[..., _T], *args, **kwargs) -> _T:
+    return func_(get_global_session(), *args, **kwargs)
 
 
 class _GlobalSessionContext:
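
Review note: the rename from `func` to `func_` looks related to the new `bigframes.pandas` wrappers in this commit, which forward `func=func` as a keyword argument. With the old parameter name, that keyword would collide with the positional callable. The sketch below reproduces the collision with stand-in functions. `old_with_default_session`, `new_with_default_session`, `deploy`, and the `"session"` placeholder are hypothetical and only illustrate the Python binding rule, not the library's code.

```python
from typing import Callable, TypeVar

_T = TypeVar("_T")


def old_with_default_session(func: Callable[..., _T], *args, **kwargs) -> _T:
    # Stand-in for the pre-commit helper: the parameter is named `func`.
    return func("session", *args, **kwargs)


def new_with_default_session(func_: Callable[..., _T], *args, **kwargs) -> _T:
    # Stand-in for the post-commit helper: the parameter is named `func_`.
    return func_("session", *args, **kwargs)


def deploy(session, func, **kwargs):
    # Hypothetical target method that itself takes a `func` keyword.
    return (session, func.__name__, kwargs)


def double(x):
    return x * 2


try:
    # `deploy` binds positionally to `func`, then `func=double` binds again.
    old_with_default_session(deploy, func=double)
    collided = False
except TypeError:
    collided = True

# With the renamed parameter, `func=double` passes through to `deploy`.
result = new_with_default_session(deploy, func=double, name="n")
```

With the old signature the call fails before `deploy` is ever reached, so the rename is what lets `func=func` travel through `**kwargs`.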

bigframes/functions/_function_session.py

Lines changed: 50 additions & 0 deletions
@@ -668,6 +668,30 @@ def wrapper(func):
 
         return wrapper
 
+    def deploy_remote_function(
+        self,
+        func,
+        **kwargs,
+    ):
+        """Orchestrates the creation of a BigQuery remote function that deploys immediately.
+
+        This method ensures that the remote function is created and available for
+        use in BigQuery as soon as this call is made.
+
+        Args:
+            kwargs:
+                All arguments are passed directly to
+                :meth:`~bigframes.session.Session.remote_function`. Please see
+                its docstring for parameter details.
+
+        Returns:
+            A wrapped remote function, usable in
+            :meth:`~bigframes.series.Series.apply`.
+        """
+        # TODO(tswast): If we update remote_function to defer deployment, update
+        # this method to deploy immediately.
+        return self.remote_function(**kwargs)(func)
+
     def udf(
         self,
         input_types: Union[None, type, Sequence[type]] = None,
@@ -866,6 +890,32 @@ def wrapper(func):
 
         return wrapper
 
+    def deploy_udf(
+        self,
+        func,
+        **kwargs,
+    ):
+        """Orchestrates the creation of a BigQuery UDF that deploys immediately.
+
+        This method ensures that the UDF is created and available for
+        use in BigQuery as soon as this call is made.
+
+        Args:
+            func:
+                Function to deploy.
+            kwargs:
+                All arguments are passed directly to
+                :meth:`~bigframes.session.Session.udf`. Please see
+                its docstring for parameter details.
+
+        Returns:
+            A wrapped Python user defined function, usable in
+            :meth:`~bigframes.series.Series.apply`.
+        """
+        # TODO(tswast): If we update udf to defer deployment, update this method
+        # to deploy immediately.
+        return self.udf(**kwargs)(func)
+
 
 def _convert_row_processor_sig(
     signature: inspect.Signature,
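
Review note: both new `FunctionSession` methods end with `self.<factory>(**kwargs)(func)`, that is, they build the decorator and apply it in a single expression. The sketch below shows that pattern with a toy decorator factory. The `remote_function` stand-in, its `name` and `reuse` parameters, and the `deployed_name` attribute are hypothetical and do not mirror the real signatures.

```python
def remote_function(*, name=None, reuse=True):
    """Toy decorator factory: configure first, wrap the function second."""

    def wrapper(func):
        # The real method provisions cloud resources here; this stand-in
        # only records the name it would have used.
        func.deployed_name = name or func.__name__
        func.reuse = reuse
        return func

    return wrapper


def deploy_remote_function(func, **kwargs):
    # Same shape as the methods in the diff: call the factory with the
    # keyword arguments, then immediately apply the result to `func`.
    return remote_function(**kwargs)(func)


@remote_function(name="via_decorator")
def add_one(x):
    return x + 1


def add_two(x):
    return x + 2


deployed = deploy_remote_function(add_two, name="via_call")
```

Under this reading, the decorator form and the direct call run the same wrapper, so the `deploy_` variants differ from the decorators only in call style for now, which matches the TODO comments in the diff.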

bigframes/pandas/__init__.py

Lines changed: 32 additions & 0 deletions
@@ -117,6 +117,22 @@ def remote_function(
 remote_function.__doc__ = inspect.getdoc(bigframes.session.Session.remote_function)
 
 
+def deploy_remote_function(
+    func,
+    **kwargs,
+):
+    return global_session.with_default_session(
+        bigframes.session.Session.deploy_remote_function,
+        func=func,
+        **kwargs,
+    )
+
+
+deploy_remote_function.__doc__ = inspect.getdoc(
+    bigframes.session.Session.deploy_remote_function
+)
+
+
 def udf(
     *,
     input_types: Union[None, type, Sequence[type]] = None,
@@ -140,6 +156,20 @@ def udf(
 udf.__doc__ = inspect.getdoc(bigframes.session.Session.udf)
 
 
+def deploy_udf(
+    func,
+    **kwargs,
+):
+    return global_session.with_default_session(
+        bigframes.session.Session.deploy_udf,
+        func=func,
+        **kwargs,
+    )
+
+
+deploy_udf.__doc__ = inspect.getdoc(bigframes.session.Session.deploy_udf)
+
+
 @typing.overload
 def to_datetime(
     arg: Union[
@@ -330,6 +360,8 @@ def reset_session():
     clean_up_by_session_id,
     concat,
     cut,
+    deploy_remote_function,
+    deploy_udf,
     get_default_session_id,
     get_dummies,
     merge,

bigframes/session/__init__.py

Lines changed: 76 additions & 2 deletions
@@ -1343,6 +1343,40 @@ def _check_file_size(self, filepath: str):
                 "for large files to avoid loading the file into local memory."
             )
 
+    def deploy_remote_function(
+        self,
+        func,
+        **kwargs,
+    ):
+        """Orchestrates the creation of a BigQuery remote function that deploys immediately.
+
+        This method ensures that the remote function is created and available for
+        use in BigQuery as soon as this call is made.
+
+        Args:
+            func:
+                Function to deploy.
+            kwargs:
+                All arguments are passed directly to
+                :meth:`~bigframes.session.Session.remote_function`. Please see
+                its docstring for parameter details.
+
+        Returns:
+            A wrapped remote function, usable in
+            :meth:`~bigframes.series.Series.apply`.
+        """
+        return self._function_session.deploy_remote_function(
+            func,
+            # Session-provided arguments.
+            session=self,
+            bigquery_client=self._clients_provider.bqclient,
+            bigquery_connection_client=self._clients_provider.bqconnectionclient,
+            cloud_functions_client=self._clients_provider.cloudfunctionsclient,
+            resource_manager_client=self._clients_provider.resourcemanagerclient,
+            # User-provided arguments.
+            **kwargs,
+        )
+
     def remote_function(
         self,
         # Make sure that the input/output types, and dataset can be used
@@ -1565,9 +1599,15 @@ def remote_function(
                 `bigframes_remote_function` - The bigquery remote function capable of calling into `bigframes_cloud_function`.
         """
         return self._function_session.remote_function(
+            # Session-provided arguments.
+            session=self,
+            bigquery_client=self._clients_provider.bqclient,
+            bigquery_connection_client=self._clients_provider.bqconnectionclient,
+            cloud_functions_client=self._clients_provider.cloudfunctionsclient,
+            resource_manager_client=self._clients_provider.resourcemanagerclient,
+            # User-provided arguments.
             input_types=input_types,
             output_type=output_type,
-            session=self,
             dataset=dataset,
             bigquery_connection=bigquery_connection,
             reuse=reuse,
@@ -1585,6 +1625,37 @@ def remote_function(
             cloud_build_service_account=cloud_build_service_account,
         )
 
+    def deploy_udf(
+        self,
+        func,
+        **kwargs,
+    ):
+        """Orchestrates the creation of a BigQuery UDF that deploys immediately.
+
+        This method ensures that the UDF is created and available for
+        use in BigQuery as soon as this call is made.
+
+        Args:
+            func:
+                Function to deploy.
+            kwargs:
+                All arguments are passed directly to
+                :meth:`~bigframes.session.Session.udf`. Please see
+                its docstring for parameter details.
+
+        Returns:
+            A wrapped Python user defined function, usable in
+            :meth:`~bigframes.series.Series.apply`.
+        """
+        return self._function_session.deploy_udf(
+            func,
+            # Session-provided arguments.
+            session=self,
+            bigquery_client=self._clients_provider.bqclient,
+            # User-provided arguments.
+            **kwargs,
+        )
+
     def udf(
         self,
         *,
@@ -1726,9 +1797,12 @@ def udf(
                 deployed for the user defined code.
         """
         return self._function_session.udf(
+            # Session-provided arguments.
+            session=self,
+            bigquery_client=self._clients_provider.bqclient,
+            # User-provided arguments.
             input_types=input_types,
             output_type=output_type,
-            session=self,
             dataset=dataset,
             bigquery_connection=bigquery_connection,
             name=name,
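
Review note: the `Session.deploy_*` methods combine explicit "session-provided" keywords with the caller's `**kwargs` in one call. A consequence of that layout, sketched below with stand-ins, is that a caller who also passes one of the session-provided names gets a `TypeError` at the call site instead of silently overriding it. `FakeSession`, `inner_deploy_udf`, and the `"client"` placeholder are hypothetical, and whether the real methods are meant to reject such overrides is an assumption.

```python
def inner_deploy_udf(func, *, session, bigquery_client, **kwargs):
    # Hypothetical stand-in for the inner FunctionSession method.
    return {"func": func.__name__, "client": bigquery_client, "user": kwargs}


class FakeSession:
    """Stand-in for a session that injects its own clients."""

    def deploy_udf(self, func, **kwargs):
        return inner_deploy_udf(
            func,
            # Session-provided arguments.
            session=self,
            bigquery_client="client",
            # User-provided arguments.
            **kwargs,
        )


def square(x: int) -> int:
    return x * x


session = FakeSession()
ok = session.deploy_udf(square, name="my_custom_name")

try:
    # `session` is already supplied explicitly, so this duplicates a keyword.
    session.deploy_udf(square, session=FakeSession())
    rejected = False
except TypeError:
    rejected = True
```

User keywords that do not clash, such as `name`, pass through untouched.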

tests/unit/functions/test_remote_function.py

Lines changed: 54 additions & 0 deletions
@@ -89,3 +89,57 @@ def function_without_return_annotation(myparam: int):
         match="'output_type' was not set .* missing a return type annotation",
     ):
         remote_function_decorator(function_without_return_annotation)
+
+
+def test_deploy_remote_function():
+    session = mocks.create_bigquery_session()
+
+    def my_remote_func(x: int) -> int:
+        return x * 2
+
+    deployed = session.deploy_remote_function(
+        my_remote_func, cloud_function_service_account="test_sa@example.com"
+    )
+
+    # Test that the function would have been deployed somewhere.
+    assert deployed.bigframes_bigquery_function
+
+
+def test_deploy_remote_function_with_name():
+    session = mocks.create_bigquery_session()
+
+    def my_remote_func(x: int) -> int:
+        return x * 2
+
+    deployed = session.deploy_remote_function(
+        my_remote_func,
+        name="my_custom_name",
+        cloud_function_service_account="test_sa@example.com",
+    )
+
+    # Test that the function would have been deployed somewhere.
+    assert "my_custom_name" in deployed.bigframes_bigquery_function
+
+
+def test_deploy_udf():
+    session = mocks.create_bigquery_session()
+
+    def my_remote_func(x: int) -> int:
+        return x * 2
+
+    deployed = session.deploy_udf(my_remote_func)
+
+    # Test that the function would have been deployed somewhere.
+    assert deployed.bigframes_bigquery_function
+
+
+def test_deploy_udf_with_name():
+    session = mocks.create_bigquery_session()
+
+    def my_remote_func(x: int) -> int:
+        return x * 2
+
+    deployed = session.deploy_udf(my_remote_func, name="my_custom_name")
+
+    # Test that the function would have been deployed somewhere.
+    assert "my_custom_name" in deployed.bigframes_bigquery_function
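
Review note: the tests above exercise `Session.deploy_*` directly. The module-level wrappers in `bigframes.pandas` could be covered by patching the session method and checking what gets forwarded, as the commit message describes for an earlier iteration. The sketch below shows that approach with `unittest.mock` against stand-in classes. `Session`, `_global`, `with_default_session`, and `deploy_udf` here are simplified hypothetical copies, not imports from the library.

```python
from unittest import mock


class Session:
    """Stand-in for the session class; the real method would reach BigQuery."""

    def deploy_udf(self, func, **kwargs):
        raise NotImplementedError("would contact BigQuery")


_global = Session()


def with_default_session(func_, *args, **kwargs):
    # Simplified copy of the helper from the diff, bound to a fixed session.
    return func_(_global, *args, **kwargs)


def deploy_udf(func, **kwargs):
    # Simplified copy of the module-level wrapper from the diff.
    return with_default_session(Session.deploy_udf, func=func, **kwargs)


def triple(x: int) -> int:
    return x * 3


# autospec keeps the method signature, so the mock receives the session as
# its first argument just like the real unbound method would.
with mock.patch.object(
    Session, "deploy_udf", autospec=True, return_value="deployed"
) as patched:
    result = deploy_udf(triple, name="my_custom_name")

patched.assert_called_once_with(_global, func=triple, name="my_custom_name")
```

A check like this would catch a wrapper that drops `func` or forwards to the wrong session method, without provisioning anything.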
