Skip to content

[MINOR][SQL][DOCS] Add notes of the deterministic assumption on UDF functions #13087

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions python/pyspark/sql/functions.py
Original file line number Diff line number Diff line change
Expand Up @@ -1756,6 +1756,9 @@ def __call__(self, *cols):
@since(1.3)
def udf(f, returnType=StringType()):
"""Creates a :class:`Column` expression representing a user defined function (UDF).
Note that the user-defined functions must be deterministic. Due to optimization,
duplicate invocations may be eliminated or the function may even be invoked more times than
it is present in the query.

>>> from pyspark.sql.types import IntegerType
>>> slen = udf(lambda s: len(s), IntegerType())
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ import org.apache.spark.sql.types.DataType

/**
* User-defined function.
* Note that the user-defined functions must be deterministic.
* @param function The user defined scala function to run.
* Note that if you use primitive parameters, you are not able to check if it is
* null or not, and the UDF will return null for you if the primitive input is
Expand Down
3 changes: 3 additions & 0 deletions sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
Original file line number Diff line number Diff line change
Expand Up @@ -199,6 +199,9 @@ class SQLContext private[sql](

/**
* A collection of methods for registering user-defined functions (UDF).
* Note that the user-defined functions must be deterministic. Due to optimization,
* duplicate invocations may be eliminated or the function may even be invoked more times than
* it is present in the query.
*
* The following example registers a Scala closure as UDF:
* {{{
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -145,6 +145,9 @@ class SparkSession private(

/**
* A collection of methods for registering user-defined functions (UDF).
* Note that the user-defined functions must be deterministic. Due to optimization,
* duplicate invocations may be eliminated or the function may even be invoked more times than
* it is present in the query.
*
* The following example registers a Scala closure as UDF:
* {{{
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@ import org.apache.spark.sql.types.DataType

/**
* Functions for registering user-defined functions. Use [[SQLContext.udf]] to access this.
* Note that the user-defined functions must be deterministic.
*
* @since 1.3.0
*/
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,9 @@ import org.apache.spark.sql.types.DataType

/**
* A user-defined function. To create one, use the `udf` functions in [[functions]].
* Note that the user-defined functions must be deterministic. Due to optimization,
* duplicate invocations may be eliminated or the function may even be invoked more times than
* it is present in the query.
* As an example:
* {{{
* // Defined a UDF that returns true or false based on some numeric score.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -100,6 +100,7 @@ private[sql] class SessionState(sparkSession: SparkSession) {

/**
* Interface exposed to the user for registering user-defined functions.
* Note that the user-defined functions must be deterministic.
*/
lazy val udf: UDFRegistration = new UDFRegistration(functionRegistry)

Expand Down