[ISSUE] migrate-tables DBUtilsCore.mounts() is not whitelisted on class error #718

Open
maruppel opened this issue Aug 5, 2024 · 7 comments

Comments

@maruppel

maruppel commented Aug 5, 2024

Description
When running the migrate-tables workflow, a Py4J security exception is thrown on all migrate tasks.

Reproduction
When running the migrate-tables workflow from the Databricks UI, all migrate tasks (dbfs-root, non-delta, external, views) fail with the error below.

Expected behavior
The migrate-tables workflow succeeds in migrating tables and views.

Is it a regression?
These workflows have been run in multiple workspaces, typically with UCX version 0.27.1, and this error has not been seen before. The currently running version is 0.28.2.

Debug Logs
16:46:22 DEBUG [databricks] {MainThread} Task crash details
Traceback (most recent call last):
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/runtime.py", line 100, in trigger
    current_task(ctx)
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/hive_metastore/workflows.py", line 29, in migrate_dbfs_root_delta_tables
    ctx.tables_migrator.migrate_tables(
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/hive_metastore/table_migrate.py", line 80, in migrate_tables
    all_principal_grants = None if acl_strategy is None else self._principal_grants.get_interactive_cluster_grants()
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/hive_metastore/grants.py", line 557, in get_interactive_cluster_grants
    mounts = list(self._mounts_crawler.snapshot())
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/hive_metastore/locations.py", line 252, in snapshot
    return self._snapshot(self._try_fetch, self._list_mounts)
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/framework/crawlers.py", line 116, in _snapshot
    loaded_records = list(loader())
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/hive_metastore/locations.py", line 247, in _list_mounts
    for mount_point, source, _ in self._dbutils.fs.mounts():
  File "/databricks/python_shell/dbruntime/dbutils.py", line 362, in f_with_exception_handling
    return f(*args, **kwargs)
  File "/databricks/python_shell/dbruntime/dbutils.py", line 497, in mounts
    self.print_return(self.dbcore.mounts()), MountInfo.create_from_jschema)
  File "/databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 1355, in __call__
    return_value = get_return_value(
  File "/databricks/spark/python/pyspark/errors/exceptions/captured.py", line 224, in deco
    return f(*a, **kw)
  File "/databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py", line 330, in get_return_value
    raise Py4JError(
py4j.protocol.Py4JError: An error occurred while calling o432.mounts. Trace:
py4j.security.Py4JSecurityException: Method public com.databricks.backend.daemon.dbutils.DBUtilsCore$Result com.databricks.backend.daemon.dbutils.DBUtilsCore.mounts() is not whitelisted on class class com.databricks.backend.daemon.dbutils.DBUtilsCore
	at py4j.security.WhitelistingPy4JSecurityManager.checkCall(WhitelistingPy4JSecurityManager.java:473)
	at py4j.Gateway.invoke(Gateway.java:305)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:199)
	at py4j.ClientServerConnection.run(ClientServerConnection.java:119)
	at java.lang.Thread.run(Thread.java:750)
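For context, the failing call in the traceback boils down to listing DBFS mounts through dbutils; a minimal sketch of that call (simplified, not the exact UCX code):

from databricks.sdk import WorkspaceClient

# On a Databricks cluster, ws.dbutils resolves to the runtime dbutils (dbruntime/dbutils.py
# in the traceback); fs.mounts() is then executed on the JVM over Py4J as DBUtilsCore.mounts(),
# the method the Py4J security manager reports as not whitelisted.
ws = WorkspaceClient()
for mount_point, source, _ in ws.dbutils.fs.mounts():
    print(mount_point, source)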

Other Information

  • OS: Windows
  • Version: 10

Additional context
  • UCX version 0.28.2
  • DBCLI version 0.221.1
@JCZuurmond

Moving this to ucx: databrickslabs/ucx#2498

@JCZuurmond

@maruppel : What Databricks runtime is the migrate-tables workflow failing for?

@maruppel
Author

@maruppel : What Databricks runtime is the migrate-tables workflow failing for?

DBR is 15.3. Also, this is after the assessment has been run, which was a question in the other issue.

@JCZuurmond

JCZuurmond commented Aug 29, 2024

@maruppel : Thank you for reporting back. I'll tackle the question about the assessment being run in the other issue and will cover the mounts whitelist issue here:

I tried to reproduce the error using the following (shortened) code path from ucx:

from databricks.sdk import WorkspaceClient
ws = WorkspaceClient()
ws.dbutils.fs.mounts()

You mentioned that the workflow worked before with ucx version 0.27.1 and not anymore with version 0.28.2. The difference in the sdk dependency is (illustrated in the sketch after the list):

  • Ucx 0.27.1 -> databricks-sdk>=0.27,<0.29
  • Ucx 0.28.2 -> databricks-sdk~=0.29.0 (note that there is only one patch version for sdk 0.29.0)
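A small sketch of what these two pins accept, to make the difference concrete (assuming the packaging library, which pip itself builds on, is available):

from packaging.specifiers import SpecifierSet

old_pin = SpecifierSet(">=0.27,<0.29")  # pinned by ucx 0.27.1
new_pin = SpecifierSet("~=0.29.0")      # pinned by ucx 0.28.2; "~=0.29.0" means ">=0.29.0,<0.30"

for candidate in ["0.28.5", "0.29.0", "0.30.0"]:
    print(candidate, candidate in old_pin, candidate in new_pin)
# 0.28.5 satisfies only the old pin, 0.29.0 only the new pin, 0.30.0 neither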

When I run the above code snippet after installing the sdk 0.29.0 on a cluster with DBR 15.3 in AWS (see complete configuration below), I do not receive the same whitelist error.

%pip install databricks-sdk~=0.29.0
dbutils.library.restartPython()
Cluster configuration:
{
  "cluster_id": "REDACTED",
  "creator_user_name": "REDACTED",
  "driver": { "private_ip": "REDACTED", "node_id": "REDACTED", "instance_id": "i-REDACTED", "start_timestamp": 1724916806770, "node_aws_attributes": { "is_spot": false }, "node_attributes": { "is_spot": false }, "host_private_ip": "REDACTED" },
  "spark_context_id": 1513542293054547000,
  "driver_healthy": true,
  "jdbc_port": 10000,
  "cluster_name": "cor-test-cluster",
  "spark_version": "15.3.x-scala2.12",
  "spark_conf": { "spark.master": "local[*, 4]", "spark.databricks.cluster.profile": "singleNode" },
  "aws_attributes": { "first_on_demand": 1, "availability": "SPOT_WITH_FALLBACK", "zone_id": "auto", "spot_bid_price_percent": 100, "ebs_volume_count": 0 },
  "node_type_id": "r6id.large",
  "driver_node_type_id": "r6id.large",
  "custom_tags": { "ResourceClass": "SingleNode" },
  "autotermination_minutes": 30,
  "enable_elastic_disk": true,
  "disk_spec": { "disk_count": 0 },
  "cluster_source": "UI",
  "single_user_name": "REDACTED",
  "enable_local_disk_encryption": false,
  "instance_source": { "node_type_id": "r6id.large" },
  "driver_instance_source": { "node_type_id": "r6id.large" },
  "data_security_mode": "SINGLE_USER",
  "runtime_engine": "STANDARD",
  "effective_spark_version": "15.3.x-scala2.12",
  "state": "RUNNING",
  "state_message": "",
  "start_time": 1724852122131,
  "last_state_loss_time": 1724916949310,
  "last_activity_time": 1724916983193,
  "last_restarted_time": 1724916949353,
  "num_workers": 0,
  "cluster_memory_mb": 16384,
  "cluster_cores": 2,
  "default_tags": { "Vendor": "Databricks", "Creator": "REDACTED", "ClusterName": "cor-test-cluster", "ClusterId": "REDACTED", "Budget": "opex.sales.labs", "Owner": "REDACTED" },
  "init_scripts_safe_mode": false,
  "spec": { "cluster_name": "cor-test-cluster", "spark_version": "15.3.x-scala2.12", "spark_conf": { "spark.master": "local[*, 4]", "spark.databricks.cluster.profile": "singleNode" }, "aws_attributes": { "first_on_demand": 1, "availability": "SPOT_WITH_FALLBACK", "zone_id": "auto", "spot_bid_price_percent": 100, "ebs_volume_count": 0 }, "node_type_id": "r6id.large", "driver_node_type_id": "r6id.large", "custom_tags": { "ResourceClass": "SingleNode" }, "autotermination_minutes": 30, "enable_elastic_disk": true, "single_user_name": "REDACTED", "enable_local_disk_encryption": false, "data_security_mode": "SINGLE_USER", "runtime_engine": "STANDARD", "effective_spark_version": "14.3.x-scala2.12", "num_workers": 0, "apply_policy_default_values": false }
}
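If it helps to compare cluster configurations, the same details can be fetched with the SDK; a sketch (the cluster id is a placeholder):

from databricks.sdk import WorkspaceClient

ws = WorkspaceClient()
cluster = ws.clusters.get(cluster_id="REDACTED")  # placeholder: use the actual cluster id
print(cluster.spark_version, cluster.data_security_mode, cluster.single_user_name)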

Could you try the above code snippets in your environment and report back whether they reproduce the issue for you?

And, does the issue persist with the latest ucx version?

@JCZuurmond

Note that you can verify the installed sdk version:

from databricks.sdk.version import __version__
print(__version__)
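An alternative check via the installed package metadata (a sketch; assumes the PyPI distribution names databricks-sdk and databricks-labs-ucx):

from importlib.metadata import version

print(version("databricks-sdk"))
print(version("databricks-labs-ucx"))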

@maruppel
Author

Running the above code, I am getting the same whitelist error with sdk 0.29.0.

@JCZuurmond

Okay, thank you. If you have an idea what could cause the whitelist error, please share. It sounds like some network restrictions are causing this issue.

Otherwise, I will leave it to the sdk team to resolve this.
