[SPARK-33786][SQL] The storage level for a cache should be respected when a table name is altered
### What changes were proposed in this pull request?
This PR proposes to retain the cache's storage level when a table name is altered by `ALTER TABLE ... RENAME TO ...`.
### Why are the changes needed?
Currently, when a table name is altered, the table's cache (if one exists) is refreshed, but its storage level is not retained. For example:
```scala
import org.apache.spark.storage.StorageLevel

// Look up the storage level of the cache entry for the given table.
def getStorageLevel(tableName: String): StorageLevel = {
  val table = spark.table(tableName)
  val cachedData = spark.sharedState.cacheManager.lookupCachedData(table).get
  cachedData.cachedRepresentation.cacheBuilder.storageLevel
}

// `path` is a java.io.File defined outside this snippet.
Seq(1 -> "a").toDF("i", "j").write.parquet(path.getCanonicalPath)
sql(s"CREATE TABLE old USING parquet LOCATION '${path.toURI}'")
sql("CACHE TABLE old OPTIONS('storageLevel' 'MEMORY_ONLY')")
val oldStorageLevel = getStorageLevel("old")

// Renaming refreshes the cache under the new name.
sql("ALTER TABLE old RENAME TO new")
val newStorageLevel = getStorageLevel("new")
```
`oldStorageLevel` will be `StorageLevel(memory, deserialized, 1 replicas)` whereas `newStorageLevel` will be `StorageLevel(disk, memory, deserialized, 1 replicas)`, which is the default storage level.
### Does this PR introduce _any_ user-facing change?
Yes. The cache's storage level is now retained when a table is renamed.
### How was this patch tested?
Added a unit test.
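For readers who want to reproduce the behavior outside the Spark test suite, the following is a minimal standalone sketch, not the unit test added by this PR. It assumes a local Spark build with the default in-memory catalog, and it reads the storage level through the public `Dataset.storageLevel` accessor instead of `sharedState`. The object name `RenameKeepsStorageLevel`, the helper `storageLevelOf`, and the temporary directory name are hypothetical.

```scala
import java.nio.file.Files

import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object RenameKeepsStorageLevel {
  // Hypothetical helper: storage level of the table's cache entry,
  // or StorageLevel.NONE if the table is not cached.
  def storageLevelOf(spark: SparkSession, tableName: String): StorageLevel =
    spark.table(tableName).storageLevel

  // Returns the storage level before and after the rename.
  def check(spark: SparkSession): (StorageLevel, StorageLevel) = {
    import spark.implicits._

    // Write to a path that does not exist yet, inside a fresh temp directory.
    val dir = Files.createTempDirectory("spark-33786").resolve("data").toFile
    Seq(1 -> "a").toDF("i", "j").write.parquet(dir.getCanonicalPath)

    spark.sql(s"CREATE TABLE old USING parquet LOCATION '${dir.toURI}'")
    spark.sql("CACHE TABLE old OPTIONS('storageLevel' 'MEMORY_ONLY')")
    val before = storageLevelOf(spark, "old")

    spark.sql("ALTER TABLE old RENAME TO new")
    val after = storageLevelOf(spark, "new")
    (before, after)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("spark-33786-check")
      .getOrCreate()
    try {
      val (before, after) = check(spark)
      assert(before == StorageLevel.MEMORY_ONLY)
      // Expected to hold only on builds that include this change.
      assert(before == after, s"storage level changed: $before -> $after")
    } finally {
      spark.stop()
    }
  }
}
```

On a build without this change, the second assertion would be expected to fail, since the renamed table's cache falls back to the default storage level.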
Closes #30774 from imback82/alter_table_rename_cache_fix.
Authored-by: Terry Kim <yuminkim@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit ef7f690)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>