Description
SparkHiveDataset does not currently support external Hive tables. External tables are often needed when the org's database lives outside Hive and the table has to be hosted in Hive. More info: https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.5/using-hiveql/content/hive_create_an_external_table.html
Context
This will broaden the scope of Hive datasets. Right now, any externally managed Hive dataset has to be referenced via a custom dataset, and this happens quite often.
Possible Implementation
Implementation should be simple. The user specifies the keyword "External" in the DDL and a path for the table (its Hive LOCATION). Both can be supplied via the catalog. Based on this input, the dataset should decide internally how to proceed and load/save data accordingly.
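As a rough sketch of the idea, the dataset could build its DDL from two catalog options. Everything here is an assumption for illustration: the helper `build_create_table_ddl` and the option names `external` and `location` are hypothetical and not part of the current SparkHiveDataset API. Only the Hive syntax (`CREATE EXTERNAL TABLE ... LOCATION '...'`) is standard.

```python
from typing import Dict, Optional


def build_create_table_ddl(
    database: str,
    table: str,
    columns: Dict[str, str],
    external: bool = False,
    location: Optional[str] = None,
    file_format: str = "PARQUET",
) -> str:
    """Build a Hive CREATE TABLE statement.

    Hypothetical helper: `external` and `location` stand in for values
    that would be read from the catalog entry.
    """
    # An external table is only meaningful if we know where its data lives.
    if external and not location:
        raise ValueError("An external table needs a storage location")

    keyword = "EXTERNAL " if external else ""
    column_spec = ", ".join(
        f"`{name}` {dtype}" for name, dtype in columns.items()
    )
    ddl = (
        f"CREATE {keyword}TABLE IF NOT EXISTS `{database}`.`{table}` "
        f"({column_spec}) STORED AS {file_format}"
    )
    # LOCATION is appended whenever a path was supplied.
    if location:
        ddl += f" LOCATION '{location}'"
    return ddl


if __name__ == "__main__":
    print(
        build_create_table_ddl(
            database="sales",
            table="orders",
            columns={"id": "BIGINT", "amount": "DOUBLE"},
            external=True,
            location="/data/external/orders",
        )
    )
```

With `external=False` and no `location`, the helper produces the plain managed-table statement, so existing catalog entries would presumably keep their current behaviour.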
Possible Alternatives
Accessing the Hive table via HQL. This again requires a custom dataset (a HiveQueryDataSet) that can access the metastore and run the query, which is a bit slow.