This project reads a Hive table dump from a text file (.txt) and generates a text file (.txt) containing the table as Scala DataFrame code, ready to use in JUnit-style tests for Spark projects written in Scala.
- Download fromHiveTableToScalaDataframe.exe from FromHiveTableToScalaJunitTest/bin/fromHiveTableToScalaDataframe.exe or scroll to the bottom of this page.
- Create a .txt file containing:
  - the table name
  - a list in the form `[("field", "dataType")]`
  - the Hive/Impala query result in text format (the fields must match those declared in the list), for example:
  ```
  anagrafica
  [("id", "int"),("nome", "string")]
  +-------+------------+
  |id     |nome        |
  +-------+------------+
  |1      |pippo       |
  |2      |pluto       |
  |3      |gianni      |
  |4      |carla       |
  +-------+------------+
  ```
- Execute `/path/to/fromHiveTableToScalaDataframe.exe`
- Follow the guided procedure
- Save the generated code to the destination .txt file; the output will look like this:
```scala
val anagrafica = List(
  (1, "pippo"),
  (2, "pluto"),
  (3, "gianni"),
  (4, "carla")
).toDF("id", "nome")
```
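The generated snippet relies on Spark's implicit conversions, so the test it is pasted into needs a `SparkSession` in scope and `import spark.implicits._`. Below is a minimal sketch of how the output might be wired into a unit test; the surrounding scaffolding (a ScalaTest `AnyFunSuite`, the local `SparkSession`, the suite and test names) is an assumption for illustration and is not produced by the tool.

```scala
import org.apache.spark.sql.SparkSession
import org.scalatest.funsuite.AnyFunSuite

// Possible test harness for the generated snippet (ScalaTest is an assumption,
// not something fromHiveTableToScalaDataframe.exe produces).
class AnagraficaSpec extends AnyFunSuite {

  // Local SparkSession so that .toDF is available inside the test.
  private val spark: SparkSession = SparkSession.builder()
    .master("local[*]")
    .appName("AnagraficaSpec")
    .getOrCreate()

  import spark.implicits._ // required for .toDF on a List of tuples

  test("anagrafica test data is built with the expected shape") {
    // Snippet generated by the tool, pasted in unchanged.
    val anagrafica = List(
      (1, "pippo"),
      (2, "pluto"),
      (3, "gianni"),
      (4, "carla")
    ).toDF("id", "nome")

    assert(anagrafica.count() == 4)
    assert(anagrafica.columns.toSeq == Seq("id", "nome"))
  }
}
```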
- It is possible to add more input files for different tables; the output will contain all the generated DataFrames.
- Not all Hive data types are supported yet.
- Do not use more than 22 fields for one table: the generated code builds the DataFrame from a List of Scala tuples, and Scala tuples are limited to 22 elements, so a DataFrame created this way cannot have more than 22 columns (a possible manual workaround is sketched below).
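For tables wider than 22 columns, a DataFrame can still be built by hand from `Row` objects with an explicit schema instead of tuples. This is only a hedged sketch of that workaround, not something the tool generates; it reuses the two-column `anagrafica` table purely for brevity.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object WideTableWorkaround {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("WideTableWorkaround")
      .getOrCreate()

    // Rows plus an explicit schema avoid the 22-element limit of Scala tuples.
    val schema = StructType(Seq(
      StructField("id", IntegerType, nullable = true),
      StructField("nome", StringType, nullable = true)
    ))

    val rows = Seq(
      Row(1, "pippo"),
      Row(2, "pluto"),
      Row(3, "gianni"),
      Row(4, "carla")
    )

    val anagrafica = spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)
    anagrafica.show()
    spark.stop()
  }
}
```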