Commit 7a8e55a

Update README.md
1 parent c0f40a3 commit 7a8e55a

1 file changed, +3 -2 lines changed

README.md

Lines changed: 3 additions & 2 deletions
@@ -5,10 +5,11 @@ This project shows how to use SPARK as Cloud-based SQL Engine and expose your bi
 Traditional relational database engines like SQL had scalability problems, and so a couple of SQL-on-Hadoop frameworks evolved, like Hive, Cloudera Impala, Presto etc. These frameworks are essentially cloud-based solutions and they all come with their own advantages and limitations. This project will demo how SparkSQL comes across as one more SQL-on-Hadoop framework as listed below:
 - Data from multiple sources can be pushed into Spark and then exposed as SQL tables.
 - These tables are then made accessible as a JDBC/ODBC data source via the Spark thrift server.
-- Multiple clients like ```Beeline CLI```, ```JDBC```, ```ODBC``` or ```BI tools like Tableau``` connect to Spark thrift server. - Once the connection is established, ThriftServer will contact SparkSQL engine to access Hive or Spark temp tables and run the sql queries on ApacheSpark framework.
+- Multiple clients like ```Beeline CLI```, ```JDBC```, ```ODBC``` or ```BI tools like Tableau``` connect to Spark thrift server.
+- Once the connection is established, ThriftServer will contact ```SparkSQL engine to access Hive or Spark temp tables and run the sql queries on ApacheSpark framework```.
 - Spark Thrift server works much like HiveServer2 thrift: HiveServer2 submits SQL queries as Hive MapReduce jobs, whereas the Spark thrift server uses the Spark SQL engine, which under the hood uses full Spark capabilities.

-#### Complete Guide - To know more about this topic, please refer to my blog [here](https://spoddutur.github.io/spark-notes/spark-as-cloud-based-sql-engine-via-thrift-server) where I briefed the concept in detail.
+#### To know more about this topic, please refer to my blog [here](https://spoddutur.github.io/spark-notes/spark-as-cloud-based-sql-engine-via-thrift-server) where I briefed the concept in detail.

 ### Architecture
 The following picture illustrates the idea we discussed above:
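
The client side of the flow described in the README bullets can be sketched roughly as follows. This is a minimal illustration, not code from the project: it only assembles the HiveServer2-style JDBC URL and a one-shot Beeline command line that a client might use to reach the Spark thrift server. The helper names `thrift_jdbc_url` and `beeline_command` and the table name `records` are hypothetical. The host and port assume a thrift server listening on its default `localhost:10000`.

```python
import shlex


def thrift_jdbc_url(host="localhost", port=10000, database="default"):
    """Build a HiveServer2-style JDBC URL; the Spark thrift server speaks the same protocol."""
    return f"jdbc:hive2://{host}:{port}/{database}"


def beeline_command(url, query, user=None):
    """Return the argv list for a one-shot Beeline query against `url`."""
    argv = ["beeline", "-u", url]
    if user is not None:
        argv += ["-n", user]  # -n passes the user name to connect as
    argv += ["-e", query]     # -e runs a single query and exits
    return argv


url = thrift_jdbc_url()
# `records` is a hypothetical temp table name, used only for illustration.
cmd = beeline_command(url, "SELECT * FROM records LIMIT 10")
print(shlex.join(cmd))
```

The printed line is a shell-quoted command that could be pasted into a terminal on a machine where Beeline is installed. JDBC or ODBC clients such as BI tools would use the same URL with their own driver configuration.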
