
Commit b73d5ff

OopsOutOfMemory authored and marmbrus committed
[SQL][Hiveconsole] Bring hive console code up to date and update README.md
Add `import org.apache.spark.sql.Dsl._` to make DSL queries work. Since `queryExecution` is no longer available on DataFrame, remove it.

Author: OopsOutOfMemory <victorshengli@126.com>
Author: Sheng, Li <OopsOutOfMemory@users.noreply.github.com>

Closes #4330 from OopsOutOfMemory/hiveconsole and squashes the following commits:

46eb790 [Sheng, Li] Update SparkBuild.scala
d23ee9f [OopsOutOfMemory] minor
d4dd593 [OopsOutOfMemory] refine hive console
1 parent 417d111 commit b73d5ff

File tree

2 files changed: +11 −35 lines


project/SparkBuild.scala

Lines changed: 1 addition & 0 deletions

```diff
@@ -245,6 +245,7 @@ object SQL {
 |import org.apache.spark.sql.catalyst.plans.logical._
 |import org.apache.spark.sql.catalyst.rules._
 |import org.apache.spark.sql.catalyst.util._
+|import org.apache.spark.sql.Dsl._
 |import org.apache.spark.sql.execution
 |import org.apache.spark.sql.test.TestSQLContext._
 |import org.apache.spark.sql.types._
```
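For context on the `|import ...` lines above: SparkBuild seeds the sbt console session with a multi-line `initialCommands` string, and the leading `|` characters are stripped by `stripMargin`. A minimal sketch of the mechanism (an illustrative fragment, not the actual SparkBuild.scala definition):

```scala
// Hypothetical sbt setting showing how a console session is pre-loaded
// with imports; the `|` margins match the diff above and are removed
// by stripMargin before the REPL evaluates the string.
initialCommands in console :=
  """
    |import org.apache.spark.sql.catalyst.rules._
    |import org.apache.spark.sql.Dsl._
    |import org.apache.spark.sql.test.TestSQLContext._
  """.stripMargin
```

This is why adding one `|import org.apache.spark.sql.Dsl._` line to the build is enough to make the DSL available in every `hive/console` session.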

sql/README.md

Lines changed: 10 additions & 35 deletions

````diff
@@ -22,59 +22,34 @@ export HADOOP_HOME="<path to>/hadoop-1.0.4"
 
 Using the console
 =================
-An interactive scala console can be invoked by running `build/sbt hive/console`. From here you can execute queries and inspect the various stages of query optimization.
+An interactive scala console can be invoked by running `build/sbt hive/console`.
+From here you can execute queries with HiveQl and manipulate DataFrame by using DSL.
 
 ```scala
 catalyst$ build/sbt hive/console
 
 [info] Starting scala interpreter...
-import org.apache.spark.sql.catalyst.analysis._
-import org.apache.spark.sql.catalyst.dsl._
-import org.apache.spark.sql.catalyst.errors._
-import org.apache.spark.sql.catalyst.expressions._
-import org.apache.spark.sql.catalyst.plans.logical._
-import org.apache.spark.sql.catalyst.rules._
-import org.apache.spark.sql.catalyst.util._
-import org.apache.spark.sql.execution
+import org.apache.spark.sql.Dsl._
 import org.apache.spark.sql.hive._
-import org.apache.spark.sql.hive.TestHive._
+import org.apache.spark.sql.hive.test.TestHive._
 import org.apache.spark.sql.types._
+import org.apache.spark.sql.parquet.ParquetTestData
 Welcome to Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_45).
 Type in expressions to have them evaluated.
 Type :help for more information.
 
 scala> val query = sql("SELECT * FROM (SELECT * FROM src) a")
-query: org.apache.spark.sql.DataFrame =
-== Query Plan ==
-== Physical Plan ==
-HiveTableScan [key#10,value#11], (MetastoreRelation default, src, None), None
+query: org.apache.spark.sql.DataFrame = org.apache.spark.sql.DataFrame@74448eed
 ```
 
-Query results are RDDs and can be operated as such.
+Query results are `DataFrames` and can be operated as such.
 ```
 scala> query.collect()
 res2: Array[org.apache.spark.sql.Row] = Array([238,val_238], [86,val_86], [311,val_311], [27,val_27]...
 ```
 
-You can also build further queries on top of these RDDs using the query DSL.
+You can also build further queries on top of these `DataFrames` using the query DSL.
 ```
-scala> query.where('key === 100).collect()
-res3: Array[org.apache.spark.sql.Row] = Array([100,val_100], [100,val_100])
-```
-
-From the console you can even write rules that transform query plans. For example, the above query has redundant project operators that aren't doing anything. This redundancy can be eliminated using the `transform` function that is available on all [`TreeNode`](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala) objects.
-```scala
-scala> query.queryExecution.analyzed
-res4: org.apache.spark.sql.catalyst.plans.logical.LogicalPlan =
-Project [key#10,value#11]
- Project [key#10,value#11]
-  MetastoreRelation default, src, None
-
-scala> query.queryExecution.analyzed transform {
-  | case Project(projectList, child) if projectList == child.output => child
-  | }
-res5: res17: org.apache.spark.sql.catalyst.plans.logical.LogicalPlan =
-Project [key#10,value#11]
- MetastoreRelation default, src, None
+scala> query.where('key > 30).select(avg('key)).collect()
+res3: Array[org.apache.spark.sql.Row] = Array([274.79025423728814])
 ```
````
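To illustrate what the new README's DSL example relies on, here is a hedged sketch of a session in the updated `hive/console` (Spark 1.3-era API; this depends on the Spark test Hive context and `src` table set up by the console, so it is a sketch rather than a standalone program):

```scala
// These two imports are pre-loaded by the console after this commit:
import org.apache.spark.sql.Dsl._                 // column functions such as avg, plus the Symbol-based column syntax
import org.apache.spark.sql.hive.test.TestHive._  // the sql(...) method and the test Hive context with the `src` table

// HiveQL produces a DataFrame...
val query = sql("SELECT * FROM (SELECT * FROM src) a")

// ...which can then be refined with the DSL instead of more HiveQL:
val rows = query.where('key > 30).select(avg('key)).collect()
```

The point of the commit is the first import: without `org.apache.spark.sql.Dsl._` in scope, the `'key > 30` and `avg('key)` expressions do not compile, which is why it was added to the console's pre-loaded imports.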
