[SNAP-3165] Instantiating snappy session only when catalogImplementation #191

Merged on Dec 6, 2019 (1 commit)

@vatsalmevada vatsalmevada commented Dec 4, 2019

is in-memory while running pyspark shell.

What changes were proposed in this pull request?

We initialize both SparkSession and SnappySession when starting the pyspark shell.
SparkSession and SparkContext were always initialized with Hive support enabled,
irrespective of the value of the spark.sql.catalogImplementation config.

With these changes, we check the value of spark.sql.catalogImplementation, and
Hive support is not enabled when that property is explicitly set to
in-memory.

SnappySession is initialized only when the catalog implementation is set to in-memory,
to avoid the failure reported in SNAP-3165.

Later we can add support for the hive catalog implementation for Python with SnappySession.
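
The decision described above can be sketched roughly as follows. This is an illustration only, not the code from this patch: `plan_sessions` and `CATALOG_KEY` are hypothetical names, the config is modelled as a plain dict, and treating an unset property the same as hive is an assumption based on the word "explicitly" in the description.

```python
# Hypothetical sketch of the startup decision; not the actual shell.py change.
CATALOG_KEY = "spark.sql.catalogImplementation"


def plan_sessions(conf):
    """Decide which sessions the pyspark shell should set up.

    `conf` is a plain dict standing in for the Spark configuration.
    Returns a dict with two booleans:
      enable_hive_support   - build SparkSession with Hive support
      create_snappy_session - also instantiate SnappySession
    """
    value = conf.get(CATALOG_KEY)
    # Only an explicit "in-memory" setting switches Hive support off.
    # Assumption: an unset property behaves like the hive catalog.
    in_memory = value is not None and value.strip().lower() == "in-memory"
    return {
        "enable_hive_support": not in_memory,
        "create_snappy_session": in_memory,
    }


if __name__ == "__main__":
    print(plan_sessions({CATALOG_KEY: "in-memory"}))
    print(plan_sessions({CATALOG_KEY: "hive"}))
    print(plan_sessions({}))
```

With this reading, only a shell started with the property explicitly set to in-memory gets a SnappySession; every other configuration keeps the previous Hive-enabled SparkSession and skips SnappySession.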

How was this patch tested?

Manual testing. The following scenarios were tested:

  • Created an external table from the pyspark shell and checked that it is
    visible on the embedded cluster UI.

>>> snappy.sql("create external table airline using parquet options(path '/path/to/airline_withId_210M');")

  • Restarted the pyspark shell and was able to access the previously created
    external table.

>>> snappy.sql("select * from airline limit 10").show()

  • Created a column table from pyspark and checked that it is visible
    on the embedded cluster UI.

>>> snappy.sql("create table snappy_table(id int, name string) using column")
>>> snappy.sql("insert into snappy_table values(1,'abc')")

  • Restarted pyspark and was able to access the column table.

>>> snappy.sql("select * from snappy_table").show()

  • Started spark-shell and was able to access the external table as well as
    the column table.

@suranjan suranjan left a comment

LGTM.
We need to document this behaviour.

@vatsalmevada vatsalmevada merged commit 8700297 into snappy/branch-2.1 Dec 6, 2019
@vatsalmevada vatsalmevada deleted the SNAP-3165 branch December 6, 2019 09:32
sumwale pushed a commit to sumwale/spark that referenced this pull request Nov 5, 2020
…ion (TIBCOSoftware#191)

sumwale pushed a commit that referenced this pull request Jul 11, 2021
…ion (#191)
