[MINOR][DOCS] Fix invalid documentation for StreamingQueryManager Class #24547

Closed
wants to merge 1 commit into from

Conversation

asaf400
Contributor

@asaf400 asaf400 commented May 7, 2019

What changes were proposed in this pull request?

Following the documentation example that calls `spark.streams().awaitAnyTermination()`,
otherwise valid PySpark code fails with the following error:

Traceback (most recent call last):
  File "pyspark_app.py", line 182, in <module>
    spark.streams().awaitAnyTermination()
TypeError: 'StreamingQueryManager' object is not callable

Docs URL: https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#managing-streaming-queries

This change fixes the documentation line so that the method is called on the StreamingQueryManager object returned by the `streams` property:
https://spark.apache.org/docs/latest/api/python/pyspark.sql.html#pyspark.sql.streaming.StreamingQueryManager

How was this patch tested?

After changing the syntax, the error no longer occurs and the PySpark application works.

This is a docs-only change.
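
To illustrate why the parentheses cause the error, here is a minimal sketch in plain Python. The two classes are hypothetical stand-ins, not the real PySpark classes; the only assumption carried over is that `streams` is exposed as a property that returns a manager object, which is what the traceback above implies.

```python
class StreamingQueryManager:
    """Hypothetical stand-in for pyspark.sql.streaming.StreamingQueryManager."""

    def awaitAnyTermination(self, timeout=None):
        # The real method blocks until a query terminates; this stub returns at once.
        return None


class SparkSession:
    """Hypothetical stand-in for pyspark.sql.SparkSession."""

    @property
    def streams(self):
        # Exposed as a property, so it is read without parentheses.
        return StreamingQueryManager()


spark = SparkSession()

# Correct: read the property, then call the method on the manager.
spark.streams.awaitAnyTermination()

# Incorrect: the parentheses try to call the manager object itself.
try:
    spark.streams().awaitAnyTermination()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The Scala and Java examples in the guide use `spark.streams()` because there it is a method call; in Python the same name is an attribute, so the call syntax does not carry over.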

Member

@srowen srowen left a comment

Do you see any other instances like this?

@srowen srowen changed the title Fix invalid documentation for StreamingQueryManager Class [MINOR][DOCS] Fix invalid documentation for StreamingQueryManager Class May 7, 2019
@SparkQA

SparkQA commented May 7, 2019

Test build #4778 has finished for PR 24547 at commit e322c1c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -2554,11 +2554,11 @@ spark.streams().awaitAnyTermination(); // block until any one of them terminat
{% highlight python %}
spark = ... # spark session
Member

Honestly, I think we should make this a self-contained example. That may be the root cause of our non-working examples. We can do that separately later.

Member

@HyukjinKwon HyukjinKwon left a comment

I quickly double-checked. Let's also fix `sparkSession.streams()` -> `sparkSession.streams` in the doc. Looks good otherwise.
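
One way to look for other instances is to scan the guide's Python snippets for `.streams()` followed by a method call. The sketch below is only an illustration: `find_called_streams` and the sample text are hypothetical, and it assumes the snippets are wrapped in `{% highlight python %}` ... `{% endhighlight %}` tags as in the diff above. It will not catch mentions in running prose.

```python
import re

# Matches `.streams()` immediately followed by another attribute or method access.
CALLED_STREAMS = re.compile(r"\.streams\(\)\.")


def find_called_streams(text):
    """Return (line_number, line) pairs for Python snippets that call `streams()`."""
    hits = []
    in_python = False
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("{% highlight"):
            # Only Python blocks are checked; Scala and Java use `streams()` legitimately.
            in_python = stripped.startswith("{% highlight python")
        elif stripped.startswith("{% endhighlight"):
            in_python = False
        elif in_python and CALLED_STREAMS.search(line):
            hits.append((number, stripped))
    return hits


# Hypothetical sample text, not copied from the actual guide.
sample = """\
{% highlight java %}
spark.streams().awaitAnyTermination();
{% endhighlight %}
{% highlight python %}
spark = ...  # spark session
spark.streams().get(id)
spark.streams.awaitAnyTermination()
{% endhighlight %}
"""

hits = find_called_streams(sample)
print(hits)
```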

@asaf400
Contributor Author

asaf400 commented May 8, 2019

@srowen for sure. I have personally run into other places in the Spark docs with non-working examples.

I believe @HyukjinKwon is right. While I was searching for PySpark learning resources, Databricks saved me many times with full working examples that were reasonably self-contained (enough to learn from the example and apply it to my own code without actually running it).

I'm currently building a Spark cluster for data science infrastructure and a data lake.
While learning Spark and PySpark I have come across numerous documentation problems: the Spark docs would say to do something one way, that would not work, and a Google search would turn up either a full example or the same Spark example with correct syntax. Those results came mainly from Databricks and from bloggers (Medium, etc.).

For a first-time user starting out with Spark, which looks like (and is) a mature program,
the devil is in the documentation: a non-working example leads to a loss of confidence in the program.

This is the first time I have actually made a PR about something I found wrong in the official docs. I'll try to be more alert and post more PRs as I come across invalid docs.

@HyukjinKwon
Member

HyukjinKwon commented May 8, 2019

Merged to master, branch-2.4 and branch-2.3.

HyukjinKwon pushed a commit that referenced this pull request May 8, 2019
## What changes were proposed in this pull request?

Following the documentation example that calls `spark.streams().awaitAnyTermination()`,
otherwise valid PySpark code fails with the following error:

```
Traceback (most recent call last):
  File "pyspark_app.py", line 182, in <module>
    spark.streams().awaitAnyTermination()
TypeError: 'StreamingQueryManager' object is not callable
```

Docs URL: https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#managing-streaming-queries

This change fixes the documentation line so that the method is called on the StreamingQueryManager object returned by the `streams` property:
https://spark.apache.org/docs/latest/api/python/pyspark.sql.html#pyspark.sql.streaming.StreamingQueryManager

## How was this patch tested?

After changing the syntax, the error no longer occurs and the PySpark application works.

This is a docs-only change.

Closes #24547 from asaf400/patch-1.

Authored-by: Asaf Levy <asaf400@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit 09422f5)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
@asaf400 asaf400 deleted the patch-1 branch May 13, 2019 12:08
rluta pushed a commit to rluta/spark that referenced this pull request Sep 17, 2019
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Sep 26, 2019