[SPARK-5789][SQL] Throw a better error message if JsonRDD.parseJson encounters unrecoverable parsing errors. #4582

Closed
wants to merge 2 commits into from

yhuai
Contributor

@yhuai yhuai commented Feb 13, 2015

No description provided.

…span multiple records (lines for files or strings for an RDD of strings).
@SparkQA
SparkQA commented Feb 13, 2015

Test build #27415 has started for PR 4582 at commit 1466256.

  • This patch merges cleanly.

    case _ =>
      sys.error(
        s"Failed to parse record $record. Please make sure that " +
        "every record in the dataset is a JSON object or an array of JSON objects.")
Contributor

I'd say something like "each line of the file is a valid JSON object or array", since it's not obvious to non-Hadoop people that this is how records might be split.
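
To make the suggestion concrete, here is a minimal, self-contained sketch of the check and the reworded message. It is not the code in this PR: `RecordCheck`, `looksLikeCompleteJson`, and `checkRecord` are hypothetical names, the bracket-matching scan is only a rough stand-in for a real JSON parser, and the message text is one possible wording rather than the one that was merged.

```scala
// Hypothetical helper, not part of this PR: a rough structural check that
// stands in for a real JSON parser so the sketch needs no dependencies.
object RecordCheck {

  /** True if `record` starts with '{' or '[' and every bracket closes by its end. */
  def looksLikeCompleteJson(record: String): Boolean = {
    val s = record.trim
    if (s.isEmpty || (s.head != '{' && s.head != '[')) return false
    var stack = List.empty[Char] // expected closing brackets, innermost first
    var inString = false
    var escaped = false
    var i = 0
    while (i < s.length) {
      val c = s.charAt(i)
      if (inString) {
        // Inside a string literal brackets do not count; only track escapes.
        if (escaped) escaped = false
        else if (c == '\\') escaped = true
        else if (c == '"') inString = false
      } else {
        c match {
          case '"' => inString = true
          case '{' => stack = '}' :: stack
          case '[' => stack = ']' :: stack
          case '}' | ']' =>
            if (stack.isEmpty || stack.head != c) return false
            stack = stack.tail
            // Once the outermost value closes, nothing may follow it.
            if (stack.isEmpty && i != s.length - 1) return false
          case _ => // any other character is ignored by this rough check
        }
      }
      i += 1
    }
    stack.isEmpty && !inString
  }

  /** Returns the record unchanged, or fails with a message that names the per-line rule. */
  def checkRecord(record: String): String =
    if (looksLikeCompleteJson(record)) record
    else sys.error(
      s"Failed to parse record $record. Please make sure that each line of " +
        "the file (or each string in the RDD) is a valid JSON object or " +
        "an array of JSON objects.")
}
```

With this sketch, `RecordCheck.checkRecord("""{"a": 1}""")` returns the record, while a pretty-printed object that was split by line, so that one record is just `{`, fails with the message above. That is the case the reworded message is meant to explain to users who do not know records are split per line.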

@SparkQA
SparkQA commented Feb 13, 2015

Test build #27415 has finished for PR 4582 at commit 1466256.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27415/

@SparkQA
SparkQA commented Feb 13, 2015

Test build #27417 has started for PR 4582 at commit 152dbd4.

  • This patch merges cleanly.

@SparkQA
SparkQA commented Feb 13, 2015

Test build #27417 has finished for PR 4582 at commit 152dbd4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27417/

@marmbrus
Contributor

Thanks! Merging to master and 1.3

asfgit pushed a commit that referenced this pull request Feb 13, 2015
…counters unrecoverable parsing errors.

Author: Yin Huai <yhuai@databricks.com>

Closes #4582 from yhuai/jsonErrorMessage and squashes the following commits:

152dbd4 [Yin Huai] Update error message.
1466256 [Yin Huai] Throw a better error message when a JSON object in the input dataset span multiple records (lines for files or strings for an RDD of strings).

(cherry picked from commit 2e0c084)
Signed-off-by: Michael Armbrust <michael@databricks.com>
@asfgit asfgit closed this in 2e0c084 Feb 13, 2015
@yhuai yhuai deleted the jsonErrorMessage branch February 13, 2015 22:14