[SPARK-19219][SQL] Fix Parquet log output defaults #16580

nicklavers
Contributor

What changes were proposed in this pull request?

Change the default Parquet logging levels to match the changes made in PR #15538, so that the log is no longer flooded with Parquet messages by default.

How was this patch tested?

Default log output when reading from parquet 1.6 files was compared with and without this change. The change eliminates the extraneous logging and makes the output readable.

@nicklavers
Contributor Author

Link to this JIRA issue: https://issues.apache.org/jira/browse/SPARK-19219
Link to original issue: https://issues.apache.org/jira/browse/SPARK-17993

@srowen
Member

srowen commented Jan 13, 2017

So the change here is really to raise the log level for non-test code, not just the test code? That seems reasonable, but are there important warnings this would suppress for users?

@nicklavers
Contributor Author

That's possible. We could narrow the scope a bit and instead set just these two loggers, which also fixes the problem:

log4j.logger.org.apache.parquet.CorruptStatistics=ERROR
log4j.logger.parquet.CorruptStatistics=ERROR
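
For readers following along, here is a minimal sketch of how the narrowed settings might sit in a log4j 1.x properties file. The file location (`conf/log4j.properties`) and the root logger and console appender lines are assumptions for illustration, not taken from this PR; only the two `CorruptStatistics` lines come from the comment above.

```properties
# Hypothetical excerpt from a user-maintained log4j 1.x file, e.g. conf/log4j.properties.
# The root logger and appender settings below are illustrative assumptions;
# keep whatever your own file already defines.
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# Narrowed scope: only the CorruptStatistics logger is raised to ERROR,
# so other Parquet warnings still reach the log.
# Both names are listed because, as far as I know, Parquet classes moved from
# the `parquet` package to `org.apache.parquet` in the 1.7.0 release.
log4j.logger.org.apache.parquet.CorruptStatistics=ERROR
log4j.logger.parquet.CorruptStatistics=ERROR
```

Limiting the override to one logger name per package leaves every other Parquet logger at its inherited level, which addresses the concern about suppressing important warnings.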

@SparkQA

SparkQA commented Jan 13, 2017

Test build #71345 has finished for PR 16580 at commit cb80164.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Jan 15, 2017

@nicklavers yeah narrowing the scope seems more conservative and desirable. Do the test configs need to be changed similarly to match your newer proposal?

@nicklavers nicklavers force-pushed the spark-19219-set_default_parquet_log_level branch from c7df9ac to cb80164 Compare January 16, 2017 17:36
@nicklavers
Contributor Author

I don't see why not

@SparkQA

SparkQA commented Jan 16, 2017

Test build #71459 has finished for PR 16580 at commit c7df9ac.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 16, 2017

Test build #71462 has finished for PR 16580 at commit 6788987.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 16, 2017

Test build #71460 has finished for PR 16580 at commit cb80164.

  • This patch fails from timeout after a configured wait of `250m`.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Jan 17, 2017

Merged to master

@asfgit asfgit closed this in 0019005 Jan 17, 2017
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
## What changes were proposed in this pull request?

Changing the default parquet logging levels to reflect the changes made in PR [apache#15538](apache#15538), in order to prevent the flood of log messages by default.

## How was this patch tested?

Default log output when reading from parquet 1.6 files was compared with and without this change. The change eliminates the extraneous logging and makes the output readable.

Author: Nick Lavers <nick.lavers@videoamp.com>

Closes apache#16580 from nicklavers/spark-19219-set_default_parquet_log_level.
cmonkey pushed a commit to cmonkey/spark that referenced this pull request Feb 15, 2017
@mallman mallman deleted the spark-19219-set_default_parquet_log_level branch March 24, 2017 20:56