Skip to content

[SPARK-30653][INFRA][SQL] EOL character enforcement for java/scala/xml/py/R files #27365

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

HeartSaVioR
Copy link
Contributor

@HeartSaVioR HeartSaVioR commented Jan 26, 2020

What changes were proposed in this pull request?

This patch converts CR/LF into LF in 3 source files, which most files are only using LF. This patch also add rules to enforce EOL as LF for all java, scala, xml, py, R files.

Why are the changes needed?

The majority of source code files are using LF and only three files are CR/LF. While using IDE would let us don't bother with the difference, it still has a chance to make unnecessary diff if the file is modified with the editor which doesn't handle it automatically.

Does this PR introduce any user-facing change?

No

How was this patch tested?

grep -IUrl --color "^M" . | grep "\.java\|\.scala\|\.xml\|\.py\|\.R" | grep -v "/target/" | grep -v "/build/" | grep -v "/dist/" | grep -v "dependency-reduced-pom.xml" | grep -v ".pyc"

(Please note you'll need to type CTRL+V -> CTRL+M in bash shell to get ^M because it's representing CR/LF, not a combination of ^ and M.)

Before the patch, the result is:

./sql/core/src/main/java/org/apache/spark/sql/execution/columnar/ColumnDictionary.java
./sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/complexTypesSuite.scala
./sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ComplexTypes.scala

and after the patch, the result is None.

And git shows WARNING message if EOL of any of source files in given types are modified to CR/LF, like below:

warning: CRLF will be replaced by LF in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala.
The file will have its original line endings in your working directory.

@HeartSaVioR
Copy link
Contributor Author

I think it's ideal to define the rules in .gitattributes, but would like to hear the voices before moving forward.

@srowen
Copy link
Member

srowen commented Jan 26, 2020

If it's just a few files, OK. I think it's also fine to have git enforce it. Is there any downside to that?

@SparkQA
Copy link

SparkQA commented Jan 26, 2020

Test build #117425 has finished for PR 27365 at commit 25d3e61.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • public final class ColumnDictionary implements Dictionary

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Jan 27, 2020

@HeartSaVioR . When I follow the direction in the PR description at the master branch, the result is different. Did I miss something?

$ git log --oneline -n1
43d9c7e7e5 (HEAD -> master, apache/master, apache/HEAD) [SPARK-30640][PYTHON][SQL] Prevent unnecessary copies of data during Arrow to Pandas conversion

$ grep -IUrl --color "^M" . | grep "\.java\|\.scala\|\.xml\|\.py\|\.R" | grep -v "/target/" | grep -v "/build/" | grep -v "/dist/" | grep -v "dependency-reduced-pom.xml" | grep -v ".pyc"
./python/pyspark/_globals.py
./python/pyspark/heapq3.py
./python/pyspark/mllib/linalg/__init__.py
./python/pyspark/shuffle.py
./python/pyspark/ml/linalg/__init__.py
./R/pkg/vignettes/sparkr-vignettes.Rmd
./examples/src/main/python/logistic_regression.py
./dev/github_jira_sync.py

@HeartSaVioR
Copy link
Contributor Author

@dongjoon-hyun ^M in the PR description is CR/LF, so you may want to type CTRL+V -> CTRL+M in bash shell to get it. I'll update the PR description.

@HeartSaVioR HeartSaVioR changed the title [MINOR][SQL] Convert CRLF into LF in source files [MINOR][SQL] Convert CRLF into LF in source files, and enforce the EOL to LF in gitattributes Jan 27, 2020
@HeartSaVioR HeartSaVioR changed the title [MINOR][SQL] Convert CRLF into LF in source files, and enforce the EOL to LF in gitattributes [MINOR][SQL] Convert CRLF into LF in source files, and enforce the EOL for java/scala/xml/py/R files to LF in gitattributes Jan 27, 2020
@HeartSaVioR
Copy link
Contributor Author

HeartSaVioR commented Jan 27, 2020

I think it's also fine to have git enforce it. Is there any downside to that?

I don't think there's outstanding downside, as if it does have some considerable downsides someone should have been complained. Only 3 files were CR/LF and others have been LF. (cmd/bat files are enforced to have CR/LF as EOL, as they're only used in Windows OS.)

@SparkQA
Copy link

SparkQA commented Jan 27, 2020

Test build #117430 has finished for PR 27365 at commit a5d975c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun dongjoon-hyun changed the title [MINOR][SQL] Convert CRLF into LF in source files, and enforce the EOL for java/scala/xml/py/R files to LF in gitattributes [SPARK-30653][SQL] EOL character enforcement for java/scala/xml/py/R files Jan 27, 2020
@dongjoon-hyun dongjoon-hyun changed the title [SPARK-30653][SQL] EOL character enforcement for java/scala/xml/py/R files [SPARK-30653][INFRA][SQL] EOL character enforcement for java/scala/xml/py/R files Jan 27, 2020
Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Merged to master.
Thank you, @HeartSaVioR and @srowen .

@HeartSaVioR
Copy link
Contributor Author

Thanks all for reviewing amd merging! Thanks again @dongjoon-hyun to deal with filing JIRA and connecting this.

@HeartSaVioR HeartSaVioR deleted the MINOR-remove-CRLF-in-source-codes branch January 27, 2020 20:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants