Skip to content

[SPARK-18422][CORE] Fix wholeTextFiles test to pass on Windows in JavaAPISuite #15866

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

HyukjinKwon
Copy link
Member

@HyukjinKwon HyukjinKwon commented Nov 12, 2016

What changes were proposed in this pull request?

This PR fixes the test wholeTextFiles in JavaAPISuite.java. This is failed due to the different path format on Windows.

For example, the path in container was

C:\projects\spark\target\tmp\1478967560189-0/part-00000

whereas new URI(res._1()).getPath() was as below:

/C:/projects/spark/target/tmp/1478967560189-0/part-00000

How was this patch tested?

Tests in JavaAPISuite.java.

Tested via AppVeyor.

Before
Build: https://ci.appveyor.com/project/spark-test/spark/build/63-JavaAPISuite-1
Diff: master...spark-test:JavaAPISuite-1

[info] Test org.apache.spark.JavaAPISuite.wholeTextFiles started
[error] Test org.apache.spark.JavaAPISuite.wholeTextFiles failed: java.lang.AssertionError: expected:<spark is easy to use.
[error] > but was:<null>, took 0.578 sec
[error]     at org.apache.spark.JavaAPISuite.wholeTextFiles(JavaAPISuite.java:1089)
...

After
Build started: [CORE] org.apache.spark.JavaAPISuite PR-15866
Diff: master...spark-test:198DDA52-F201-4D2B-BE2F-244E0C1725B2

[info] Test org.apache.spark.JavaAPISuite.wholeTextFiles started
...

@HyukjinKwon
Copy link
Member Author

HyukjinKwon commented Nov 12, 2016

@srowen It seems writeWithNewAPIHadoopFile in org.apache.spark.JavaAPISuite is also being failed but the root cause seems different. That is being failed with the exception as below:

java.io.IOException: Could not rename file:
...

And, other some failures specified in the parent JIRA of this seems caused by the problems in ReceiverTracker we discussed in #15618

Will try to make another PR soon for this.

@SparkQA
Copy link

SparkQA commented Nov 12, 2016

Test build #68569 has finished for PR 15866 at commit 9c258a5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@srowen srowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with this, but if the point is to avoid hard-coding "/", is it sufficient to use the File(String, String) constructor? whatever works and is most consistent with other tests is OK by me. I know they're not that consistent now anyway.

@HyukjinKwon
Copy link
Member Author

HyukjinKwon commented Nov 14, 2016

Thank you Sean. Actually, this is a bit annoying.

Here is what happened in the original test.

  1. Writes a file to read back by wholeTextFiles.

    scala> new File("C:\\a", "/part-0000")
    res10: java.io.File = C:\a\part-0000
  2. and then saves the path in a map, container.

    scala> val container = Map("C:\\a/part-0000" -> "contents")
    container: scala.collection.immutable.Map[String,String] = Map(C:\a/part-0000 -> contents)
  3. then compares the key in container to the collected paths from wholeTextFiles. The path should be

    scala> val path = "file:/C:/a/part-0000"
    path: String = file:/C:/a/part-0000
    
    scala> val actualPath = new URI(path).getPath
    actualPath: String = /C:/a/part-0000
  4. It ends up with..

    scala> container.contains(actualPath)
    res4: Boolean = false

Just FWIW, (at least for myself to note this problem somewhere and re-visit sometime), it seems both back-slash and slash are fine with java.io.File on Windows.

scala> new File("C:/a")
res9: java.io.File = C:\a

scala> new File("C:\\a", "/part-0000")
res10: java.io.File = C:\a\part-0000

@HyukjinKwon
Copy link
Member Author

Let me add a comment here and will try to clean up more.

Copy link
Member

@srowen srowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I trust your judgment about how best to fix it.

@HyukjinKwon
Copy link
Member Author

Build started: [CORE] org.apache.spark.JavaAPISuite PR-15866
Diff: master...spark-test:198DDA52-F201-4D2B-BE2F-244E0C1725B2

@SparkQA
Copy link

SparkQA commented Nov 14, 2016

Test build #68615 has finished for PR 15866 at commit e6b2a03.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Nov 15, 2016

Looks OK to me @HyukjinKwon but does the appveyor result in your comment above mean that it still fails on Windows?

@HyukjinKwon
Copy link
Member Author

HyukjinKwon commented Nov 15, 2016

Oh, the test this PR tries to deal with, org.apache.spark.JavaAPISuite.wholeTextFiles is passed. However, another test in here org.apache.spark.JavaAPISuite.writeWithNewAPIHadoopFile is being failed for a different reason which I am investigating. (I just updated the PR description). This does not look like the same path problem assuming from my googling.

Before

[info] Test org.apache.spark.JavaAPISuite.wholeTextFiles started
[error] Test org.apache.spark.JavaAPISuite.wholeTextFiles failed: java.lang.AssertionError: expected:<spark is easy to use.
[error] > but was:<null>, took 0.578 sec
[error]     at org.apache.spark.JavaAPISuite.wholeTextFiles(JavaAPISuite.java:1089)
...

After

[info] Test org.apache.spark.JavaAPISuite.wholeTextFiles started
...
[info] Test org.apache.spark.JavaAPISuite.writeWithNewAPIHadoopFile started
[error] Test org.apache.spark.JavaAPISuite.writeWithNewAPIHadoopFile failed: org.apache.spark.SparkException: Job aborted., took 0.407 sec

FWIW, It seems the root exception for org.apache.spark.JavaAPISuite.writeWithNewAPIHadoopFile was

java.io.IOException: Could not rename file:/C:/projects/spark/target/tmp/1479126719376-
0/output/_temporary/0/_temporary/attempt_20161114123159_0000_r_000000_0 to 
file:/C:/projects/spark/target/tmp/1479126719376-
0/output/_temporary/0/task_20161114123159_0000_r_000000

@srowen
Copy link
Member

srowen commented Nov 18, 2016

OK if it makes the test in question here pass, let's merge it, as a fix for that test. That much is OK right @HyukjinKwon ?

@HyukjinKwon
Copy link
Member Author

Yes please. I am pretty sure that this is a right fix without any other side effects.

asfgit pushed a commit that referenced this pull request Nov 18, 2016
…aAPISuite

## What changes were proposed in this pull request?

This PR fixes the test `wholeTextFiles` in `JavaAPISuite.java`. This is failed due to the different path format on Windows.

For example, the path in `container` was

```
C:\projects\spark\target\tmp\1478967560189-0/part-00000
```

whereas `new URI(res._1()).getPath()` was as below:

```
/C:/projects/spark/target/tmp/1478967560189-0/part-00000
```

## How was this patch tested?

Tests in `JavaAPISuite.java`.

Tested via AppVeyor.

**Before**
Build: https://ci.appveyor.com/project/spark-test/spark/build/63-JavaAPISuite-1
Diff: master...spark-test:JavaAPISuite-1

```
[info] Test org.apache.spark.JavaAPISuite.wholeTextFiles started
[error] Test org.apache.spark.JavaAPISuite.wholeTextFiles failed: java.lang.AssertionError: expected:<spark is easy to use.
[error] > but was:<null>, took 0.578 sec
[error]     at org.apache.spark.JavaAPISuite.wholeTextFiles(JavaAPISuite.java:1089)
...
```

**After**
Build started: [CORE] `org.apache.spark.JavaAPISuite` [![PR-15866](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=198DDA52-F201-4D2B-BE2F-244E0C1725B2&svg=true)](https://ci.appveyor.com/project/spark-test/spark/branch/198DDA52-F201-4D2B-BE2F-244E0C1725B2)
Diff: master...spark-test:198DDA52-F201-4D2B-BE2F-244E0C1725B2

```
[info] Test org.apache.spark.JavaAPISuite.wholeTextFiles started
...
```

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #15866 from HyukjinKwon/SPARK-18422.

(cherry picked from commit 40d59ff)
Signed-off-by: Sean Owen <sowen@cloudera.com>
@srowen
Copy link
Member

srowen commented Nov 18, 2016

Merged to master/2.1

@asfgit asfgit closed this in 40d59ff Nov 18, 2016
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
…aAPISuite

## What changes were proposed in this pull request?

This PR fixes the test `wholeTextFiles` in `JavaAPISuite.java`. This is failed due to the different path format on Windows.

For example, the path in `container` was

```
C:\projects\spark\target\tmp\1478967560189-0/part-00000
```

whereas `new URI(res._1()).getPath()` was as below:

```
/C:/projects/spark/target/tmp/1478967560189-0/part-00000
```

## How was this patch tested?

Tests in `JavaAPISuite.java`.

Tested via AppVeyor.

**Before**
Build: https://ci.appveyor.com/project/spark-test/spark/build/63-JavaAPISuite-1
Diff: apache/spark@master...spark-test:JavaAPISuite-1

```
[info] Test org.apache.spark.JavaAPISuite.wholeTextFiles started
[error] Test org.apache.spark.JavaAPISuite.wholeTextFiles failed: java.lang.AssertionError: expected:<spark is easy to use.
[error] > but was:<null>, took 0.578 sec
[error]     at org.apache.spark.JavaAPISuite.wholeTextFiles(JavaAPISuite.java:1089)
...
```

**After**
Build started: [CORE] `org.apache.spark.JavaAPISuite` [![PR-15866](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=198DDA52-F201-4D2B-BE2F-244E0C1725B2&svg=true)](https://ci.appveyor.com/project/spark-test/spark/branch/198DDA52-F201-4D2B-BE2F-244E0C1725B2)
Diff: apache/spark@master...spark-test:198DDA52-F201-4D2B-BE2F-244E0C1725B2

```
[info] Test org.apache.spark.JavaAPISuite.wholeTextFiles started
...
```

Author: hyukjinkwon <gurwls223@gmail.com>

Closes apache#15866 from HyukjinKwon/SPARK-18422.
@HyukjinKwon HyukjinKwon deleted the SPARK-18422 branch January 2, 2018 03:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants