
[SPARK-32231][R][INFRA] Use Hadoop 3.2 winutils in AppVeyor build #29042

Closed
HyukjinKwon wants to merge 3 commits into apache:master from HyukjinKwon:SPARK-32231

Conversation

HyukjinKwon
Member

@HyukjinKwon HyukjinKwon commented Jul 8, 2020

What changes were proposed in this pull request?

This PR proposes to use Hadoop 3 winutils to make the AppVeyor build pass. It is currently failing as shown below:
https://ci.appveyor.com/project/ApacheSoftwareFoundation/spark/builds/33976604
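For context, the AppVeyor setup downloads winutils and points `HADOOP_HOME` at it before running the SparkR tests, so the fix is essentially to fetch binaries built for Hadoop 3.x instead of 2.x. A minimal bash sketch of that idea (the cdarlint/winutils mirror, version, and directory layout are assumptions for illustration, not necessarily what the install script does):

```bash
# Illustrative sketch only: download winutils built for Hadoop 3.2 and expose it
# via HADOOP_HOME so Hadoop's Windows native bits (winutils.exe, hadoop.dll) are found.
HADOOP_VERSION=3.2.0
WINUTILS_REPO=https://github.com/cdarlint/winutils   # assumed mirror of per-version winutils builds

curl -sL -o winutils.zip "$WINUTILS_REPO/archive/master.zip"
unzip -q winutils.zip
export HADOOP_HOME="$PWD/winutils-master/hadoop-$HADOOP_VERSION"
export PATH="$HADOOP_HOME/bin:$PATH"                 # tests pick up winutils.exe from here
```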

Why are the changes needed?

To recover the build in AppVeyor.

Does this PR introduce any user-facing change?

No, dev-only.

How was this patch tested?

AppVeyor build will test it out.

@HyukjinKwon
Member Author

cc @dongjoon-hyun FYI

@dongjoon-hyun
Member

dongjoon-hyun commented Jul 8, 2020

Thank you, @HyukjinKwon ! Let's see.

@dongjoon-hyun
Member

cc @steveloughran , too.

@SparkQA

SparkQA commented Jul 8, 2020

Test build #125378 has finished for PR 29042 at commit 492af87.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

There are 4 failures in org.apache.hadoop.io.nativeio.NativeIO$POSIX.stat. It seems to be a Hadoop 3.0 bug on Windows.

```
== testthat results  ===========================================================
[ OK: 13 | SKIPPED: 0 | WARNINGS: 0 | FAILED: 0 ]
v |  OK F W S | Context
v |  11       | binary functions [35.6 s]
x |   0 4     | functions on binary files [7.6 s]
--------------------------------------------------------------------------------
test_binaryFile.R:34: error: saveAsObjectFile()/objectFile() following textFile() works
java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$POSIX.stat(Ljava/lang/String;)Lorg/apache/hadoop/io/nativeio/NativeIO$POSIX$Stat;
	at org.apache.hadoop.io.nativeio.NativeIO$POSIX.stat(Native Method)
	at org.apache.hadoop.io.nativeio.NativeIO$POSIX.getStat(NativeIO.java:460)
```
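
For reference, an `UnsatisfiedLinkError` on `NativeIO$POSIX.stat` usually means the native Windows bits (`hadoop.dll`/`winutils.exe`) on the path were built for a different Hadoop version than the jars on the classpath, so a native method added in Hadoop 3.x cannot be resolved. A hedged diagnostic sketch (output and Windows support of `checknative` vary by release):

```bash
# Sanity-check sketch: compare the Hadoop version on the classpath with the native
# library actually being loaded; a mismatch commonly shows up as UnsatisfiedLinkError.
hadoop version
hadoop checknative -a        # reports whether the native hadoop library loads
ls "$HADOOP_HOME/bin"        # expect winutils.exe and hadoop.dll built for the same version
```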

@dongjoon-hyun
Member

dongjoon-hyun commented Jul 8, 2020

Hi, @steveloughran. Sorry, but do you know if there is a known issue with Hadoop 3.x on Windows? Does Hadoop still support Windows, or has that been dropped?

dongjoon-hyun added a commit that referenced this pull request Jul 8, 2020
### What changes were proposed in this pull request?

This PR aims to disable SBT `unidoc` generation testing in the Jenkins environment because it is flaky there and is not used for the official documentation generation. Also, GitHub Actions has the correct test coverage for the official documentation generation.

- #28848 (comment) (amp-jenkins-worker-06)
- #28926 (comment) (amp-jenkins-worker-06)
- #28969 (comment) (amp-jenkins-worker-06)
- #28975 (comment) (amp-jenkins-worker-05)
- #28986 (comment)  (amp-jenkins-worker-05)
- #28992 (comment) (amp-jenkins-worker-06)
- #28993 (comment) (amp-jenkins-worker-05)
- #28999 (comment) (amp-jenkins-worker-04)
- #29010 (comment) (amp-jenkins-worker-03)
- #29013 (comment) (amp-jenkins-worker-04)
- #29016 (comment) (amp-jenkins-worker-05)
- #29025 (comment) (amp-jenkins-worker-04)
- #29042 (comment) (amp-jenkins-worker-03)

### Why are the changes needed?

Apache Spark's `release-build.sh` generates the official documentation using the following command:
- https://github.com/apache/spark/blob/master/dev/create-release/release-build.sh#L341

```bash
PRODUCTION=1 RELEASE_VERSION="$SPARK_VERSION" jekyll build
```

This, in turn, runs the following `unidoc` command to build the Scala/Java API docs:
- https://github.com/apache/spark/blob/master/docs/_plugins/copy_api_dirs.rb#L30

```ruby
system("build/sbt -Pkinesis-asl clean compile unidoc") || raise("Unidoc generation failed")
```

However, the PR builder has the `Jekyll build` step disabled and instead relies on different test coverage:
```python
# determine if docs were changed and if we're inside the amplab environment
# note - the below commented out until *all* Jenkins workers can get `jekyll` installed
# if "DOCS" in changed_modules and test_env == "amplab_jenkins":
#    build_spark_documentation()
```

```
Building Unidoc API Documentation
========================================================================
[info] Building Spark unidoc using SBT with these arguments:
-Phadoop-3.2 -Phive-2.3 -Pspark-ganglia-lgpl -Pkubernetes -Pmesos
-Phadoop-cloud -Phive -Phive-thriftserver -Pkinesis-asl -Pyarn unidoc
```
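
For local verification, the same `unidoc` target can be run directly; a sketch reusing the profile list shown in the Jenkins log above:

```bash
# Sketch: reproduce the doc-generation check locally with the profiles from the log above.
./build/sbt -Phadoop-3.2 -Phive-2.3 -Pspark-ganglia-lgpl -Pkubernetes -Pmesos \
  -Phadoop-cloud -Phive -Phive-thriftserver -Pkinesis-asl -Pyarn unidoc
```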

### Does this PR introduce _any_ user-facing change?

No. (This is used only for testing and not used in the official doc generation.)

### How was this patch tested?

Pass Jenkins without invoking doc generation.

Closes #29017 from dongjoon-hyun/SPARK-DOC-GEN.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@dongjoon-hyun
Member

dongjoon-hyun commented Jul 9, 2020

For this PR, I'm testing on Windows. I'll make a PR to you, @HyukjinKwon .

@dongjoon-hyun
Member

dongjoon-hyun commented Jul 9, 2020

Hi, @HyukjinKwon . I made a PR to you. Could you review and merge that?

@HyukjinKwon
Member Author

Awesome!

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-32231][R][INFRA] Use Hadoop 3 winutils in AppVeyor build [SPARK-32231][R][INFRA] Use Hadoop 3.2.0 winutils in AppVeyor build Jul 9, 2020
@dongjoon-hyun dongjoon-hyun changed the title [SPARK-32231][R][INFRA] Use Hadoop 3.2.0 winutils in AppVeyor build [SPARK-32231][R][INFRA] Use Hadoop 3.2 winutils in AppVeyor build Jul 9, 2020
@dongjoon-hyun dongjoon-hyun left a comment
Member

+1, LGTM. AppVeyor passed.

@HyukjinKwon
Member Author

Merged to master.

Thanks, @dongjoon-hyun.

@SparkQA

SparkQA commented Jul 9, 2020

Test build #125440 has finished for PR 29042 at commit f432ddc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon HyukjinKwon deleted the SPARK-32231 branch July 27, 2020 07:43
dongjoon-hyun pushed a commit that referenced this pull request Nov 14, 2023
### What changes were proposed in this pull request?
The PR aims to use Hadoop `3.3.5` winutils in the AppVeyor build.

### Why are the changes needed?
The last update occurred 3 years ago, in #29042.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass `AppVeyor` test.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #43777 from panbingkun/appveyor_hadoop.

Authored-by: panbingkun <pbk1982@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>