PeterPtroc
Contributor

Description of PR

Add a RISC-V-specific compilation unit: org/apache/hadoop/util/bulk_crc32_riscv.c.

  • Contains a no-op constructor reserved for future HW capability detection and dispatch.
  • Keeps runtime behavior unchanged (falls back to the generic software path in bulk_crc32.c).
  • Wire CMake to select bulk_crc32_riscv.c on riscv32/riscv64, mirroring other platforms.
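A minimal sketch of what a scaffolding unit like this could look like. This is not the actual contents of bulk_crc32_riscv.c; the names crc_update_fn, crc32c_sw, crc32c_update, init_riscv_crc_support, and crc32c are hypothetical and chosen only for illustration. It shows a constructor that does nothing yet, so every call still goes through the generic software path.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical function-pointer type for a CRC update routine. */
typedef uint32_t (*crc_update_fn)(uint32_t crc, const uint8_t *buf, size_t len);

/* Generic software path: bitwise CRC32C (reflected polynomial 0x82F63B78). */
static uint32_t crc32c_sw(uint32_t crc, const uint8_t *buf, size_t len) {
  while (len--) {
    crc ^= *buf++;
    for (int bit = 0; bit < 8; bit++) {
      /* Shift out one bit; XOR in the polynomial if that bit was set. */
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
  }
  return crc;
}

/* Dispatch pointer (assumption: illustrative only). Defaults to software. */
static crc_update_fn crc32c_update = crc32c_sw;

#if defined(__riscv)
/* Runs at library load time on RISC-V builds only. Deliberately empty:
 * a later change could probe for hardware support here and repoint
 * crc32c_update. Until then, behavior is unchanged. */
static void __attribute__((constructor)) init_riscv_crc_support(void) {
}
#endif

/* One-shot CRC32C with the usual initial value and final inversion. */
uint32_t crc32c(const uint8_t *buf, size_t len) {
  return ~crc32c_update(0xFFFFFFFFu, buf, len);
}
```

Because the constructor body is empty and guarded by the architecture macro, other platforms never compile it, and RISC-V builds produce the same checksums as before.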

This PR establishes the foundational build infrastructure for future RISC-V Zbc (CLMUL) CRC32/CRC32C acceleration without changing current behavior. Follow-ups (HADOOP-19655) will introduce HW-accelerated implementations and runtime dispatch.
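For readers unfamiliar with carry-less multiplication, here is a portable C sketch of what the Zbc clmul and clmulh instructions compute for 64-bit operands. The function names clmul64 and clmulh64 are illustrative, and this is a software emulation only, not the planned accelerated implementation.

```c
#include <stdint.h>

/* Low 64 bits of the carry-less (XOR-based) product of a and b. */
uint64_t clmul64(uint64_t a, uint64_t b) {
  uint64_t result = 0;
  for (int i = 0; i < 64; i++) {
    if ((b >> i) & 1u) {
      result ^= a << i;   /* partial products are combined with XOR, not addition */
    }
  }
  return result;
}

/* High 64 bits of the same 128-bit carry-less product. */
uint64_t clmulh64(uint64_t a, uint64_t b) {
  uint64_t result = 0;
  for (int i = 1; i < 64; i++) {
    if ((b >> i) & 1u) {
      result ^= a >> (64 - i);   /* bits of (a << i) that overflow past bit 63 */
    }
  }
  return result;
}
```

Treating each operand as a polynomial over GF(2), clmul64(3, 3) is (x + 1)^2 = x^2 + 1, that is 5. CRC folding techniques build on this operation, which is why hardware support for it can speed up CRC32 and CRC32C.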

How was this patch tested?

  • Ensured native build for hadoop-common compiles cleanly with RISC-V selection.
  • Verified by test_bulk_crc32.
  • No new tests added, as this patch is scaffolding-only without any behavior change.
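As a rough illustration of what a bulk CRC check verifies, the sketch below computes one CRC32 per fixed-size chunk and reports the first chunk whose checksum does not match. It is not the source of test_bulk_crc32; the helpers crc32_ieee, checksum_chunks, and verify_chunks are hypothetical names, and the bitwise loop stands in for whatever implementation is selected at build time.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (zlib/IEEE, reflected polynomial 0xEDB88320). */
static uint32_t crc32_ieee(const uint8_t *buf, size_t len) {
  uint32_t crc = 0xFFFFFFFFu;
  while (len--) {
    crc ^= *buf++;
    for (int bit = 0; bit < 8; bit++) {
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
  }
  return ~crc;
}

/* Writes one checksum per chunk into sums_out and returns the chunk count.
 * The last chunk may be shorter. Returns 0 if bytes_per_chunk is 0. */
size_t checksum_chunks(const uint8_t *data, size_t len,
                       size_t bytes_per_chunk, uint32_t *sums_out) {
  size_t count = 0;
  if (bytes_per_chunk == 0) return 0;
  while (len > 0) {
    size_t take = len < bytes_per_chunk ? len : bytes_per_chunk;
    sums_out[count++] = crc32_ieee(data, take);
    data += take;
    len -= take;
  }
  return count;
}

/* Returns the index of the first mismatching chunk, -1 if every chunk
 * matches, or -2 if bytes_per_chunk is 0. */
long verify_chunks(const uint8_t *data, size_t len,
                   size_t bytes_per_chunk, const uint32_t *sums) {
  long index = 0;
  if (bytes_per_chunk == 0) return -2;
  while (len > 0) {
    size_t take = len < bytes_per_chunk ? len : bytes_per_chunk;
    if (crc32_ieee(data, take) != sums[index]) return index;
    data += take;
    len -= take;
    index++;
  }
  return -1;
}
```

Since this patch is scaffolding only, a check of this kind should give identical results with and without it.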

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue ID (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|------|-----------|---------|---------|---------|
| +0 🆗 | reexec | 22m 29s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 ❌ | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 41m 17s | | trunk passed |
| +1 💚 | compile | 14m 21s | | trunk passed |
| -1 ❌ | mvnsite | 1m 56s | /branch-mvnsite-hadoop-common-project_hadoop-common.txt | hadoop-common in trunk failed. |
| +1 💚 | shadedclient | 94m 44s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 1m 11s | | the patch passed |
| +1 💚 | compile | 13m 13s | | the patch passed |
| +1 💚 | cc | 13m 13s | | the patch passed |
| +1 💚 | golang | 13m 13s | | the patch passed |
| +1 💚 | javac | 13m 13s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| -1 ❌ | mvnsite | 1m 54s | /patch-mvnsite-hadoop-common-project_hadoop-common.txt | hadoop-common in the patch failed. |
| +1 💚 | shadedclient | 38m 33s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 23m 54s | | hadoop-common in the patch passed. |
| +1 💚 | asflicense | 1m 57s | | The patch does not generate ASF License warnings. |
| | | 198m 14s | | |
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7903/1/artifact/out/Dockerfile
GITHUB PR #7903
Optional Tests dupname asflicense compile cc mvnsite javac unit codespell detsecrets golang
uname Linux af19edaaec5b 5.15.0-143-generic #153-Ubuntu SMP Fri Jun 13 19:10:45 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 1b159b6
Default Java Red Hat, Inc.-1.8.0_312-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7903/1/testReport/
Max. process+thread count 1279 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7903/1/console
versions git=2.27.0 maven=3.6.3
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|------|-----------|---------|---------|---------|
| +0 🆗 | reexec | 0m 35s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 ❌ | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 41m 2s | | trunk passed |
| +1 💚 | compile | 14m 4s | | trunk passed |
| -1 ❌ | mvnsite | 1m 55s | /branch-mvnsite-hadoop-common-project_hadoop-common.txt | hadoop-common in trunk failed. |
| +1 💚 | shadedclient | 94m 56s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 1m 12s | | the patch passed |
| +1 💚 | compile | 13m 25s | | the patch passed |
| +1 💚 | cc | 13m 25s | | the patch passed |
| +1 💚 | golang | 13m 25s | | the patch passed |
| +1 💚 | javac | 13m 25s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| -1 ❌ | mvnsite | 1m 57s | /patch-mvnsite-hadoop-common-project_hadoop-common.txt | hadoop-common in the patch failed. |
| +1 💚 | shadedclient | 38m 42s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 23m 39s | | hadoop-common in the patch passed. |
| +1 💚 | asflicense | 1m 53s | | The patch does not generate ASF License warnings. |
| | | 176m 33s | | |
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7903/2/artifact/out/Dockerfile
GITHUB PR #7903
Optional Tests dupname asflicense compile cc mvnsite javac unit codespell detsecrets golang
uname Linux c0759ac9ab3e 5.15.0-143-generic #153-Ubuntu SMP Fri Jun 13 19:10:45 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 3d607d4
Default Java Red Hat, Inc.-1.8.0_312-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7903/2/testReport/
Max. process+thread count 2145 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7903/2/console
versions git=2.27.0 maven=3.6.3
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@PeterPtroc
Contributor Author

Due to some CI infrastructure issues, I will paste the result of validating this patch on a RISC-V machine. Below are the command and the results.

Command:

mvn -Pnative \
  -Dtest=org.apache.hadoop.util.TestNativeCrc32 \
  -Djava.library.path="$HADOOP_COMMON_LIB_NATIVE_DIR" \
  test

Results

[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.util.TestNativeCrc32
[INFO] Tests run: 22, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 30.72 s -- in org.apache.hadoop.util.TestNativeCrc32
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 22, Failures: 0, Errors: 0, Skipped: 0

@PeterPtroc PeterPtroc marked this pull request as ready for review August 27, 2025 16:33
@PeterPtroc
Contributor Author

Hi @pan3793 @slfan1989 , could you please take a look when you have a moment? This PR adds RISC-V CRC32 scaffolding and keeps behavior unchanged. Happy to address any feedback. Thanks!

@pan3793
Member

pan3793 commented Aug 28, 2025

@PeterPtroc I suppose most developers here do not have a RISC-V environment. Is it possible to provide a doc on how to verify this using QEMU or other common tools?

@PeterPtroc
Contributor Author

PeterPtroc commented Aug 29, 2025

@pan3793 Thanks for the suggestion! Below is a concise doc to verify the correctness of the crc32riscv implementation:

I mainly verify on RISC‑V by using QEMU together with the openEuler RISC‑V image.

Download the image

From the image link above, download these four files: RISCV_VIRT_CODE.fd, RISCV_VIRT_VARS.fd, openEuler-25.03-riscv64.qcow2.xz, and start_vm.sh. Then log in as root with the password openEuler12#$.

Install required packages

yum install -y gcc gcc-c++ gcc-gfortran libgcc cmake
yum install -y wget openssl openssl-devel zlib zlib-devel automake libtool make libstdc++-static glibc-static git snappy snappy-devel fuse fuse-devel doxygen clang cyrus-sasl cyrus-sasl-devel libtirpc libtirpc-devel
yum install -y java-17-openjdk.riscv64 java-17-openjdk-devel.riscv64 java-17-openjdk-headless.riscv64

Install Protobuf 2.5.0 (with RISC‑V patches)

mkdir protobuf && cd protobuf

# Fetch sources
git clone https://gitee.com/src-openeuler/protobuf2.git
cd protobuf2
tar -xjf protobuf-2.5.0.tar.bz2
cp *.patch protobuf-2.5.0 && cd protobuf-2.5.0

# Apply patches (adds riscv64 support and build fixes)
patch -p1 < 0001-Add-generic-GCC-support-for-atomic-operations.patch
patch -p1 < protobuf-2.5.0-gtest.patch
patch -p1 < protobuf-2.5.0-java-fixes.patch
patch -p1 < protobuf-2.5.0-makefile.patch
patch -p1 < add-riscv64-support.patch

# Autotools setup
libtoolize
yum install -y automake
automake-1.17 -a
chmod +x configure

# Configure, build, install
./configure --build=riscv64-unknown-linux --prefix=/usr/local/protobuf-2.5.0
make
make check
make install
ldconfig

# Publish protoc 2.5.0 into local Maven repo (riscv64 classifier)
mvn install:install-file \
  -DgroupId=com.google.protobuf \
  -DartifactId=protoc \
  -Dversion=2.5.0 \
  -Dclassifier=linux-riscv64 \
  -Dpackaging=exe \
  -Dfile=/usr/local/protobuf-2.5.0/bin/protoc

cd ..

Install Protobuf 3.25.5

# Download and unpack
wget -c https://github.com/protocolbuffers/protobuf/releases/download/v25.5/protobuf-25.5.tar.gz
tar -xzf protobuf-25.5.tar.gz
cd protobuf-25.5

# Abseil dependency
git clone https://github.com/abseil/abseil-cpp third_party/abseil-cpp

# Configure and build
cmake ./ \
  -DCMAKE_BUILD_TYPE=RELEASE \
  -Dprotobuf_BUILD_TESTS=off \
  -DCMAKE_CXX_STANDARD=20 \
  -DCMAKE_INSTALL_PREFIX=/usr/local/protobuf-3.25.5

make install -j "$(nproc)"

# Publish protoc 3.25.5 into local Maven repo (riscv64 classifier)
mvn install:install-file \
  -DgroupId=com.google.protobuf \
  -DartifactId=protoc \
  -Dversion=3.25.5 \
  -Dclassifier=linux-riscv64 \
  -Dpackaging=exe \
  -Dfile=/usr/local/protobuf-3.25.5/bin/protoc

# Make protoc available on PATH and verify
sudo ln -sfn /usr/local/protobuf-3.25.5/bin/protoc /usr/local/bin/protoc
protoc --version

Verify CRC32 using Hadoop native

# Clone Hadoop
git clone https://github.com/apache/hadoop.git
cd hadoop

# Increase Maven memory
export MAVEN_OPTS="-Xmx8g -Xms6g"

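# Build Hadoop Common (native enabled). The build runs in the background,
# so wait for it to finish (check build.log) before continuing with the steps below.
nohup mvn -pl hadoop-common-project/hadoop-common -am -Pnative -DskipTests clean install > build.log 2>&1 &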

# Point to built native library directory
cd hadoop-common-project/hadoop-common
export HADOOP_COMMON_LIB_NATIVE_DIR="$PWD/target/native/target/usr/local/lib"
export LD_LIBRARY_PATH="$HADOOP_COMMON_LIB_NATIVE_DIR:$LD_LIBRARY_PATH"

# Run the CRC32 native test
nohup mvn -Pnative -Dtest=org.apache.hadoop.util.TestNativeCrc32 \
  -Djava.library.path="$HADOOP_COMMON_LIB_NATIVE_DIR" test > test.log 2>&1 &

@PeterPtroc
Contributor Author

Hi @cnauroth , could you please have a look? This PR adds RISC-V CRC32 scaffolding and keeps behavior unchanged. Thanks!

Member

@pan3793 pan3793 left a comment


I created a RISC-V Hadoop dev container (see #7924) and verified this PR by running

mvn -Pnative -pl :hadoop-common clean install -DskipTests -am
mvn -Pnative -pl :hadoop-common test -Dtest=org.apache.hadoop.util.TestNativeCrc32

I got the same results both with and without this patch.

@PeterPtroc
Contributor Author

Hi @brumi1024 , this PR has been open for a while. Could you please take a look when you have time? Thanks!

@PeterPtroc
Contributor Author

@steveloughran 
Hi, this PR has been open for a while. Could you please take a look when you have time? Thanks!

Contributor

@steveloughran steveloughran left a comment


+1 pending that newline.
This is the basic structure, and if it doesn't break the build on other systems, it's not creating any issues.

Co-authored-by: gong-flying <gongxiaofei24@iscas.ac.cn>
@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|------|-----------|---------|---------|---------|
| +0 🆗 | reexec | 31m 41s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 ❌ | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | | _ trunk Compile Tests _ |
| -1 ❌ | mvninstall | 0m 35s | /branch-mvninstall-root.txt | root in trunk failed. |
| -1 ❌ | compile | 0m 34s | /branch-compile-root.txt | root in trunk failed. |
| -1 ❌ | mvnsite | 0m 35s | /branch-mvnsite-hadoop-common-project_hadoop-common.txt | hadoop-common in trunk failed. |
| +1 💚 | shadedclient | 3m 12s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| -1 ❌ | mvninstall | 0m 34s | /patch-mvninstall-hadoop-common-project_hadoop-common.txt | hadoop-common in the patch failed. |
| -1 ❌ | compile | 0m 34s | /patch-compile-root.txt | root in the patch failed. |
| -1 ❌ | cc | 0m 34s | /patch-compile-root.txt | root in the patch failed. |
| -1 ❌ | golang | 0m 34s | /patch-compile-root.txt | root in the patch failed. |
| -1 ❌ | javac | 0m 34s | /patch-compile-root.txt | root in the patch failed. |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| -1 ❌ | mvnsite | 0m 35s | /patch-mvnsite-hadoop-common-project_hadoop-common.txt | hadoop-common in the patch failed. |
| +1 💚 | shadedclient | 1m 52s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| -1 ❌ | unit | 0m 34s | /patch-unit-hadoop-common-project_hadoop-common.txt | hadoop-common in the patch failed. |
| +0 🆗 | asflicense | 0m 34s | | ASF License check generated no output? |
| | | 41m 2s | | |
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7903/3/artifact/out/Dockerfile
GITHUB PR #7903
Optional Tests dupname asflicense compile cc mvnsite javac unit codespell detsecrets golang
uname Linux 9c4b8be6f414 5.15.0-152-generic #162-Ubuntu SMP Wed Jul 23 09:48:42 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / d4a02dc
Default Java Red Hat, Inc.-1.8.0_462-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7903/3/testReport/
Max. process+thread count 29 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7903/3/console
versions git=2.43.7 maven=3.9.11
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|------|-----------|---------|---------|---------|
| +0 🆗 | reexec | 0m 44s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 ❌ | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | | _ trunk Compile Tests _ |
| -1 ❌ | mvninstall | 1m 8s | /branch-mvninstall-root.txt | root in trunk failed. |
| -1 ❌ | compile | 1m 9s | /branch-compile-root.txt | root in trunk failed. |
| -1 ❌ | mvnsite | 0m 37s | /branch-mvnsite-hadoop-common-project_hadoop-common.txt | hadoop-common in trunk failed. |
| +1 💚 | shadedclient | 4m 24s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| -1 ❌ | mvninstall | 0m 26s | /patch-mvninstall-hadoop-common-project_hadoop-common.txt | hadoop-common in the patch failed. |
| -1 ❌ | compile | 0m 28s | /patch-compile-root.txt | root in the patch failed. |
| -1 ❌ | cc | 0m 28s | /patch-compile-root.txt | root in the patch failed. |
| -1 ❌ | golang | 0m 28s | /patch-compile-root.txt | root in the patch failed. |
| -1 ❌ | javac | 0m 28s | /patch-compile-root.txt | root in the patch failed. |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| -1 ❌ | mvnsite | 1m 31s | /patch-mvnsite-hadoop-common-project_hadoop-common.txt | hadoop-common in the patch failed. |
| -1 ❌ | shadedclient | 2m 54s | | patch has errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| -1 ❌ | unit | 0m 37s | /patch-unit-hadoop-common-project_hadoop-common.txt | hadoop-common in the patch failed. |
| +0 🆗 | asflicense | 0m 37s | | ASF License check generated no output? |
| | | 12m 9s | | |
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7903/4/artifact/out/Dockerfile
GITHUB PR #7903
Optional Tests dupname asflicense compile cc mvnsite javac unit codespell detsecrets golang
uname Linux c62a15e0c2ed 5.15.0-152-generic #162-Ubuntu SMP Wed Jul 23 09:48:42 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 8fee308
Default Java Red Hat, Inc.-1.8.0_462-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7903/4/testReport/
Max. process+thread count 51 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7903/4/console
versions git=2.43.7 maven=3.9.11
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@PeterPtroc
Contributor Author

It seems the CI failure (unable to create new native thread) is due to a resource issue on the build agent, not related to the code changes.

Contributor

@steveloughran steveloughran left a comment


+1

@steveloughran steveloughran merged commit c83481e into apache:trunk Oct 9, 2025
1 of 2 checks passed
@steveloughran
Contributor

Merged

  • Ignored the CI failure; it does that sometimes, and as your code is #ifdef'd out, I'm not worried. Anything bigger and we'd have to retry the CI run.
  • Added a [RISC-V] category for this change; if future work does the same, it'll be consistent.

@PeterPtroc
Contributor Author

Thanks @pan3793 @steveloughran for the review and the merge!
