Skip to content

Conversation

@steveloughran
Copy link
Contributor

HADOOP-16109. Parquet reading S3AFileSystem causes EOF

Nobody gets seek right. No matter how many times they think they have.

Reproducible test from: Dave Christianson
Fixed seek() logic: Steve Loughran

public ITestS3AContractSeek(final String seekPolicy) {
this.seekPolicy = seekPolicy;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whitespace:end of line

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 29 Docker mode activated.
_ Prechecks _
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 mvninstall 1021 trunk passed
+1 compile 32 trunk passed
+1 checkstyle 21 trunk passed
+1 mvnsite 33 trunk passed
+1 shadedclient 672 branch has no errors when building and testing our client artifacts.
+1 findbugs 40 trunk passed
+1 javadoc 22 trunk passed
_ Patch Compile Tests _
-1 mvninstall 26 hadoop-aws in the patch failed.
-1 compile 25 hadoop-aws in the patch failed.
-1 javac 25 hadoop-aws in the patch failed.
-0 checkstyle 16 hadoop-tools/hadoop-aws: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
-1 mvnsite 26 hadoop-aws in the patch failed.
-1 whitespace 0 The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
+1 shadedclient 654 patch has no errors when building and testing our client artifacts.
-1 findbugs 20 hadoop-aws in the patch failed.
+1 javadoc 19 the patch passed
_ Other Tests _
-1 unit 28 hadoop-aws in the patch failed.
+1 asflicense 21 The patch does not generate ASF License warnings.
2763
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-539/1/artifact/out/Dockerfile
GITHUB PR #539
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux b3467d524f1e 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / dcaca19
maven version: Apache Maven 3.3.9
Default Java 1.8.0_191
findbugs v3.1.0-RC1
mvninstall https://builds.apache.org/job/hadoop-multibranch/job/PR-539/1/artifact/out/patch-mvninstall-hadoop-tools_hadoop-aws.txt
compile https://builds.apache.org/job/hadoop-multibranch/job/PR-539/1/artifact/out/patch-compile-hadoop-tools_hadoop-aws.txt
javac https://builds.apache.org/job/hadoop-multibranch/job/PR-539/1/artifact/out/patch-compile-hadoop-tools_hadoop-aws.txt
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-539/1/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt
mvnsite https://builds.apache.org/job/hadoop-multibranch/job/PR-539/1/artifact/out/patch-mvnsite-hadoop-tools_hadoop-aws.txt
whitespace https://builds.apache.org/job/hadoop-multibranch/job/PR-539/1/artifact/out/whitespace-eol.txt
findbugs https://builds.apache.org/job/hadoop-multibranch/job/PR-539/1/artifact/out/patch-findbugs-hadoop-tools_hadoop-aws.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-539/1/artifact/out/patch-unit-hadoop-tools_hadoop-aws.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-539/1/testReport/
Max. process+thread count 411 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-539/1/console
Powered by Apache Yetus 0.9.0 http://yetus.apache.org

This message was automatically generated.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 26 Docker mode activated.
_ Prechecks _
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 mvninstall 987 trunk passed
+1 compile 33 trunk passed
+1 checkstyle 23 trunk passed
+1 mvnsite 36 trunk passed
+1 shadedclient 684 branch has no errors when building and testing our client artifacts.
+1 findbugs 44 trunk passed
+1 javadoc 25 trunk passed
_ Patch Compile Tests _
-1 mvninstall 28 hadoop-aws in the patch failed.
-1 compile 27 hadoop-aws in the patch failed.
-1 javac 27 hadoop-aws in the patch failed.
-0 checkstyle 17 hadoop-tools/hadoop-aws: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
-1 mvnsite 29 hadoop-aws in the patch failed.
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 729 patch has no errors when building and testing our client artifacts.
-1 findbugs 21 hadoop-aws in the patch failed.
+1 javadoc 21 the patch passed
_ Other Tests _
-1 unit 38 hadoop-aws in the patch failed.
+1 asflicense 33 The patch does not generate ASF License warnings.
2867
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-539/2/artifact/out/Dockerfile
GITHUB PR #539
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux 08a0191f4827 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 15098df
maven version: Apache Maven 3.3.9
Default Java 1.8.0_191
findbugs v3.1.0-RC1
mvninstall https://builds.apache.org/job/hadoop-multibranch/job/PR-539/2/artifact/out/patch-mvninstall-hadoop-tools_hadoop-aws.txt
compile https://builds.apache.org/job/hadoop-multibranch/job/PR-539/2/artifact/out/patch-compile-hadoop-tools_hadoop-aws.txt
javac https://builds.apache.org/job/hadoop-multibranch/job/PR-539/2/artifact/out/patch-compile-hadoop-tools_hadoop-aws.txt
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-539/2/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt
mvnsite https://builds.apache.org/job/hadoop-multibranch/job/PR-539/2/artifact/out/patch-mvnsite-hadoop-tools_hadoop-aws.txt
findbugs https://builds.apache.org/job/hadoop-multibranch/job/PR-539/2/artifact/out/patch-findbugs-hadoop-tools_hadoop-aws.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-539/2/artifact/out/patch-unit-hadoop-tools_hadoop-aws.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-539/2/testReport/
Max. process+thread count 446 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-539/2/console
Powered by Apache Yetus 0.9.0 http://yetus.apache.org

This message was automatically generated.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whitespace:end of line

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whitespace:end of line

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whitespace:end of line

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whitespace:end of line

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whitespace:end of line

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 25 Docker mode activated.
_ Prechecks _
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
0 mvndep 63 Maven dependency ordering for branch
+1 mvninstall 997 trunk passed
+1 compile 925 trunk passed
+1 checkstyle 185 trunk passed
+1 mvnsite 102 trunk passed
+1 shadedclient 987 branch has no errors when building and testing our client artifacts.
+1 findbugs 141 trunk passed
+1 javadoc 83 trunk passed
_ Patch Compile Tests _
0 mvndep 19 Maven dependency ordering for patch
-1 mvninstall 26 hadoop-aws in the patch failed.
-1 compile 835 root in the patch failed.
-1 javac 835 root in the patch failed.
-0 checkstyle 179 root: The patch generated 6 new + 10 unchanged - 0 fixed = 16 total (was 10)
-1 mvnsite 35 hadoop-aws in the patch failed.
-1 whitespace 0 The patch has 5 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
+1 shadedclient 602 patch has no errors when building and testing our client artifacts.
-1 findbugs 25 hadoop-aws in the patch failed.
+1 javadoc 89 the patch passed
_ Other Tests _
-1 unit 502 hadoop-common in the patch failed.
-1 unit 36 hadoop-aws in the patch failed.
+1 asflicense 33 The patch does not generate ASF License warnings.
6052
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-539/3/artifact/out/Dockerfile
GITHUB PR #539
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux ecfb19d4e7fe 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 15098df
maven version: Apache Maven 3.3.9
Default Java 1.8.0_191
findbugs v3.1.0-RC1
mvninstall https://builds.apache.org/job/hadoop-multibranch/job/PR-539/3/artifact/out/patch-mvninstall-hadoop-tools_hadoop-aws.txt
compile https://builds.apache.org/job/hadoop-multibranch/job/PR-539/3/artifact/out/patch-compile-root.txt
javac https://builds.apache.org/job/hadoop-multibranch/job/PR-539/3/artifact/out/patch-compile-root.txt
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-539/3/artifact/out/diff-checkstyle-root.txt
mvnsite https://builds.apache.org/job/hadoop-multibranch/job/PR-539/3/artifact/out/patch-mvnsite-hadoop-tools_hadoop-aws.txt
whitespace https://builds.apache.org/job/hadoop-multibranch/job/PR-539/3/artifact/out/whitespace-eol.txt
findbugs https://builds.apache.org/job/hadoop-multibranch/job/PR-539/3/artifact/out/patch-findbugs-hadoop-tools_hadoop-aws.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-539/3/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-539/3/artifact/out/patch-unit-hadoop-tools_hadoop-aws.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-539/3/testReport/
Max. process+thread count 1390 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-539/3/console
Powered by Apache Yetus 0.9.0 http://yetus.apache.org

This message was automatically generated.

@steveloughran
Copy link
Contributor Author

hadoop-common test run => failure. hadoop-aws compilation failure with a file which has been deleted

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 39 Docker mode activated.
_ Prechecks _
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
0 mvndep 22 Maven dependency ordering for branch
+1 mvninstall 1213 trunk passed
+1 compile 1545 trunk passed
+1 checkstyle 246 trunk passed
+1 mvnsite 161 trunk passed
+1 shadedclient 1271 branch has no errors when building and testing our client artifacts.
+1 findbugs 188 trunk passed
+1 javadoc 119 trunk passed
_ Patch Compile Tests _
0 mvndep 27 Maven dependency ordering for patch
-1 mvninstall 31 hadoop-aws in the patch failed.
-1 compile 1315 root in the patch failed.
-1 javac 1315 root in the patch failed.
-0 checkstyle 236 root: The patch generated 3 new + 10 unchanged - 0 fixed = 13 total (was 10)
-1 mvnsite 46 hadoop-aws in the patch failed.
+1 whitespace 1 The patch has no whitespace issues.
+1 shadedclient 835 patch has no errors when building and testing our client artifacts.
-1 findbugs 42 hadoop-aws in the patch failed.
+1 javadoc 109 the patch passed
_ Other Tests _
-1 unit 569 hadoop-common in the patch failed.
-1 unit 47 hadoop-aws in the patch failed.
+1 asflicense 51 The patch does not generate ASF License warnings.
8285
Reason Tests
Failed junit tests hadoop.ipc.TestRPC
hadoop.ipc.TestCallQueueManager
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-539/4/artifact/out/Dockerfile
GITHUB PR #539
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux 39a4f52a9a2d 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / cb0fa0c
maven version: Apache Maven 3.3.9
Default Java 1.8.0_191
findbugs v3.1.0-RC1
mvninstall https://builds.apache.org/job/hadoop-multibranch/job/PR-539/4/artifact/out/patch-mvninstall-hadoop-tools_hadoop-aws.txt
compile https://builds.apache.org/job/hadoop-multibranch/job/PR-539/4/artifact/out/patch-compile-root.txt
javac https://builds.apache.org/job/hadoop-multibranch/job/PR-539/4/artifact/out/patch-compile-root.txt
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-539/4/artifact/out/diff-checkstyle-root.txt
mvnsite https://builds.apache.org/job/hadoop-multibranch/job/PR-539/4/artifact/out/patch-mvnsite-hadoop-tools_hadoop-aws.txt
findbugs https://builds.apache.org/job/hadoop-multibranch/job/PR-539/4/artifact/out/patch-findbugs-hadoop-tools_hadoop-aws.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-539/4/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-539/4/artifact/out/patch-unit-hadoop-tools_hadoop-aws.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-539/4/testReport/
Max. process+thread count 793 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-539/4/console
Powered by Apache Yetus 0.9.0 http://yetus.apache.org

This message was automatically generated.

@steveloughran
Copy link
Contributor Author

OK, fun little Yetus issue for @aw-was-here

  • The unit tests are failing because ITestS3AContractRandomSeek.java won't compile
  • That file did exist, but was deleted in one of the changes in the patch sequence
  • Somehow maven is still finding that file in its source tree

@mattf-apache
Copy link
Member

mattf-apache commented Mar 5, 2019

@steveloughran , does Yetus perhaps try to test each commit individually? Since the file is being created in the first commit and deleted in the second commit, it might be worth squashing this PR and seeing if Yetus likes that better.

[Later Edit] It looks like maybe a GitHub issue. Taking a fresh clone and doing:

curl -s -S https://patch-diff.githubusercontent.com/raw/apache/hadoop/pull/539.patch | git apply -

results in the presence of an untracked not-empty ITestS3AContractRandomSeek.java file:

$ git status
On branch HADOOP-16109
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractSeekTest.java
	modified:   hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java
	modified:   hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractSeek.java

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRandomSeek.java

$ ls -l hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRandomSeek.java
-rw-r--r--  1 matt.foley  staff  1274 Mar  5 14:17 hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRandomSeek.java

Copy link
Member

@mattf-apache mattf-apache left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@steveloughran , thanks so much for creating all these test cases. They really do what's needed. I did request one additional case, at the end.

The other comments are mostly editorial nits on internal documentation. Please forgive :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the parameterized and test settings are ... what?

Also nit: readahead not readhead

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same line twice, typo?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document allowed or intended values of seekPolicy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they are defined in the Constants.INPUT_FADV. vars

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest moving testReadAcrossReadahead and its Javadoc down to just above testReadSingleByteAcrossReadahead, and perhaps edit Javadoc to say

   * Tests for HADOOP-16109: Parquet reading S3AFileSystem causes EOF.
   * The next four test cases assure correct behavior when seeks and reads
   * touch or cross the end of the active S3A readahead buffer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to have a describe() here too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do. If you've noticed, these get logged, so help you separate "setup/teardown noise" from the test run itself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in sequential mode

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking, this is "read from immediately after boundary"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this is definitely one of the major cases of concern.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, important test case. Could we also add a case that does:

    try (FSDataInputStream in = fs.open(path)) {
      readAtEndAndReturn(in);
      final byte[] temp = new byte[1];
      int offset = READAHEAD;
      in.seek(offset);
      // expect to read a byte successfully.
      temp[0] = in.readByte();
      assertDatasetEquals(READAHEAD, "read at end of boundary", temp, 1);
      LOG.info("Read of byte at offset {} returned expected value", offset);
    }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well ok, though as readByte is read + an assert that the value >= 0, you don't really gain much.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I guess I was thinking of readFully(), which in S3AInputStream.java can cause a new seek and thereby change the cache boundaries. You're right it's not relevant here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in random-access mode

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removing the comment to avoid confusion

@steveloughran
Copy link
Contributor Author

new patch: address core comments from @mattf-apache , including adding a new test case.

Matt, if you ever look at readByte(), it is just calling read and checking the return value before casting

    public final byte readByte() throws IOException {
        int ch = in.read();
        if (ch < 0)
            throw new EOFException();
        return (byte)(ch);
    }

the first test was effectively doing it. But in case something in future did ever subclass it, I've added the new test case

@steveloughran
Copy link
Contributor Author

tested: S3 ireland

Copy link
Member

@mattf-apache mattf-apache left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff, thanks @steveloughran !

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 25 Docker mode activated.
_ Prechecks _
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
0 mvndep 67 Maven dependency ordering for branch
+1 mvninstall 1001 trunk passed
-1 compile 509 root in trunk failed.
+1 checkstyle 168 trunk passed
+1 mvnsite 98 trunk passed
+1 shadedclient 898 branch has no errors when building and testing our client artifacts.
+1 findbugs 133 trunk passed
+1 javadoc 69 trunk passed
_ Patch Compile Tests _
0 mvndep 22 Maven dependency ordering for patch
-1 mvninstall 27 hadoop-aws in the patch failed.
-1 compile 532 root in the patch failed.
-1 javac 532 root in the patch failed.
-0 checkstyle 176 root: The patch generated 3 new + 10 unchanged - 0 fixed = 13 total (was 10)
-1 mvnsite 36 hadoop-aws in the patch failed.
-1 whitespace 0 The patch has 75 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-1 whitespace 1 The patch 19849 line(s) with tabs.
+1 shadedclient 910 patch has no errors when building and testing our client artifacts.
-1 findbugs 28 hadoop-aws in the patch failed.
+1 javadoc 83 the patch passed
_ Other Tests _
-1 unit 497 hadoop-common in the patch failed.
-1 unit 34 hadoop-aws in the patch failed.
+1 asflicense 35 The patch does not generate ASF License warnings.
5550
Reason Tests
Failed junit tests hadoop.util.TestReadWriteDiskValidator
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-539/5/artifact/out/Dockerfile
GITHUB PR #539
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux 3b44191e02d6 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 45f976f
maven version: Apache Maven 3.3.9
Default Java 1.8.0_191
compile https://builds.apache.org/job/hadoop-multibranch/job/PR-539/5/artifact/out/branch-compile-root.txt
findbugs v3.1.0-RC1
mvninstall https://builds.apache.org/job/hadoop-multibranch/job/PR-539/5/artifact/out/patch-mvninstall-hadoop-tools_hadoop-aws.txt
compile https://builds.apache.org/job/hadoop-multibranch/job/PR-539/5/artifact/out/patch-compile-root.txt
javac https://builds.apache.org/job/hadoop-multibranch/job/PR-539/5/artifact/out/patch-compile-root.txt
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-539/5/artifact/out/diff-checkstyle-root.txt
mvnsite https://builds.apache.org/job/hadoop-multibranch/job/PR-539/5/artifact/out/patch-mvnsite-hadoop-tools_hadoop-aws.txt
whitespace https://builds.apache.org/job/hadoop-multibranch/job/PR-539/5/artifact/out/whitespace-eol.txt
whitespace https://builds.apache.org/job/hadoop-multibranch/job/PR-539/5/artifact/out/whitespace-tabs.txt
findbugs https://builds.apache.org/job/hadoop-multibranch/job/PR-539/5/artifact/out/patch-findbugs-hadoop-tools_hadoop-aws.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-539/5/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-539/5/artifact/out/patch-unit-hadoop-tools_hadoop-aws.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-539/5/testReport/
Max. process+thread count 1717 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-539/5/console
Powered by Apache Yetus 0.9.0 http://yetus.apache.org

This message was automatically generated.

@aw-was-here
Copy link
Contributor

TL;DR: squash and rebase regularly

does Yetus perhaps try to test each commit individually? ... [Later Edit] It looks like maybe a GitHub issue.

Yup.

Yetus gets the equivalent of 'git format-patch' and then calls 'git apply' or 'patch' or whatever works on top of master (or branch-2 or whatever) with it. This has several downsides, but most of them go away if the patch branch is regularly rebased and commits are squashed. The latter fixes 99% of the issues. When commits aren't squashed, you'll see weird stuff like what shows up here. (Remember, the top of the other tree is moving too....)

It's important to note that git/github really only provides three ways to handle PRs:

  • git merge
  • git diff
  • git format-patch

The first one has lots of issues from a Yetus functionality perspective since it taints the source tree and doing multiple checkouts has other issues (remember: yetus also runs locally!). The second one doesn't work with binary files. That leaves us with the third one, which has lots of weird idiosyncracies but works when good branch hygiene is in play. YETUS-724 will enable test-patch to switch to using the git diff IFF the format-patch version of the file can't be applied.

Nobody gets seek right. No matter how many times they think they have.

Reproducible test from: Dave Christianson
Fixed seek() logic: Steve Loughran

Change-Id: I4b08036b9e71e2a3c23b9da697bfc23e279ce44b
@steveloughran steveloughran force-pushed the s3/HADOOP-16109-parquet-eof-s3a-seek branch from 1afe7ab to cc41761 Compare March 8, 2019 11:46
@steveloughran
Copy link
Contributor Author

squash and resubmit

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
0 reexec 26 Docker mode activated.
_ Prechecks _
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
0 mvndep 58 Maven dependency ordering for branch
+1 mvninstall 1121 trunk passed
+1 compile 1117 trunk passed
+1 checkstyle 203 trunk passed
+1 mvnsite 144 trunk passed
+1 shadedclient 1021 branch has no errors when building and testing our client artifacts.
+1 findbugs 150 trunk passed
+1 javadoc 87 trunk passed
_ Patch Compile Tests _
0 mvndep 23 Maven dependency ordering for patch
+1 mvninstall 79 the patch passed
+1 compile 1076 the patch passed
+1 javac 1076 the patch passed
-0 checkstyle 203 root: The patch generated 2 new + 10 unchanged - 0 fixed = 12 total (was 10)
+1 mvnsite 125 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 674 patch has no errors when building and testing our client artifacts.
+1 findbugs 179 the patch passed
+1 javadoc 95 the patch passed
_ Other Tests _
+1 unit 558 hadoop-common in the patch passed.
+1 unit 272 hadoop-aws in the patch passed.
+1 asflicense 41 The patch does not generate ASF License warnings.
7145
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-539/6/artifact/out/Dockerfile
GITHUB PR #539
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux b3b35016f438 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / fb851c9
maven version: Apache Maven 3.3.9
Default Java 1.8.0_191
findbugs v3.1.0-RC1
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-539/6/artifact/out/diff-checkstyle-root.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-539/6/testReport/
Max. process+thread count 1463 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-539/6/console
Powered by Apache Yetus 0.9.0 http://yetus.apache.org

This message was automatically generated.

@steveloughran
Copy link
Contributor Author

Squashed PR is happy. I need to do the full local test run and declare that before I can say this is ready to go in

@steveloughran
Copy link
Contributor Author

local test w/ s3 london + (s3guard). Failure in test setup of all three the big seek tests, even when parameterized names were used.

[ERROR] testSeekBigFile[0](org.apache.hadoop.fs.contract.s3a.ITestS3AContractSeek)  Time elapsed: 0.557 s  <<< ERROR!
org.apache.hadoop.fs.FileAlreadyExistsException: s3a://hwdev-steve-london/fork-0002/test/bigseekfile.txt-random already exists
	at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:963)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1130)

...
[ERROR] testSeekBigFile[1](org.apache.hadoop.fs.contract.s3a.ITestS3AContractSeek)  Time elapsed: 0.479 s  <<< ERROR!
org.apache.hadoop.fs.FileAlreadyExistsException: s3a://hwdev-steve-london/fork-0002/test/bigseekfile.txt-normal already exists
	at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:963)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1130)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1110)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:999)
	at org.apache.hadoop.fs.contract.ContractTestUtils.createFile(ContractTestUtils.java:633)
	at org.apache.hadoop.fs.contract.AbstractContractSeekTest.testSeekBigFile(AbstractContractSeekTest.java:275)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

...

[ERROR] testSeekBigFile[2](org.apache.hadoop.fs.contract.s3a.ITestS3AContractSeek)  Time elapsed: 0.958 s  <<< ERROR!
org.apache.hadoop.fs.FileAlreadyExistsException: s3a://hwdev-steve-london/fork-0002/test/bigseekfile.txt-sequential already exists
	at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:963)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1130)

Not directly related, but I'll change the creation to overwrite=true to ensure this goes away

@steveloughran
Copy link
Contributor Author

the seek contract test in this PR does set overwrite=true; build must have picked up a version of the hadoop-common tests jar from a different branch. A full clean build fixed the test run.
All is good

@mattf-apache you happy with this? You have voting rights....

Copy link
Member

@mattf-apache mattf-apache left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely!

@steveloughran
Copy link
Contributor Author

thanks.

For the record, this was applied via

 dev-support/bin/smart-apply-patch --project=hadoop GH:539

I'll be submitting a followup PR for branch-3.2 and earlier (doesn't compile due to no method removeBucketOverrides(java.lang.String,org.apache.hadoop.conf.Configuration,java.lang.String,java.lang.String)) which I'll have to pull in too, then all the way back to 2.8.x. Joy. This is is the price of getting seek wrong

@mattf-apache
Copy link
Member

Thanks, @steveloughran !

shanthoosh pushed a commit to shanthoosh/hadoop that referenced this pull request Oct 15, 2019
…ns in KeyValueStorageEngine

Author: Prateek Maheshwari <pmaheshwari@linkedin.com>

Reviewers: Cameron Lee <calee@linkedin.com>, Shanthoosh Venkatraman <svenkatr@linkedin.com>

Closes apache#539 from prateekm/store-metrics
@steveloughran steveloughran deleted the s3/HADOOP-16109-parquet-eof-s3a-seek branch October 15, 2021 19:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants