Skip to content

Hadoop 16961. ABFS: Adding metrics to AbfsInputStream #2076

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 3, 2020

Conversation

mehakmeet
Copy link
Contributor

Test run by: mvn -T 1C -Dparallel-tests=abfs clean verify
Region: East US, West US

[INFO] Results:
[INFO]
[INFO] Tests run: 77, Failures: 0, Errors: 0, Skipped: 0
[INFO] Results:
[INFO]
[WARNING] Tests run: 443, Failures: 0, Errors: 0, Skipped: 70
[INFO] Results:
[INFO]
[WARNING] Tests run: 206, Failures: 0, Errors: 0, Skipped: 29

Relates to #1946

Gabor Bota and others added 2 commits June 14, 2020 08:45
@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 37s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 23m 20s trunk passed
+1 💚 compile 0m 28s trunk passed
+1 💚 checkstyle 0m 21s trunk passed
+1 💚 mvnsite 0m 31s trunk passed
+1 💚 shadedclient 16m 16s branch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 23s trunk passed
+0 🆗 spotbugs 0m 51s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 0m 49s trunk passed
_ Patch Compile Tests _
+1 💚 mvninstall 0m 26s the patch passed
+1 💚 compile 0m 22s the patch passed
+1 💚 javac 0m 22s the patch passed
+1 💚 checkstyle 0m 15s the patch passed
+1 💚 mvnsite 0m 24s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 shadedclient 15m 11s patch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 20s the patch passed
+1 💚 findbugs 0m 56s the patch passed
_ Other Tests _
+1 💚 unit 1m 22s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 27s The patch does not generate ASF License warnings.
64m 44s
Subsystem Report/Notes
Docker ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-2076/1/artifact/out/Dockerfile
GITHUB PR #2076
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux 54f35b54972b 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 81d8a88
Default Java Private Build-1.8.0_252-8u252-b09-1~18.04-b09
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-2076/1/testReport/
Max. process+thread count 308 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-2076/1/console
versions git=2.17.1 maven=3.6.0 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

Copy link
Contributor

@steveloughran steveloughran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking good.

  • this includes a patch gabor committed? is that in trunk? if so: can you rebase the later patch on top of trunk & submit as a new PR
  • is there any way to create the streams with the statistics null and then call some operations, including toString? just to verify that this case is handled

@mehakmeet
Copy link
Contributor Author

this includes a patch gabor committed? is that in trunk? if so: can you rebase the later patch on top of trunk & submit as a new PR

Gabor's patch wasn't committed/merged. It was still in a PR, I am taking over the PR and finish it with adding some tests and using AbfsInputStreamContext class.

Copy link

@bgaborg bgaborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mehakmeet. Yes, you can take over my abandoned stat patch with this.
LGTM overall, @steveloughran requested some changes so +1 pending

@mehakmeet
Copy link
Contributor Author

mehakmeet commented Jun 22, 2020

I have added a test with statistics as null and tested out some operations and toString(), namely testWithNullStreamStatistics.
The test works fine when running from IDE, but gives an error while running from the terminal.

[ERROR] Tests run: 4, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 57.723 s <<< FAILURE! - in org.apache.hadoop.fs.azurebfs.ITestAbfsInputStreamStatistics
[ERROR] testWithNullStreamStatistics(org.apache.hadoop.fs.azurebfs.ITestAbfsInputStreamStatistics)  Time elapsed: 5.633 s  <<< ERROR!
Operation failed: "The specified path does not exist.", 404, HEAD, https://mehakmeetdata.dfs.core.windows.net/abfs-testcontainer-db275869-bc66-41a2-ab2e-7f3383462237/test/testWithNullStreamStatistics?upn=false&action=getStatus&timeout=90
	at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:192)
	at org.apache.hadoop.fs.azurebfs.services.AbfsClient.getPathStatus(AbfsClient.java:477)
	at org.apache.hadoop.fs.azurebfs.ITestAbfsInputStreamStatistics.testWithNullStreamStatistics(ITestAbfsInputStreamStatistics.java:261)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.lang.Thread.run(Thread.java:748)

The error seems to be in L267(in ITestAbfsInputStreamStatistics.java), when I do getFileStatus() from the AbfsClient. This throws the "The specified path does not exist." error.
I cannot figure out why. Any suggestions for a fix? @steveloughran @mukund-thakur @bgaborg

@steveloughran
Copy link
Contributor

could be settings in the FS instance as to whether hflush() writes the file...the failure is happening when it doesn't, and I recall it is a setting. in the ide you are probably getting a new FS Instance, but in the maven builds one already in the JVM is being recycled

@steveloughran
Copy link
Contributor

javadoc failures are from the java 11 builds; ignore for now

@mehakmeet
Copy link
Contributor Author

Using rawConfig.setBoolean(DISABLE_ABFS_CACHE_KEY, true); to create the AzureBlobFileSystem for the test isn't solving the issue as well. Might be some other problem.

@mukund-thakur
Copy link
Contributor

The test fails only when parallel tests are enabled.
mvn -T 1C clean verify -Dtest=none -Dit.test=ITestAbfsInputStreamStatistics#testWithNullStreamStatistics
-- Succeeds
mvn -T 1C -Dparallel-tests=abfs clean verify -Dtest=noneDit.test=ITestAbfsInputStreamStatistics#testWithNullStreamStatistics
-- Fails.

The reason is during parallel tests paths are created under fork directories but while getting the path status absolute path is used.
Fix is change :
AbfsRestOperation abfsRestOperation = fs.getAbfsClient().getPathStatus(
"/test/" + getMethodName(), false);
to
AbfsRestOperation abfsRestOperation = fs.getAbfsClient().getPathStatus(
nullStatFilePath.toUri().getPath(), false);

Interesting debugging indeed.

@mehakmeet
Copy link
Contributor Author

Thanks, @mukund-thakur. Brilliant debugging :)

@mehakmeet
Copy link
Contributor Author

I have changed the positioning of bytesReadFromBuffer, due to a bug found that this counter was actually incrementing ReadAhead buffer rather than local buffer(ReadAhead counters in a separate patch). Null statistics test works due to @mukund-thakur 's help.

@mehakmeet mehakmeet requested a review from steveloughran June 26, 2020 04:57
@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 10s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 21m 28s trunk passed
+1 💚 compile 0m 32s trunk passed with JDK Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04
+1 💚 compile 0m 27s trunk passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09
+1 💚 checkstyle 0m 20s trunk passed
+1 💚 mvnsite 0m 31s trunk passed
+1 💚 shadedclient 16m 30s branch has no errors when building and testing our client artifacts.
-1 ❌ javadoc 0m 24s hadoop-azure in trunk failed with JDK Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04.
+1 💚 javadoc 0m 23s trunk passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09
+0 🆗 spotbugs 0m 51s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 0m 50s trunk passed
_ Patch Compile Tests _
+1 💚 mvninstall 0m 27s the patch passed
+1 💚 compile 0m 26s the patch passed with JDK Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04
+1 💚 javac 0m 26s the patch passed
+1 💚 compile 0m 23s the patch passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09
+1 💚 javac 0m 23s the patch passed
+1 💚 checkstyle 0m 14s the patch passed
+1 💚 mvnsite 0m 25s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 shadedclient 15m 11s patch has no errors when building and testing our client artifacts.
-1 ❌ javadoc 0m 22s hadoop-azure in the patch failed with JDK Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04.
+1 💚 javadoc 0m 19s the patch passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09
+1 💚 findbugs 0m 54s the patch passed
_ Other Tests _
+1 💚 unit 1m 23s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 27s The patch does not generate ASF License warnings.
65m 8s
Subsystem Report/Notes
Docker ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-2076/3/artifact/out/Dockerfile
GITHUB PR #2076
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux 68b7bc6f2e06 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 6a8fd73
Default Java Private Build-1.8.0_252-8u252-b09-1~18.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_252-8u252-b09-1~18.04-b09
javadoc https://builds.apache.org/job/hadoop-multibranch/job/PR-2076/3/artifact/out/branch-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04.txt
javadoc https://builds.apache.org/job/hadoop-multibranch/job/PR-2076/3/artifact/out/patch-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-2076/3/testReport/
Max. process+thread count 344 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-2076/3/console
versions git=2.17.1 maven=3.6.0 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran
Copy link
Contributor

LGTM.
+1 from me...merging to trunk

@steveloughran
Copy link
Contributor

I tried to cp into branch-3 and it didn't take. Is there some previous patch we need to merge in? I'd like both to stay 100% in sync here, so if there's something we need to pull in first, I'll do that

@mehakmeet
Copy link
Contributor Author

HADOOP-16852(#1898)(causing the conflict) and HADOOP-17065(#2056) also, needs to go in.

asfgit pushed a commit that referenced this pull request Jul 25, 2020
snvijaya pushed a commit to snvijaya/hadoop that referenced this pull request May 23, 2021
jojochuang pushed a commit to jojochuang/hadoop that referenced this pull request May 23, 2023
…he#2076)

Contributed by Mehakmeet Singh.
Contributed by Gabor Bota.

Change-Id: I0394a37a75d79c8706f1f925b9d010b950cf4ae3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants