HADOOP-13327 Output Stream Specification. #1694

steveloughran
Contributor

This PR removes the changes related to S3A output stream lifecycle,
so only covers the specification of Syncable and ensures that StreamCapabilities
passes all the way through to the final implementation classes.

All streams which implement Syncable hsync/hflush declare this in their stream capabilities.

Supersedes #575

Change-Id: I82b16a8e0965f34eb0c42504da43e8fbeabcb68c
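
As a rough illustration of the declaration described in the summary above, the sketch below shows a stream that supports `hflush`/`hsync` advertising that fact, plus a helper that probes any stream for a capability. To keep it runnable without Hadoop on the classpath, `StreamCapabilities` and `Syncable` are minimal local stand-ins for the Hadoop interfaces of the same names; `CapabilitySketch`, `SyncableBufferStream` and the static `hasCapability` helper are hypothetical names for illustration, not code taken from this patch.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Locale;

final class CapabilitySketch {

  /** Local stand-in for org.apache.hadoop.fs.StreamCapabilities (assumption: simplified). */
  interface StreamCapabilities {
    String HFLUSH = "hflush";
    String HSYNC = "hsync";

    boolean hasCapability(String capability);
  }

  /** Local stand-in for org.apache.hadoop.fs.Syncable (assumption: simplified). */
  interface Syncable {
    void hflush() throws IOException;

    void hsync() throws IOException;
  }

  /** Hypothetical in-memory stream that implements Syncable and declares it. */
  static final class SyncableBufferStream extends ByteArrayOutputStream
      implements Syncable, StreamCapabilities {

    @Override
    public void hflush() throws IOException {
      flush(); // nothing to push to a remote store in this in-memory sketch
    }

    @Override
    public void hsync() throws IOException {
      flush(); // a real stream would persist the data here
    }

    @Override
    public boolean hasCapability(String capability) {
      String c = capability.toLowerCase(Locale.ENGLISH);
      return c.equals(HFLUSH) || c.equals(HSYNC);
    }
  }

  /** True only when the stream implements StreamCapabilities and declares the capability. */
  static boolean hasCapability(OutputStream out, String capability) {
    return out instanceof StreamCapabilities
        && ((StreamCapabilities) out).hasCapability(capability);
  }

  public static void main(String[] args) {
    // A stream that declares hsync, then a plain stream that declares nothing.
    System.out.println(hasCapability(new SyncableBufferStream(), StreamCapabilities.HSYNC));
    System.out.println(hasCapability(new ByteArrayOutputStream(), StreamCapabilities.HSYNC));
  }
}
```

Run as written, this prints `true` for the declaring stream and `false` for the plain `ByteArrayOutputStream`. A wrapper stream would presumably forward `hasCapability()` to the stream it wraps, which is what "passes all the way through" suggests.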

@steveloughran steveloughran force-pushed the filesystem/HADOOP-13327-outputstream-spec-lean branch from 1431540 to 5237cd1 Compare November 5, 2019 17:17
### HDFS and `OutputStream.close()`

HDFS does not immediately `sync()` the output of a written file to disk on
`OutputStream.close()` unless configured with `dfs.datanode.synconclose`

whitespace:end of line
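
The spec text quoted above implies that `close()` alone does not guarantee the data has reached disk unless the cluster sets `dfs.datanode.synconclose`. One option for an application that needs durability at close time is to call `hsync()` itself before closing. The following is a minimal sketch of that pattern, not code from this patch: `Syncable` is a local stand-in for the Hadoop interface of the same name, and `DurableCloseSketch`, `closeDurably` and `RecordingStream` are hypothetical names used only to show the call order.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

final class DurableCloseSketch {

  /** Local stand-in for org.apache.hadoop.fs.Syncable (assumption: simplified). */
  interface Syncable {
    void hflush() throws IOException;

    void hsync() throws IOException;
  }

  /**
   * Hypothetical helper: if the stream is Syncable, request that the data
   * be persisted before closing; the stream is closed in every case.
   */
  static void closeDurably(OutputStream out) throws IOException {
    try {
      if (out instanceof Syncable) {
        ((Syncable) out).hsync();
      }
    } finally {
      out.close();
    }
  }

  /** Hypothetical test double that records the order of lifecycle calls. */
  static final class RecordingStream extends ByteArrayOutputStream implements Syncable {
    final List<String> calls = new ArrayList<>();

    @Override
    public void hflush() {
      calls.add("hflush");
    }

    @Override
    public void hsync() {
      calls.add("hsync");
    }

    @Override
    public void close() throws IOException {
      calls.add("close");
      super.close();
    }
  }

  public static void main(String[] args) throws IOException {
    RecordingStream out = new RecordingStream();
    out.write(new byte[] {1, 2, 3});
    closeDurably(out);
    System.out.println(out.calls); // [hsync, close]
  }
}
```

Checking `instanceof Syncable` is the simplest possible test; a stricter caller could probe the stream's declared capabilities first, since a class may implement the interface without offering the guarantee.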

HDFS does not do this except when the write crosses a block boundary; to do
otherwise would overload the Namenode. Other stores MAY copy this behavior.

whitespace:end of line

When an output stream in HDFS is closed, the newly written data is not immediately
written to disk unless HDFS is deployed with `dfs.datanode.synconclose` set to
true. Otherwise it is cached and written to disk later.

whitespace:end of line

### `StreamCapabilities`

Implementors of filesystem clients SHOULD implement the `StreamCapabilities`
interface and its `hasCapability()` method to declare whether or not

whitespace:end of line

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|:----------|--------:|:--------|
| 0 | reexec | 83 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | dupname | 1 | No case conflicting files found. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 5 new or modified test files. |
| | | | _ trunk Compile Tests _ |
| 0 | mvndep | 24 | Maven dependency ordering for branch |
| +1 | mvninstall | 1210 | trunk passed |
| +1 | compile | 1082 | trunk passed |
| +1 | checkstyle | 165 | trunk passed |
| +1 | mvnsite | 259 | trunk passed |
| +1 | shadedclient | 1248 | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 258 | trunk passed |
| 0 | spotbugs | 50 | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 433 | trunk passed |
| | | | _ Patch Compile Tests _ |
| 0 | mvndep | 28 | Maven dependency ordering for patch |
| +1 | mvninstall | 174 | the patch passed |
| +1 | compile | 1053 | the patch passed |
| -1 | javac | 1053 | root generated 2 new + 1892 unchanged - 0 fixed = 1894 total (was 1892) |
| -0 | checkstyle | 164 | root: The patch generated 1 new + 77 unchanged - 3 fixed = 78 total (was 80) |
| +1 | mvnsite | 258 | the patch passed |
| -1 | whitespace | 0 | The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply |
| +1 | xml | 3 | The patch has no ill-formed XML file. |
| +1 | shadedclient | 893 | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 277 | the patch passed |
| +1 | findbugs | 508 | the patch passed |
| | | | _ Other Tests _ |
| -1 | unit | 569 | hadoop-common in the patch failed. |
| -1 | unit | 7719 | hadoop-hdfs in the patch failed. |
| +1 | unit | 106 | hadoop-azure in the patch passed. |
| +1 | unit | 77 | hadoop-azure-datalake in the patch passed. |
| +1 | asflicense | 63 | The patch does not generate ASF License warnings. |
| | | 16557 | |

| Reason | Tests |
|:-------|:------|
| Failed junit tests | hadoop.util.TestReadWriteDiskValidator |
| | hadoop.conf.TestCommonConfigurationFields |
| | hadoop.hdfs.server.namenode.TestFSImage |
| | hadoop.hdfs.server.balancer.TestBalancerRPCDelay |
| | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |

| Subsystem | Report/Notes |
|:----------|:-------------|
| Docker | Client=19.03.4 Server=19.03.4 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1694/2/artifact/out/Dockerfile |
| GITHUB PR | #1694 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
| uname | Linux c5adbb815350 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / bfb8f28 |
| Default Java | 1.8.0_222 |
| javac | https://builds.apache.org/job/hadoop-multibranch/job/PR-1694/2/artifact/out/diff-compile-javac-root.txt |
| checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1694/2/artifact/out/diff-checkstyle-root.txt |
| whitespace | https://builds.apache.org/job/hadoop-multibranch/job/PR-1694/2/artifact/out/whitespace-eol.txt |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1694/2/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1694/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1694/2/testReport/ |
| Max. process+thread count | 3874 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs hadoop-tools/hadoop-azure hadoop-tools/hadoop-azure-datalake U: . |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1694/2/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |

This message was automatically generated.


/**
* Probe for an object having a capability; returns true
* iff the stream implements {@link StreamCapabilities} and its
Contributor

typo? there are several iff in this doc.

Contributor Author

"if and only if"; CS term. I think I mention it somewhere -if not I should. It's used to be absolutely clear that if the condition is not met then the outcome is blocked

Contributor Author

I can cut it from the javadocs and leave it in the markdown files, as they are the stricter bits of work.

Change-Id: Id38cf27639215abdd0d8c3578ddf72ed7adca8c5
TODO: "Could this be in a section about visibility and not in the model definition? Maybe later. 'Here's the model, here's how that model works with creation, here's how it works when reading/writing' flows much better, and visibility should be in that third part."

Change-Id: I61c89475a1ea72006524803f2a7dd9e40551d718
Review with more on 404 caching.

Change-Id: Ib474a84e48556c6b76121427a026fa854b5bd9e0
@steveloughran steveloughran force-pushed the filesystem/HADOOP-13327-outputstream-spec-lean branch from 5237cd1 to c17cf2e Compare January 27, 2020 12:14
@steveloughran
Contributor Author

closing to rebase and resubmit

@steveloughran steveloughran deleted the filesystem/HADOOP-13327-outputstream-spec-lean branch June 26, 2020 16:18
@steveloughran
Contributor Author

rebased to #2102
