@@ -1088,7 +1088,7 @@ private Constants() {
* Default retention policy: {@value}.
*/
public static final String DEFAULT_DIRECTORY_MARKER_POLICY =
-DIRECTORY_MARKER_POLICY_DELETE;
+DIRECTORY_MARKER_POLICY_KEEP;


/**
@@ -186,11 +186,11 @@ public static DirectoryPolicy getDirectoryPolicy(
policy = DELETE;
break;
case DIRECTORY_MARKER_POLICY_KEEP:
-LOG.info("Directory markers will be kept");
+LOG.debug("Directory markers will be kept");
policy = KEEP;
break;
case DIRECTORY_MARKER_POLICY_AUTHORITATIVE:
-LOG.info("Directory markers will be kept on authoritative"
+LOG.debug("Directory markers will be kept on authoritative"
+ " paths");
policy = new DirectoryPolicyImpl(MarkerPolicy.Authoritative,
authoritativeness);
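
The switch in this hunk maps the configured option string to a policy, and the first hunk changes the fallback from "delete" to "keep". A minimal standalone sketch of that resolution follows. The class name `MarkerPolicySketch`, the enum, and the trimming and lower-casing of the input are illustrative assumptions, not the actual `DirectoryPolicyImpl` code.

```java
import java.util.Locale;

// Hypothetical sketch; not the Hadoop implementation.
public class MarkerPolicySketch {

    // The three policies named in the diff: delete, keep, authoritative.
    enum MarkerPolicy { DELETE, KEEP, AUTHORITATIVE }

    // After this change the default is "keep" (previously "delete").
    static final String DEFAULT_POLICY = "keep";

    // Resolve an option string to a policy, falling back to the default
    // when the option is unset or blank. Normalizing the input with
    // trim/lower-case is an assumption made for this sketch.
    static MarkerPolicy resolve(String option) {
        String value = (option == null || option.trim().isEmpty())
            ? DEFAULT_POLICY
            : option.trim().toLowerCase(Locale.ENGLISH);
        switch (value) {
            case "delete":
                return MarkerPolicy.DELETE;
            case "keep":
                return MarkerPolicy.KEEP;
            case "authoritative":
                return MarkerPolicy.AUTHORITATIVE;
            default:
                throw new IllegalArgumentException(
                    "Unknown directory marker policy: " + value);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(null));      // unset: falls back to the default
        System.out.println(resolve("delete"));  // explicit opt-in to the old behavior
    }
}
```

With the new default, a deployment that sets nothing resolves to `KEEP`; only an explicit "delete" restores the previous behavior.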
@@ -792,7 +792,7 @@ Security
Delegation token support is disabled

Directory Markers
-The directory marker policy is "delete"
+The directory marker policy is "keep"
Available Policies: delete, keep, authoritative
Authoritative paths: fs.s3a.authoritative.path=```
```

Large diffs are not rendered by default.

@@ -23,11 +23,10 @@

### <a name="directory-marker-compatibility"></a> Directory Marker Compatibility

-1. This release can safely list/index/read S3 buckets where "empty directory"
-markers are retained.
-
-1. This release can be configured to retain these directory makers at the
-expense of being backwards incompatible.
+This release does not delete directory markers when creating
+files or directories underneath.
+This is incompatible with versions of the Hadoop S3A client released
+before 2021.

Consult [Controlling the S3A Directory Marker Behavior](directory_markers.html) for
full details.
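
Because the default is now "keep", a deployment that still shares a bucket with older clients would have to opt back into "delete" explicitly. The sketch below shows how the effective policy for a bucket could be looked up, using the two keys that appear in the removed test script further down (`fs.s3a.directory.marker.retention` and its `fs.s3a.bucket.<name>.` variant). The class `MarkerRetentionLookup` and the plain `Map` standing in for a Hadoop `Configuration` are hypothetical, and the real S3A client resolves per-bucket options through its own mechanism.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a Map stands in for a Hadoop Configuration.
public class MarkerRetentionLookup {

    static final String BASE_KEY = "fs.s3a.directory.marker.retention";
    static final String DEFAULT_POLICY = "keep"; // default after this change

    // A per-bucket setting takes precedence over the base setting;
    // if neither is present, the default applies.
    static String effectivePolicy(Map<String, String> conf, String bucket) {
        String perBucketKey =
            "fs.s3a.bucket." + bucket + ".directory.marker.retention";
        String perBucket = conf.get(perBucketKey);
        if (perBucket != null) {
            return perBucket;
        }
        return conf.getOrDefault(BASE_KEY, DEFAULT_POLICY);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Opt one bucket (name is illustrative) back into the old behavior.
        conf.put("fs.s3a.bucket.legacy-data.directory.marker.retention", "delete");

        System.out.println(effectivePolicy(conf, "legacy-data"));
        System.out.println(effectivePolicy(conf, "other-bucket"));
    }
}
```

Scoping the override to a single bucket keeps the lower DELETE load everywhere else while preserving compatibility where older clients are still in use.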
@@ -119,14 +119,7 @@ Without S3Guard, listing performance may be slower. However, Hadoop 3.3.0+ has s
improved listing performance ([HADOOP-17400](https://issues.apache.org/jira/browse/HADOOP-17400)
_Optimize S3A for maximum performance in directory listings_) so this should not be apparent.

-We recommend disabling [directory marker deletion](directory_markers.html) to reduce
-the number of DELETE operations made when writing files.
-this reduces the load on the S3 partition and so the risk of throttling, which can
-impact performance.
-This is very important when working with versioned S3 buckets, as the tombstone markers
-created will slow down subsequent listing operations.
-
-Finally, the S3A [auditing](auditing.html) feature adds information to the S3 server logs
+The S3A [auditing](auditing.html) feature adds information to the S3 server logs
about which jobs, users and filesystem operations have been making S3 requests.
This auditing information can be used to identify opportunities to reduce load.

@@ -162,7 +155,6 @@ Example
```bash
> hadoop s3guard bucket-info -magic -markers keep s3a://test-london/

-2021-11-22 15:21:00,289 [main] INFO impl.DirectoryPolicyImpl (DirectoryPolicyImpl.java:getDirectoryPolicy(189)) - Directory markers will be kept
Filesystem s3a://test-london
Location: eu-west-2

@@ -339,16 +339,19 @@ Hadoop supports [different policies for directory marker retention](directory_ma
-essentially the classic "delete" and the higher-performance "keep" options; "authoritative"
is just "keep" restricted to a part of the bucket.

-Example: test with `markers=delete`
+
+Example: test with `markers=keep`

```
-mvn verify -Dparallel-tests -DtestsThreadCount=4 -Dmarkers=delete
+mvn verify -Dparallel-tests -DtestsThreadCount=4 -Dmarkers=keep
```

-Example: test with `markers=keep`
+This is the default and does not need to be explicitly set.
+
+Example: test with `markers=delete`

```
-mvn verify -Dparallel-tests -DtestsThreadCount=4 -Dmarkers=keep
+mvn verify -Dparallel-tests -DtestsThreadCount=4 -Dmarkers=delete
```

Example: test with `markers=authoritative`
@@ -1268,33 +1271,6 @@ bin/hdfs fetchdt -print secrets.bin
# expect warning "No TokenRenewer defined for token kind S3ADelegationToken/Session"
bin/hdfs fetchdt -renew secrets.bin

-# ---------------------------------------------------
-# Directory markers
-# ---------------------------------------------------
-
-# require success
-bin/hadoop s3guard bucket-info -markers aware $BUCKET
-# expect failure unless bucket policy is keep
-bin/hadoop s3guard bucket-info -markers keep $BUCKET/path
-
-# you may need to set this on a per-bucket basis if you have already been
-# playing with options
-bin/hadoop s3guard -D fs.s3a.directory.marker.retention=keep bucket-info -markers keep $BUCKET/path
-bin/hadoop s3guard -D fs.s3a.bucket.$BUCKETNAME.directory.marker.retention=keep bucket-info -markers keep $BUCKET/path
-
-# expect to see "Directory markers will be kept" messages and status code of "46"
-bin/hadoop fs -D fs.s3a.bucket.$BUCKETNAME.directory.marker.retention=keep -mkdir $BUCKET/p1
-bin/hadoop fs -D fs.s3a.bucket.$BUCKETNAME.directory.marker.retention=keep -mkdir $BUCKET/p1/p2
-bin/hadoop fs -D fs.s3a.bucket.$BUCKETNAME.directory.marker.retention=keep -touchz $BUCKET/p1/p2/file
-
-# expect failure as markers will be found for /p1/ and /p1/p2/
-bin/hadoop s3guard markers -audit -verbose $BUCKET
-
-# clean will remove markers
-bin/hadoop s3guard markers -clean -verbose $BUCKET
-
-# expect success and exit code of 0
-bin/hadoop s3guard markers -audit -verbose $BUCKET

# ---------------------------------------------------
# Copy to from local