@@ -172,10 +172,13 @@ in `_$folder$` was considered to be a sign that a directory existed. A call to
 The S3A also has directory markers, but it just appends a "/" to the directory
 name, so `mkdir(s3a://bucket/a/b)` will create a new marker object `a/b/`.
 
-When a file is created under a path, the directory marker is deleted. And when a
-file is deleted, if it was the last file in the directory, the marker is
+In older versions of Hadoop, when a file was created under a path,
+the directory marker is deleted. And when a file is deleted,
+if it was the last file in the directory, the marker is
 recreated.
 
+This release does not delete directory markers.
+
 And, historically, when a path is listed, if a marker to that path is found, *it
 has been interpreted as an empty directory.*
 
@@ -247,8 +250,6 @@ directory markers when creating files under paths. This removes all scalability
 problems caused by deleting these markers -however, it is achieved at the expense
 of backwards compatibility.
 
-## <a name="marker-retention"></a> Controlling marker retention with `fs.s3a.directory.marker.retention`
-
 There is now an option `fs.s3a.directory.marker.retention` which controls how
 markers are managed when new files are created
 
@@ -264,32 +265,15 @@ The setting, `fs.s3a.directory.marker.retention = delete` is compatible with
 every shipping Hadoop release; that of `keep` compatible with
 all releases since 2021.
 
-## <a name="s3guard"></a> Directory Markers and Authoritative paths
-
-
-The now-deleted S3Guard feature included the concept of "authoritative paths";
-paths where all clients were required to be using S3Guard and sharing the
-same metadata store.
-In such a setup, listing authoritative paths would skip all queries of the S3
-store -potentially being much faster.
+### Hadoop 3.4.0: markers are not deleted by default
 
-In production, authoritative paths were usually only ever for Hive managed
-tables, where access was strictly restricted to the Hive services.
+[HADOOP-18752](https://issues.apache.org/jira/browse/HADOOP-18752)
+_Change fs.s3a.directory.marker.retention to "keep"_ changed the default
+policy.
 
+Marker deletion can still be enabled.
 
-When the S3A client is configured to treat some directories as "Authoritative"
-then an S3A connector with a retention policy of `fs.s3a.directory.marker.retention` of
-`authoritative` will omit deleting markers in authoritative directories.
-
-```xml
-<property>
-  <name>fs.s3a.bucket.hive.authoritative.path</name>
-  <value>/tables</value>
-</property>
-```
-This an option to consider if not 100% confident that all
-applications interacting with a store are using an S3A client
-which is marker aware.
+### Hadoop 3.5.x: marker deletion is no longer supported.
 
 ## <a name="bucket-info"></a> Verifying marker policy with `s3guard bucket-info`
 
@@ -306,7 +290,6 @@ line of bucket policies via the `-marker` option
 
 All releases of Hadoop which have been updated to be marker aware will support the `-markers aware` option.
 
-
 1. Updated releases which do not support switching marker retention policy will also support the
 `-markers delete` option.
 